:''“Syntactic” redirects here. For another meaning of the adjective, see Syntaxis.''
In linguistics, 'syntax' (from Ancient Greek συν- ''syn-'', “together”, and τάξις ''táxis'', “arrangement”) is the study of the rules that govern the structure of sentences and determine their relative grammaticality. The term ''syntax'' can also be used to refer to these rules themselves, as in “the syntax of a language”. Modern research in syntax attempts to describe languages in terms of such rules, and, for many practitioners, to find general rules that apply to all languages. Since the field of syntax attempts to explain grammaticality judgments, and not provide them, it is unconcerned with linguistic prescription.
Though all theories of syntax take human language as their object of study, there are some significant differences in outlook. Chomskyan linguists see syntax as a branch of psychology, since they conceive of syntax as the study of linguistic knowledge. Others (e.g. Gerald Gazdar) take a more Platonistic view, regarding syntax as the study of an abstract formal system.[1]
Early history
Works on grammar were of course being written long before modern syntax came about; the ''Aṣṭādhyāyī'' of Pāṇini is often cited as an example of a pre-modern work that approaches the sophistication of a modern syntactic theory.[1] In the West, the school of thought that came to be known as ‘traditional grammar’ began with the work of Dionysius Thrax.
For centuries, work in syntax was dominated by a framework known as ''grammaire générale'', first expounded in 1660 by Antoine Arnauld in a book of the same title. This system took as its basic premise the assumption that language is a direct reflection of thought processes, and that hence there is a single most natural way to express a thought (which, coincidentally, was exactly the way it was expressed in French).
However, in the 19th century, with the development of historical-comparative linguistics, linguists began to realize the sheer diversity of human language, and to question fundamental assumptions about the relation between language and logic. It became apparent that there was no such thing as a most natural way to express a thought, and logic could no longer be relied upon as a base for studying the structure of language.
The central role of syntax within theoretical linguistics became clear only in the 20th century, which could reasonably be called the "century of syntactic theory" as far as linguistics is concerned. For a detailed and critical survey of the history of syntax in the last two centuries, see the monumental work by Graffi (2001).
General grammar (''Grammaire générale'')
Main article: Port-Royal grammar
The Port-Royal grammar modelled the study of syntax on that of logic (indeed, large parts of the Port-Royal Logic were copied or adapted from the ''Grammaire générale''[2]). Syntactic categories were identified with logical ones, and all sentences were analysed into the form "Subject-Copula-Predicate". Initially, this view was adopted even by the early comparative linguists (e.g., Bopp).
Modern theories
There are two features shared by most theories of formal syntax. First, they hierarchically group subunits into constituent units (usually referred to as phrases). Second, they provide a system of rules to explain why certain utterances seem more acceptable or grammatical than others. Most formal theories of syntax also offer explanations of the systematic relationships between syntax and semantics, in other words, between form and meaning.
Generative grammar and its descendants
Main article: Generative grammar

Phrase structure tree
In the framework of transformational-generative grammar (of which ''government and binding theory'' and ''minimalism'' are recent developments), the structure of a sentence is represented by ''phrase structure trees'', otherwise known as ''phrase markers'' or ''tree diagrams''. Such trees provide information about the sentences they represent by showing the hierarchical relations between their component parts.
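The hierarchical relations that a phrase marker encodes can be illustrated with a small data structure. The following is a minimal sketch in Python, not a representation prescribed by any particular theory: the example sentence, the category labels, and the helper names `leaves`, `constituents` and `bracketed` are all assumptions made for illustration.

```python
# A phrase marker as nested tuples: (label, child, child, ...).
# Leaves are plain strings (the words of the sentence).
# The sentence and its analysis are illustrative assumptions.
tree = (
    "S",
    ("NP", ("Det", "the"), ("N", "dog")),
    ("VP", ("V", "chased"), ("NP", ("Det", "a"), ("N", "cat"))),
)


def leaves(node):
    """Return the words dominated by a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def constituents(node):
    """List (label, word string) for every labelled node, top-down."""
    if isinstance(node, str):
        return []
    found = [(node[0], " ".join(leaves(node)))]
    for child in node[1:]:
        found.extend(constituents(child))
    return found


def bracketed(node):
    """Render the tree in labelled-bracket notation."""
    if isinstance(node, str):
        return node
    return "[" + node[0] + " " + " ".join(bracketed(c) for c in node[1:]) + "]"
```

Calling `constituents(tree)` lists each phrase together with the words it spans, which makes explicit that "chased a cat" forms a unit (a VP) whereas "dog chased" does not.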
Other theories of formal syntax
There are various theories about how best to design grammars such that, by systematic application of the rules, one can arrive at every phrase marker in a language and hence every sentence in the language. The most common are phrase structure grammars, preferred by Noam Chomsky's MIT school of linguistics, and ID/LP grammars; some argue that the latter have an explanatory advantage (especially those in opposition to the MIT school of linguistics, such as Ivan Sag and Geoffrey Pullum).
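The idea of arriving at every sentence by systematic application of rules can be sketched with a toy phrase structure grammar. The rules, the vocabulary, and the function name `expand` below are hypothetical, and the enumeration terminates only because this small grammar contains no recursion; a realistic grammar would generate an unbounded set of sentences.

```python
from itertools import product

# Toy rewrite rules (hypothetical): each non-terminal maps to a list of
# right-hand sides; any symbol without an entry is a terminal (a word).
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["sleeps"], ["sees"]],
}


def expand(symbol):
    """Return every word sequence derivable from `symbol`.

    Terminates only because this toy grammar has no recursion.
    """
    if symbol not in RULES:
        return [[symbol]]
    results = []
    for rhs in RULES[symbol]:
        # Combine every choice for each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([w for part in parts for w in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
```

Because the toy rules do not distinguish transitive from intransitive verbs, the list includes strings such as "the dog sleeps the cat" alongside "the dog sees the cat", a small instance of a grammar generating more than it should.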
Dependency grammar is a class of syntactic theories separate from generative grammar in which structure is determined by the relation between a word (a head) and its dependents. One difference from phrase structure grammar is that dependency grammar does not have phrasal categories.
Algebraic syntax is a type of dependency grammar.
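The contrast with phrase structure can be made concrete with a small sketch in which a sentence is described only by head-dependent relations between words, with no phrasal categories. The sentence, its analysis, and the helper names `dependents`, `root` and `subtree` are assumptions chosen for illustration rather than the notation of any specific dependency theory.

```python
# Each word is paired with the index of its head; the root has head None.
# The analysis of this sentence is an illustrative assumption.
words = ["the", "dog", "chased", "a", "cat"]
heads = [1, 2, None, 4, 2]


def dependents(head_index):
    """Indices of the words directly dependent on the given word."""
    return [i for i, h in enumerate(heads) if h == head_index]


def root():
    """Index of the single word that has no head."""
    return heads.index(None)


def subtree(index):
    """Indices of a word and everything it transitively governs, in order."""
    covered = [index]
    for d in dependents(index):
        covered.extend(subtree(d))
    return sorted(covered)
```

Here the verb "chased" is the root, and the group "the dog" is recoverable as the subtree of "dog" without ever being assigned a label such as NP.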
A modern approach to combining accurate descriptions of the grammatical patterns of language with their function in context is that of systemic functional grammar, an approach originally developed by Michael A.K. Halliday in the 1960s and now pursued actively on all continents. Systemic functional grammar is related both to feature-based approaches such as Head-driven phrase structure grammar and to the older functional traditions of European schools of linguistics such as British Contextualism and the Prague School.
Tree-adjoining grammar is a grammar formalism with interesting mathematical properties which has sometimes been used as the basis for the syntactic description of natural language. In monotonic and monostratal frameworks, variants of unification grammar are often preferred formalisms.
With the publication of Gold's Theorem[3] in 1967, it was claimed that grammars for natural languages governed by deterministic rules could not be learned from positive instances alone. This was part of the argument from the poverty of stimulus, first presented in 1980.[4] This led to the nativist view that a form of grammar (including a complete conceptual lexicon in certain versions) is hardwired from birth.
A grammar is a description of the syntax of a language. Theoretical models rarely consider the language in use, as revealed by corpus linguistics, but focus on a mental language or i-language as their "proper" object of study. In contrast, the "empirically responsible"[5] approach to syntax seeks to construct grammars that will explain language in use.
A key class of grammars in the latter tradition is that of the stochastic context-free grammars.
A problem faced in any formal syntax is that often more than one production rule may apply to a structure, resulting in a conflict. The greater the coverage, the more frequent such conflicts become, and all grammarians (starting with Pāṇini) have spent considerable effort devising a prioritization for the rules, which usually turn out to be defeasible. Another difficulty is overgeneration, where unlicensed structures are also generated. Probabilistic grammars circumvent these problems by using the frequency of various productions to order them, resulting in a "most likely" (winner-take-all) interpretation, which, by definition, is defeasible given additional data. As usage patterns are altered in diachronic shifts, these probabilistic rules can be re-learned, thus updating the grammar.
One may construct a probabilistic grammar from a traditional formal syntax by assigning each production rule a probability, normalized over the rules that expand the same non-terminal and eventually estimated from usage data. On most samples of broad language, probabilistic grammars that tune these probabilities from data typically outperform hand-crafted grammars (although some rule-based grammars are now approaching the accuracies of probabilistic context-free grammars, or PCFGs).
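Estimating such probabilities from usage data can be sketched as follows. The sketch assumes the usual convention for probabilistic context-free grammars, in which probabilities attach to production rules and the rules sharing a left-hand side sum to one; the counts are invented, and the function names `estimate` and `derivation_probability` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical counts of how often each production was observed in usage data.
rule_counts = Counter({
    ("S", ("NP", "VP")): 10,
    ("NP", ("Det", "N")): 12,
    ("NP", ("N",)): 4,
    ("VP", ("V", "NP")): 6,
    ("VP", ("V",)): 4,
})


def estimate(counts):
    """Relative-frequency estimate: P(A -> rhs) = count(A -> rhs) / count(A)."""
    totals = defaultdict(int)
    for (lhs, _), n in counts.items():
        totals[lhs] += n
    return {rule: n / totals[rule[0]] for rule, n in counts.items()}


def derivation_probability(rules_used, probs):
    """Probability of a derivation: the product of its rule probabilities."""
    p = 1.0
    for rule in rules_used:
        p *= probs[rule]
    return p


probs = estimate(rule_counts)

# Two competing expansions of VP; the more frequent one is preferred.
best_vp = max((r for r in probs if r[0] == "VP"), key=probs.get)
```

When two productions compete, the one with the higher estimated probability wins, and re-running `estimate` on new counts changes the preference without any hand-written priority among the rules.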
Recently, probabilistic grammars appear to have gained some cognitive plausibility. It is well known that there are degrees of difficulty in accessing different syntactic structures (e.g. the Accessibility Hierarchy for relative clauses). Probabilistic versions of minimalist grammars have been used to compute information-theoretic entropy values which appear to correlate well with psycholinguistic data on understandability and production difficulty.[6]
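The notion of entropy used here can be illustrated in a much simplified form. The sketch below computes the Shannon entropy of a single choice among alternative continuations; it is not the procedure of the cited work, which concerns uncertainty over whole derivations, and the probability values and the function name `entropy` are assumptions made for illustration.

```python
import math


def entropy(distribution):
    """Shannon entropy in bits of a mapping from outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)


# Hypothetical probabilities for the ways a noun phrase might continue.
uncertain = {"Det N": 0.25, "N": 0.25, "Pronoun": 0.25, "Det Adj N": 0.25}
predictable = {"Det N": 0.97, "N": 0.01, "Pronoun": 0.01, "Det Adj N": 0.01}
```

Four equally likely continuations give an entropy of two bits, whereas a distribution dominated by one continuation gives a value close to zero, reflecting lower uncertainty about what comes next.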
Statistical grammars are not subject to Gold's theorem since the learning is incremental.
See also
Syntactic terms
Notes
1. Fortson, Benjamin W., IV (2004). Indo-European Language and Culture: An Introduction. Blackwell.
2. Arnauld, Antoine (1683). La logique. G. Desprez.
3. Gold, E. (1967). Language identification in the limit. Information and Control 10, 447-474.
4. Chomsky, N. (1980). Rules and representations. Oxford: Basil Blackwell.
5. Lakoff, George and Johnson, Mark (1999). Philosophy in the Flesh: The embodied mind and its challenge to Western thought. Part IV. Basic Books.
6. Hale, John (2006). Uncertainty About the Rest of the Sentence. Cognitive Science.
References
★ Brown, Keith (1996). Concise Encyclopedia of Syntactic Theories. Elsevier Science.
★ Freidin, Robert (2006). Syntax. Routledge.
★ Graffi, Giorgio (2001). 200 Years of Syntax. A Critical Survey. Benjamins.
External links
★ The syntax of natural language (Beatrice Santorini & Anthony Kroch, University of Pennsylvania)
★ Various syntactic constructs used in computer programming languages