SUPPORT THE WORK

GetWiki

categorial grammar

ARTICLE SUBJECTS
aesthetics  →
being  →
complexity  →
database  →
enterprise  →
ethics  →
fiction  →
history  →
internet  →
knowledge  →
language  →
licensing  →
linux  →
logic  →
method  →
news  →
perception  →
philosophy  →
policy  →
purpose  →
religion  →
science  →
sociology  →
software  →
truth  →
unix  →
wiki  →
ARTICLE TYPES
essay  →
feed  →
help  →
system  →
wiki  →
ARTICLE ORIGINS
critical  →
discussion  →
forked  →
imported  →
original  →
categorial grammar
[ temporary import ]
please note:
- the content below is remote from Wikipedia
- it has been imported raw for GetWiki
Categorial grammar is a term used for a family of formalisms in natural language syntax motivated by the principle of compositionality and organized according to the view that syntactic constituents should generally combine as functions or according to a function-argument relationship. Most versions of categorial grammar analyze sentence structure in terms of constituencies (as opposed to dependencies) and are therefore phrase structure grammars (as opposed to dependency grammars).

Basics

A categorial grammar consists of two parts: a lexicon, which assigns a set of types (also called categories) to each basic symbol, and some type inference rules, which determine how the type of a string of symbols follows from the types of the constituent symbols. It has the advantage that the type inference rules can be fixed once and for all, so that the specification of a particular language grammar is entirely determined by the lexicon.A categorial grammar shares some features with the simply typed lambda calculus.Whereas the lambda calculus has only one function type A rightarrow B,a categorial grammar typically has two function types, one type which is applied on the left,and one on the right. For example, a simple categorial grammar might have two function types B/A,! and Abackslash B.The first, B/A,!, is the type of a phrase that results in a phrase of typeB,! when followed (on the right) by a phrase of type A,!.The second, Abackslash B,!, is the type of a phrase that resultsin a phrase of type B,! when preceded (on the left) by a phrase of type A,!. The notation is based upon algebra. A fraction when multiplied by (i.e. concatenated with) its denominator yields its numerator. As concatenation is not commutative, it makes a difference whether the denominator occurs to the left or right. The concatenation must be on the same side as the denominator for it to cancel out.The first and simplest kind of categorial grammar is called a basic categorial grammar, or sometimes an AB-grammar (after Ajdukiewicz and Bar-Hillel).Given a set of primitive types text{Prim},!, let text{Tp}(text{Prim}),! be the set of types constructed from primitive types. In the basic case, this is the least set such that text{Prim}subseteq text{Tp}(text{Prim})and if X, Yin text{Tp}(text{Prim})then (X/Y), (Ybackslash X) in text{Tp}(text{Prim}).Think of these as purely formal expressions freely generated from the primitive types; any semantics will be added later. Some authors assume a fixed infinite set of primitive types used by all grammars, but by making the primitive types part of the grammar, the whole construction is kept finite.A basic categorial grammar is a tuple (Sigma, text{Prim}, S, triangleleft)where Sigma,! is a finite set of symbols,text{Prim},! is a finite set of primitive types, and S in text{Tp}(text{Prim}).The relation triangleleft is the lexicon, which relates types to symbols (triangleleft) subseteq text{Tp}(text{Prim}) times Sigma.Since the lexicon is finite, it can be specified by listing a set of pairs like TYPEtrianglelefttext{symbol}.Such a grammar for English might have three basic types (N,NP, text{ and } S),!, assigning count nouns the type N,!, complete noun phrases the typeNP,!, and sentences the type S,!.Then an adjective could have the type N/N,!, because if it is followed by a noun then the whole phrase is a noun. Similarly, a determiner has the type NP/N,!,because it forms a complete noun phrase when followed by a noun.Intransitive verbs have the type NPbackslash S, and transitive verbs the type (NPbackslash S)/NP.Then a string of words is a sentence if it has overall type S,!.For example, take the string "the bad boy made that mess". Now "the" and "that" are determiners, "boy" and "mess" are nouns, "bad" is an adjective, and "made" is a transitive verb, so the lexicon is{NP/Ntrianglelefttext{the},NP/Ntrianglelefttext{that},Ntrianglelefttext{boy},Ntrianglelefttext{mess},N/Ntrianglelefttext{bad},(NPbackslash S)/NPtrianglelefttext{made}}.and the sequence of types in the string is{text{the}atop {NP/N,}}{text{bad}atop {N/N,}}{text{boy}atop {N,}}{text{made}atop {(NPbackslash S)/NP,}}{text{that}atop {NP/N,}}{text{mess}atop {N}}now find functions and appropriate arguments and reduce them according to the two inference rules
Xleftarrow X/Y,; Y and
Xleftarrow Y,; Ybackslash X:
.qquad NP/N,; N/N,; N,; (NPbackslash S)/NP,; underbrace{NP/N,; N}.qquad NP/N,; N/N,; N,; underbrace{(NPbackslash S)/NP, quad NP}.qquad NP/N,; underbrace{N/N,; N}, qquad (NPbackslash S).qquad underbrace{NP/N,; quad N},; qquad (NPbackslash S).qquad qquadunderbrace{NP,; qquad (NPbackslash S)}.qquad qquadqquadquad;;; SThe fact that the result is S,! means that the string is a sentence, while the sequence of reductions shows that it must be parsed as ((the (bad boy)) (made (that mess))).Categorial grammars of this form (having only function application rules) are equivalent in generative capacity to context-free grammars and are thus often considered inadequate for theories of natural language syntax. Unlike CFGs, categorial grammars are lexicalized, meaning that only a small number of (mostly language-independent) rules are employed, and all other syntactic phenomena derive from the lexical entries of specific words.Another appealing aspect of categorial grammars is that it is often easy to assign them a compositional semantics, by first assigning interpretation types to all the basic categories, and then associating all the derived categories with appropriate function types. The interpretation of any constituent is then simply the value of a function at an argument. With some modifications to handle intensionality and quantification, this approach can be used to cover a wide variety of semantic phenomena.

Lambek calculus

A Lambek grammar is an elaboration of this idea that has aconcatenation operator for types, and several other inference rules.Pentus has shown that these still have the generative capacity ofcontext-free grammars.For the Lambek calculus, there is a type concatenationoperator star,!, sothat text{Prim}subseteq text{Tp}(text{Prim})and if X, Yin text{Tp}(text{Prim})then (X/Y), (Xbackslash Y), (Xstar Y)in text{Tp}(text{Prim}).The Lambek calculus consists of several deduction rules, which specifyhow type inclusion assertions can be derived. In the followingrules, upper case roman letters stand for types, upper case Greekletters stand for sequences of types. A sequent of the form
X leftarrow Gamma
can be read: a string is of type X,! if it consists of the concatenationof strings of each of the types in Gamma,!. If a type isinterpreted as a set of strings, then theleftarrow may be interpreted as supseteq,!,that is, "includes as a subset". A horizontal line means that the inclusion above the lineimplies the one below the line.The process is begun by the Axiom rule, which has no antecedents andjust says that any type includes itself.(Axiom)quad{{}over X leftarrow X}The Cut rule says that inclusions can be composed.(Cut) quad{Z leftarrow Delta X Delta' qquad X leftarrow Gamma
over
Z leftarrow Delta Gamma Delta'}
The other rules come in pairs, one pair for each type constructionoperator, each pair consisting of one rule for the operator in thetarget, one in the source, of the arrow.The name of a rule consists of the operator and an arrow, with theoperator on the side of the arrow on which it occurs in the conclusion.{| class="wikitable" border="1" cellpadding="5"!Target!Source
|(backslash leftarrow) quad{Yleftarrow X Gamma
over
Xbackslash YleftarrowGamma}
|(leftarrow backslash) quad{Z leftarrow Delta Y Delta' qquad XleftarrowGamma
over
Z leftarrow Delta Gamma(Xbackslash Y) Delta'}
|(/leftarrow) quad{Yleftarrow Gamma X
over
Y/XleftarrowGamma}|(leftarrow/) quad{Zleftarrow Delta Y Delta' qquad XleftarrowGamma
over
Zleftarrow Delta (Y/X)Gamma Delta'}
|(starleftarrow) quad {Xleftarrow Gamma qquad Y leftarrow Gamma'
over
X star Y leftarrow GammaGamma'}|(leftarrowstar) quad {Zleftarrow Delta X Y Delta'
over
Zleftarrow Delta (X star Y) Delta'}
For an example, here is a derivation of "type raising", which says that(B/A)backslash B leftarrow A. The names of rules and the substitutions used are to the right.
dfrac {dfrac{}{B leftarrow B} qquad dfrac{}{A leftarrow A} }
{dfrac {B leftarrow (B/A), ;; A}
{(B/A)backslash B leftarrow A} }
qquadbegin{matrix}
mbox{(Axioms)}qquadqquadqquadqquadqquadqquadqquadqquadqquad{ }
{(leftarrow/),,[Z=Y=B,X=A,Gamma=(A),Delta=Delta'=()]}
{(backslashleftarrow),,[Y=B,X=(B/A),Gamma=(A)]}qquadqquadqquad{ }
end{matrix}

Relation to context-free grammars

Recall that a context-free grammar is a 4-tuple:G = (V,, Sigma,, ::=,, S)where1. V, is a finite set of non-terminals or variables.2. Sigma, is a finite set of terminal symbols.3. ::=, is a finite set of production rules, that is, a finite relation(::=)subseteq V times (V cup Sigma)^*.4. S, is the start variable.From the point of view of categorial grammars, a context-free grammarcan be seen as a calculus with a set of special purpose axioms foreach language, but with no type construction operators and noinference rules except Cut.Specifically, given a context-free grammar as above, define a categorialgrammar (text{Prim},, Sigma,, triangleleft,, S)where text{Prim}=VcupSigma,and text{Tp}(text{Prim})=text{Prim},!. Let there be an axiom{x leftarrow x} for every symbol x in VcupSigma,an axiom {X leftarrow Gamma} for every production rule X ::= Gamma,!,a lexicon entry {s triangleleft s} for every terminal symbol s in Sigma,and Cut for the only rule.This categorial grammar generates the same language as the given CFG.Of course, this is not a basic categorial grammar, since it hasspecial axioms that depend upon the language; i.e. it is not lexicalized.Also, it makes no use at all of non-primitive types.To show that any context-free language can be generated bya basic categorial grammar, recall that any context-free language can be generated by a context-free grammarin Greibach normal form.The grammar is in Greibach normal form if every production rule isof the form
A ::= s A_0 ldots A_{N-1},
where capital letters are variables, s in Sigma,and Nge 0,that is, the right side of the production is a single terminal symbolfollowed by zero or more (non-terminal) variables.Now given a CFG in Greibach normal form,define a basic categorial grammar with a primitive typefor each non-terminal variabletext{Prim}=V,!,and with an entry in the lexicon
A/A_{N-1}/ ldots /A_0 triangleleft s ,
for each production rule
A ::= s A_0 ldots A_{N-1}.
It is fairly easy to see that this basic categorial grammargenerates the same language as the original CFG.Note that the lexicon of this grammar will generallyassign multiple types to each symbol.The same construction works for Lambek grammars, sincethey are an extension of basic categorial grammars.It is necessary to verify that the extra inference rulesdo not change the generated language. This can be doneand shows that every context-free language is generatedby some Lambek grammar.To show the converse, that every language generated by aLambek grammar is context-free, is much more difficult.It was an open problem for nearly thirty years, fromthe early 1960s until about 1991 when it was provenby Pentus.The basic idea is, given a Lambek grammar,(text{Prim},, Sigma,, triangleleft,, S)construct a context-free grammar(V,, Sigma,, ::=,, S)with the same set of terminal symbols, thesame start symbol, with variables some (not all) typesVsubset text{Tp}(text{Prim}),!,and with a production ruleT::=text{s},!for each entryTtrianglelefttext{s}in the lexicon,and production rules T::=Gamma,!for certain sequents TleftarrowGammathat are derivable in the Lambek calculus.Of course, there are infinitely many typesand infinitely many derivable sequents, so inorder to make a finite grammar it is necessaryput a bound on the size of the types and sequentsthat are needed. The heart of Pentus's proofis to show that there is such a finite bound.

Notation

The notation in this field is not standardized. The notations used informal language theory, logic, category theory, and linguistics, conflictwith each other. In logic, arrows point to the more general from the more particular,that is, to the conclusion from the hypotheses. In this article,this convention is followed, i.e. the target of the arrow is the more general (inclusive) type.In logic, arrows usually point left to right. In this article this convention isreversed for consistency with the notation of context-free grammars, where thesingle non-terminal symbol is always on the left. We use the symbol ::=in a production rule as in Backus-Naur form. Some authors use an arrow, whichunfortunately may point in either direction, depending on whether the grammar isthought of as generating or recognizing the language.Some authors on categorial grammars write Bbackslash A instead ofAbackslash B. The convention used here follows Lambek and algebra.

Historical notes

The basic ideas of categorial grammar date from work by Kazimierz Ajdukiewicz (in 1935) and Yehoshua Bar-Hillel (in 1953). In 1958, Joachim Lambek introduced a syntactic calculus that formalized the function type constructors along with various rules for the combination of functions. This calculus is a forerunner oflinear logic in that it is a substructural logic. Montague grammar uses an ad hoc syntactic system for English that is based on the principles of categorial grammar. Although Montague's work is sometimes regarded as syntactically uninteresting, it helped to bolster interest in categorial grammar by associating it with a highly successful formal treatment of natural language semantics. More recent work in categorial grammar has focused on the improvement of syntactic coverage. One formalism which has received considerable attention in recent years is Steedman and Szabolcsi's combinatory categorial grammar which builds on combinatory logic invented by Moses Schönfinkel and Haskell Curry.There are a number of related formalisms of this kind in linguistics, such as type logical grammar and abstract categorial grammar.

Some definitions

Derivation: A derivation is a binary tree that encodes a proof.
Parse tree: A parse tree displays a derivation, showing the syntactic structure of a sentence.
Functor and Argument: In a right (left) function application, the node of the type AB (B/A) is called the functor, and the node of the type A is called an argument.
Functor-argument structure{{what| where's the definition?|date=July 2015}}

Refinements of categorial grammar

A variety of changes to categorial grammar have been proposed to improve syntactic coverage. Some of the most common ones are listed below.

Features and subcategories

Most systems of categorial grammar subdivide categories. The most common way to do this is by tagging them with features, such as person, gender, number, and tense. Sometimes only atomic categories are tagged in this way. In Montague grammar, it is traditional to subdivide function categories using a multiple slash convention, so A/B and A//B would be two distinct categories of left-applying functions, that took the same arguments but could be distinguished between by other functions taking them as arguments.

Function composition

Rules of function composition are included in many categorial grammars. An example of such a rule would be one that allowed the concatenation of a constituent of type A/B with one of type B/C to produce a new constituent of type A/C. The semantics of such a rule would simply involve the composition of the functions involved. Function composition is important in categorial accounts of conjunction and extraction, especially as they relate to phenomena like right node raising. The introduction of function composition into a categorial grammar leads to many kinds of derivational ambiguity that are vacuous in the sense that they do not correspond to semantic ambiguities.

Conjunction

Many categorial grammars include a typical conjunction rule, of the general form X CONJ X → X, where X is a category. Conjunction can generally be applied to nonstandard constituents resulting from type raising or function composition..

Discontinuity

The grammar is extended to handle linguistic phenomena such as discontinuous idioms, gapping and extraction.

See also

References

  • {{citation |last1=Curry|first1=Haskell B.|author1-link=Haskell Curry |first2=Richard |last2= Feys|year=1958 | title= Combinatory Logic |volume= 1 |publisher=North-Holland}}
  • {{citation |last1=Jacobson|first1= Pauline|author1-link=Pauline Jacobson |title=Towards a variable-free semantics. |journal=Linguistics and Philosophy|volume=22|year=1999 |pages=117–184}}
  • {{citation |last1=Lambek |first1=Joachim|author1-link=Joachim Lambek|year=1958 |title=The mathematics of sentence structure |journal=Amer. Math. Monthly|volume= 65 |pages=154–170}}
  • {{citation |last1=Pentus |first1= Mati |year=1997 |title=Lambek Calculus and Formal Grammars| publisher= Amer. Math. Soc. Transl.}}
  • {{citation |last1=Steedman|first1=Mark |author1-link=Mark Steedman|year=1987 |title=Combinatory grammars and parasitic gaps |journal=Natural Language and Linguistic Theory |volume=5|pages=403–439|doi=10.1007/bf00134555}}
  • {{citation |last1=Steedman|first1= Mark |author1-link=Mark Steedman|year=1996 |title=Surface Structure and Interpretation|publisher=The MIT Press}}
  • {{citation |last1=Steedman|first1=Mark |author1-link=Mark Steedman |year=2000 | title=The Syntactic Process |publisher=The MIT Press}}
  • BOOK, Szabolcsi, Anna, 1989, Bound variables in syntax (are there any?), Semantics and Contextual Expression, Bartsch, van Benthem, van Emde Boas, Foris, 294–318,
  • BOOK, Szabolcsi, Anna, 1992, Combinatory grammar and projection from the lexicon, Lexical Matters, CSLI Lecture Notes, 24, Sag, Szabolcsi, Stanford, CSLI Publications, 241–269,
  • {{citation |last1=Szabolcsi |first1=Anna |year=2003 |chapter=Binding on the fly: Cross-sentential anaphora in variable-free semantics|title=Resource Sensitivity in Binding and Anaphora|editor1= Kruijff |editor2=Oehrle |publisher=Kluwer |pages=215–229|doi=10.1007/978-94-010-0037-6_8}}
  • {{citation |last1=Morril |first1=Glyn |year=1995 |title=Discontinuity in categorial grammar |journal=Linguistics and Philosophy |publisher=Springer |pages=175-219|doi=10.1007/bf00985216}}

Further reading

  • Michael Moortgat, Categorial Type Logics, Chapter 2 in J. van Benthem and A. ter Meulen (eds.) Handbook of Logic and Language. Elsevier, 1997, {{ISBN|0-262-22053-9}}
  • Wojciech Buszkowski, Mathematical linguistics and proof theory, Chapter 12 in J. van Benthem and A. ter Meulen (eds.) Handbook of Logic and Language. Elsevier, 1997, {{ISBN|0-262-22053-9}}
  • BOOK, Gerhard Jäger, Anaphora and Type Logical Grammar, 2005, Springer, 978-1-4020-3904-1,
  • BOOK, Glyn Morrill, Categorial Grammar: Logical Syntax, Semantics, and Processing, 2010, Oxford University Press, 978-0-19-958986-9,
  • BOOK, Richard Moot, Christian Retore, The Logic of Categorial Grammars: A Deductive Account of Natural Language Syntax and Semantics, 2012, Springer Verlag, 978-3-642-31554-1,

External links



- content above as imported from Wikipedia
- "categorial grammar" does not exist on GetWiki (yet)
- time: 7:44am EST - Sun, Dec 16 2018
[ this remote article is provided by Wikipedia ]
LATEST EDITS [ see all ]
GETWIKI 09 MAY 2016
GETWIKI 18 OCT 2015
M.R.M. Parrott
Biographies
GETWIKI 20 AUG 2014
GETWIKI 19 AUG 2014
GETWIKI 18 AUG 2014
Wikinfo
Culture
CONNECT