Human Sentence Processing
Doug Arnold
University of Essex
Salient facts about the Human Sentence Processor (HSP) include the following. Human sentence processing is:
• Fast, easy, and mandatory for everyone: a reflex;
• Robust (copes with fragmentary and ungrammatical input);
• Compositional (i.e. not list access);
• Incremental (at least word by word), implying ‘connectivity’;
• ‘Strategic’: uses many redundant information sources (constraint resolution);
• Deterministic: there appears to be always one favoured interpretation;
• Copes with ambiguity (local and global), hence non-deterministic: able to re-analyze;
• Incomplete wrt the grammar:
  – center embedding (but no problem with left and right embedding)
    (1) The rat the cat the baby stroked chased ate the cheese.
    (2) a. The rat stole the cheese.
        b. The rat [ (that) the cat chased ] stole the cheese.
        c. the cat [ (that) the baby stroked ]
        d. The rat [ (that) the cat [ (that) the baby stroked ] chased ] stole the cheese.
    (3) a. This is the baby that stroked the cat that chased the rat that stole the cheese.
        b. John’s brother’s father’s uncle . . .
  – ‘garden path’ sentences
    (4) The horse raced past the barn fell.
It has been suggested that the HSP uses a number of different strategies, for example:
• Minimal Attachment (MA)
• Late Closure (LC)
(Lexical preferences are also important; see Ford (1982).)

Late Closure (LC, aka Right Association): attach new material to the current constituent (i.e. lower down the tree, rather than higher).
(5) a. Sam said that Kim left yesterday.
    b. Sam said [ that Kim left yesterday ].
    c. Sam said [ that Kim left ] yesterday.
[Tree diagram for (5): S over NP and VP; VP over V and an embedded S (NP VP); ADVP ‘yesterday’.]
(6) a. Sam said that Kim saw Sandy on the bus.
    b. Sam said that Kim left in a funny voice.

Minimal Attachment (MA): prefer analyses that involve creating fewer nodes (e.g. (8a)).
(7) The child bought the toy for Kim.
(8) a. [S [NP the child ] [VP [V bought ] [NP the toy ] [PP for Kim ] ] ]
    b. [S [NP the child ] [VP [V bought ] [NP [NP the toy ] [PP for Kim ] ] ] ]
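The node-counting idea behind MA can be made concrete with a small sketch. The tuple encoding of trees, the function names, and the simplification of treating ‘the child’, ‘the toy’ and ‘for Kim’ as unanalysed leaves are all assumptions made for illustration; they are not part of any published model.

```python
# Trees are (label, child, child, ...) tuples; leaves are plain strings.
# This encoding is an illustrative assumption, not a standard format.

def count_nodes(tree):
    """Count the labelled (non-terminal) nodes in a tree."""
    if isinstance(tree, str):          # a word or word string, not a node
        return 0
    _label, *children = tree
    return 1 + sum(count_nodes(child) for child in children)

# Analysis (8a): the PP is a daughter of the VP (VP -> V NP PP).
vp_attach = ("S",
             ("NP", "the child"),
             ("VP", ("V", "bought"), ("NP", "the toy"), ("PP", "for Kim")))

# Analysis (8b): the PP is inside the object NP (NP -> NP PP).
np_attach = ("S",
             ("NP", "the child"),
             ("VP", ("V", "bought"),
                    ("NP", ("NP", "the toy"), ("PP", "for Kim"))))

def minimal_attachment(analyses):
    """Pick the analysis with the fewest nodes (hypothetical helper)."""
    return min(analyses, key=count_nodes)

print(count_nodes(vp_attach))   # 6
print(count_nodes(np_attach))   # 7
print(minimal_attachment([np_attach, vp_attach]) is vp_attach)   # True
```

The difference of one node comes entirely from the extra NP introduced by the rule NP -> NP PP. With a different rule for post-modified NPs the count, and so the prediction, could change.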
(LC involves attaching material low in the tree; the effect of MA is to attach higher up the tree, but only when the cost of low attachment would be more nodes. Notice that both LC and MA are very sensitive to the exact form of the rules in the grammar.)

Lexical Effects
The preferred reading of (9a) is (9b). This involves a violation of MA, which would favour attaching the PP to the VP, as in (9c).
(9) a. Sam wanted the dress on the rail.
    b. Sam wanted [ the dress on the rail ]
    c. Sam wanted [ the dress ] [ on the rail ] (not in the drawer)
Open Questions
• Relation to the Grammar
  – Competence Hypothesis (e.g. Kaplan and Bresnan (1982))
  – Type transparency (e.g. Berwick and Weinberg (1984))
• How is non-determinism realized (backtracking, parallelism)?
• Is there a syntactic parsing module?
  – Evidence for: attachment strategies;
  – Evidence against: attachment strategies.
Most introductions to psycholinguistics will have a chapter on sentence processing (i.e. parsing), e.g. Carroll (1994, Ch6), Garman (1990, Ch6). Allen (1987, Ch6) provides a straightforward introduction from an NLP perspective. Frazier (1988) is also useful. The first exploration of the sorts of parsing strategy that might be used by humans is Kimball (1973); Bever (1970) and Frazier and Fodor (1978) are other early references. The ‘determinism hypothesis’ was first proposed in Marcus (1980).

Shieber (1983) and Pereira (1985) point out that LC and MA (and some Garden Path phenomena) can be modeled by a shift-reduce parser which resolves conflicts by two general principles: reduce-reduce conflicts, where the choice is between two or more right-hand sides to reduce, are resolved by always choosing the longer right-hand side; and shift-reduce conflicts are resolved by shifting. (The relationship between Late Closure and choosing longer reductions is more easily seen; Minimal Attachment effects sometimes involve an interaction between the two principles, as well as the precise rules in the grammar. However, it should be clear that the overall effect is to prefer ‘flatter’ structures, which is consistent with MA.)

More recent discussion, and up-to-date references, can be found in Crocker (1995) and Crocker et al. (2000).
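The two conflict-resolution principles can be tried out on a toy example. The sketch below is not Shieber's or Pereira's parser: theirs is deterministic and driven by an LR table, whereas this one replaces the table with a depth-first search that tries shifting before reducing and longer right-hand sides before shorter ones, and returns the first complete parse it finds. The grammar, the category labels (SBAR, ADVP, C, ADV), the pre-tagged input and all function names are assumptions made for illustration.

```python
# Toy grammar (an assumption): (left-hand side, right-hand side) pairs.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("VP", ("V",)),
    ("VP", ("V", "NP")),
    ("VP", ("V", "NP", "PP")),
    ("VP", ("V", "SBAR")),
    ("VP", ("VP", "ADVP")),
    ("NP", ("NP", "PP")),
    ("PP", ("P", "NP")),
    ("SBAR", ("C", "S")),
    ("ADVP", ("ADV",)),
]

def reductions(stack):
    """Rules whose right-hand side matches the top of the stack, longest first."""
    cats = tuple(node[0] for node in stack)
    matches = [(lhs, rhs) for lhs, rhs in GRAMMAR
               if len(rhs) <= len(cats) and cats[len(cats) - len(rhs):] == rhs]
    return sorted(matches, key=lambda rule: len(rule[1]), reverse=True)

def parse(stack, tokens):
    """Return the first parse found under the two preferences, or None."""
    if not tokens and len(stack) == 1 and stack[0][0] == "S":
        return stack[0]
    if tokens:                           # prefer shifting to any reduction
        result = parse(stack + [tokens[0]], tokens[1:])
        if result is not None:
            return result
    for lhs, rhs in reductions(stack):   # prefer the longer right-hand side
        n = len(rhs)
        result = parse(stack[:-n] + [(lhs, *stack[-n:])], tokens)
        if result is not None:
            return result
    return None

def bracket(tree):
    """Render a tree as a labelled bracketing."""
    cat, *rest = tree
    if len(rest) == 1 and isinstance(rest[0], str):
        return f"[{cat} {rest[0]}]"
    return f"[{cat} " + " ".join(bracket(child) for child in rest) + "]"

# Pre-tagged input: (category, word) pairs; simple NPs are treated as one token.
pp_sentence = [("NP", "the child"), ("V", "bought"), ("NP", "the toy"),
               ("P", "for"), ("NP", "Kim")]
advp_sentence = [("NP", "Sam"), ("V", "said"), ("C", "that"),
                 ("NP", "Kim"), ("V", "left"), ("ADV", "yesterday")]

pp_tree = parse([], pp_sentence)
advp_tree = parse([], advp_sentence)
print(bracket(pp_tree))
print(bracket(advp_tree))
```

On these two inputs, the preference for the longer reduction yields the flat VP of (8a) for sentence (7), and the preference for shifting leaves ‘yesterday’ inside the embedded clause of (5), i.e. reading (5b). Both outcomes depend on the exact rules in the toy grammar.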
References
Allen, J. 1987. Natural language understanding. Menlo Park, Ca.: Benjamin Cummings Pub. Co., Inc.
Berwick, R.C. and Weinberg, A.S. 1984. The Grammatical Basis of Linguistic Performance. Cambridge, Mass.: MIT Press.
Bever, T. 1970. The Cognitive Basis for Linguistic Structures. In J. R. Hayes (ed.), Cognition and the Development of Language, pages 270–353, N.Y.: Wiley.
Carroll, David W. 1994. Psychology of language. Pacific Grove, Calif.: Brooks/Cole Pub.
Crocker, Matthew, Pickering, Martin and Charles Clifton, Jr (eds.). 2000. Architectures And Mechanisms For Language Processing. Cambridge: Cambridge University Press.
Crocker, Matthew W. 1995. Computational Psycholinguistics: An Interdisciplinary Approach to the Study of Language. Dordrecht: Kluwer Academic.
Ford, Marilyn. 1982. Sentence planning units: Implications for the speaker’s representation of meaningful relations underlying sentences. In Joan Bresnan (ed.), The Mental Representation of Grammatical Relations, pages 797–827, Cambridge, MA: The MIT Press.
Frazier, Lyn. 1988. Grammar and language processing. In Frederick J Newmeyer (ed.), Linguistics: The Cambridge Survey, volume 2, pages 15–34, Cambridge: CUP.
Frazier, Lyn and Fodor, Janet Dean. 1978. The Sausage Machine: a new two stage parsing model. Cognition 6, 291–325.
Garman, Michael. 1990. Psycholinguistics. Cambridge: CUP.
Kaplan, Ronald M. and Bresnan, Joan. 1982. Lexical-Functional Grammar: A formal system for grammatical representation. In Joan Bresnan (ed.), The Mental Representation of Grammatical Relations, pages 173–281, Cambridge, MA: The MIT Press. Reprinted in Mary Dalrymple, Ronald M. Kaplan, John Maxwell, and Annie Zaenen (eds.), Formal Issues in Lexical-Functional Grammar, pages 29–130, Stanford: Center for the Study of Language and Information, 1995.
Kimball, John. 1973. Seven Principles of surface structure parsing in natural language. Cognition 2, 15–47.
Marcus, Mitchell P. 1980. A Theory of Syntactic Recognition for Natural Language. Cambridge, Ma.: MIT Press.
Pereira, Fernando C. N. 1985. A new characterization of attachment preferences. In David R. Dowty, Lauri Karttunen and Arnold M. Zwicky (eds.), Natural Language Parsing, pages 307–319, Cambridge: Cambridge University Press.
Shieber, Stuart M. 1983. Sentence disambiguation by a shift-reduce parsing technique. In ACL Proceedings, 21st Annual Meeting, pages 113–118.