Weight-based processing biases in a connectionist model of English and Japanese syntax acquisition

Franklin Chang
NTT Communication Sciences Laboratories, NTT Corp., Kyoto, Japan
[email protected]

Abstract

Since English and Japanese speakers have different word order biases in heavy NP shift, language acquisition must play a role in the development of the shift. To explore how this might happen, a connectionist model of syntax acquisition and sentence production was trained to produce English or Japanese sentences, and the resulting models were tested for their heavy NP shift behavior. The behavior of the models matched the human biases in each language, and language-specific features of the input were found to be important in the development of these word ordering behaviors.

Testing Heavy NP shift

• Heavy NP shift was tested in the model by presenting dative messages with Long Patient, Long Recipient, or All Light. Below is an example of a Long Patient message.
Message: 0A=SEND 0X=GIRL,INDEF 0Y=TOY,INDEF 0Z=BOY,INDEF 0E=PRESENT,SIMPLE,AA,XX,ZZ,YY,0Y1Y 1A=FLOAT 1Y=TOY,INDEF 1E=PRESENT,SIMPLE,AA,YY
English: a girl send -ss a boy a toy that float -ss.
Japanese: girl ga boy ni float toy wo send .
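The message notation above can be unpacked mechanically. The following is a minimal sketch, not the model's actual input code: the function names `parse_message` and `heavy_argument` are hypothetical, and the reading of a feature such as 0Y1Y as "the main-clause Y argument is modified by the embedded clause" is an assumption inferred from the poster's examples.

```python
import re

def parse_message(message):
    """Split a message string into {slot: [features]} (hypothetical helper)."""
    # Normalise "1Y= TOY" to "1Y=TOY" so each binding is one token.
    tokens = message.replace("= ", "=").split()
    bindings = {}
    for token in tokens:
        slot, _, value = token.partition("=")
        bindings[slot] = value.split(",")
    return bindings

def heavy_argument(bindings):
    """Return the main-clause slot letter linked to the embedded clause.

    Assumption: a feature like "0Y1Y" in the 0E slot marks that link.
    """
    for feature in bindings.get("0E", []):
        match = re.fullmatch(r"0([XYZ])1[XYZ]", feature)
        if match:
            return match.group(1)
    return None  # no embedded clause: an All Light message

message = ("0A=SEND 0X=GIRL,INDEF 0Y=TOY,INDEF 0Z=BOY,INDEF "
           "0E=PRESENT,SIMPLE,AA,XX,ZZ,YY,0Y1Y "
           "1A=FLOAT 1Y= TOY,INDEF 1E= PRESENT,SIMPLE,AA,YY")
parsed = parse_message(message)
print(parsed["0Y"])            # features bound to the Y slot
print(heavy_argument(parsed))  # slot letter of the long argument
```

In the example, Y is the patient (TOY) and Z is the recipient (BOY), so a result of "Y" corresponds to a Long Patient message.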

Heavy NP Shift in English and Japanese

• Heavy NP shift is a tendency for speakers to place long "heavy" noun phrases in noncanonical sentence positions.
• Ungrammatical/Awkward: "The boy gave to the girl the book"
• Grammatical: "The boy gave to the girl the book that he bought last week"
• English speakers tend to place long noun phrases LATER in sentences (Stallings, MacDonald, & O'Seaghdha, 1998)
• Japanese speakers tend to place these long phrases EARLIER in sentences (Hawkins, 2004; Hakuta, 1981; Yamashita & Chang, 2001)
• These language-specific heavy NP shift biases seem to manifest early in language development in both languages (de Marneffe et al., 2007; Hakuta, 1981)

A Connectionist Model of Syntax Acquisition and Sentence Production

• Chang, Dell, & Bock (2006) provided a connectionist model of syntax acquisition and sentence production that learns abstract syntactic representations in order to map between meanings/messages and word sequences.
• Dual-path Architecture: a sequencing system that learns syntactic representations and a meaning system that learns lexical-concept links.
• The model accounts for a wide range of data (structural priming, syntax acquisition, aphasia).
• The model was trained on message-sentence pairs generated from an input grammar.

• The dependent measure was the number of Recipient-before-Patient structures. The English data show that this recipient-early order is more common with long patients than with long recipients, and the Japanese data show the reverse bias. This pattern is significant over the whole of development (weight*language interaction, t = 13.89, p < 0.001) and is consistent with developmental data in both languages (de Marneffe et al., 2007; Hakuta, 1981).
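The dependent measure can be illustrated with a small counting sketch over the poster's own example sentences. The classification heuristics are assumptions made for this illustration only (the English prepositional dative is detected by the token "to", and Japanese order by the relative position of the particles "ni" and "wo"); the poster does not say how the model's output was actually coded.

```python
def recipient_first_english(sentence):
    # Assumption: a double-object dative (recipient first) contains no "to".
    return "to" not in sentence.split()

def recipient_first_japanese(sentence):
    # Assumption: the recipient is marked by "ni" and the patient by "wo".
    tokens = sentence.split()
    return tokens.index("ni") < tokens.index("wo")

english = [
    "a girl send -ss a boy a toy that float -ss .",
    "a girl send -ss a toy that float -ss to a boy .",
]
japanese = [
    "girl ga boy ni float toy wo send .",
    "girl ga float toy wo boy ni send .",
]

# Number of Recipient-before-Patient structures in each small sample.
english_count = sum(recipient_first_english(s) for s in english)
japanese_count = sum(recipient_first_japanese(s) for s in japanese)
print(english_count, japanese_count)
```

Each sample contains one recipient-first and one patient-first sentence, so both counts are 1 here. With real model output the counts would be tallied separately for Long Patient and Long Recipient messages.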

What causes heavy NP shift in the model?

• Structure selection in the model is guided by the activation of roles at the position where structural alternatives differ (the role at each choice point is labeled in parentheses below).
• English choice point = noun phrase after verb
  • a girl send -ss a boy a toy that float -ss. (RECIPIENT)
  • a girl send -ss a toy that float -ss to a boy . (PATIENT)
• Japanese choice point = noun phrase after subject
  • girl ga boy ni float toy wo send . (RECIPIENT)
  • girl ga float toy wo boy ni send . (PATIENT; action role of the embedded clause)

[Figure: Dual-path architecture. Sequencing System: Syntax, Event-semantics. Meaning System: Concepts, Roles. Output: Words.]

[Figure: Links between Event-semantics and Roles at the choice point by weight and language. English panels link Event-semantics to the roles PATIENT and RECIPIENT; Japanese panels link Event-semantics to the roles PATIENT, RECIPIENT, and 1ACTION.]
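The idea that role activation at the choice point selects the structure can be sketched as a weighted sum followed by a winner-take-all choice. This is a deliberately simplified illustration: the weight values below are invented for the example and are not taken from the trained model, the function names are hypothetical, and the real model also receives input from the sequencing system.

```python
def role_activations(event_semantics, weights):
    """Sum weighted input from active event-semantic units to each role unit."""
    totals = {}
    for (es_unit, role), weight in weights.items():
        activation = event_semantics.get(es_unit, 0.0)
        totals[role] = totals.get(role, 0.0) + weight * activation
    return totals

def choose_role(event_semantics, weights):
    """Pick the most active role at the choice point."""
    activations = role_activations(event_semantics, weights)
    return max(activations, key=activations.get)

# Hypothetical weights for an English-like model: the long-patient
# feature 0Y1Y pushes the RECIPIENT role to be produced first.
hypothetical_weights = {
    ("0Y1Y", "RECIPIENT"): 0.8,
    ("0Y1Y", "PATIENT"): -0.5,
    ("ZZ", "RECIPIENT"): 0.2,
    ("YY", "PATIENT"): 0.3,
}
long_patient_message = {"0Y1Y": 1.0, "YY": 1.0, "ZZ": 1.0}

print(choose_role(long_patient_message, hypothetical_weights))
```

With these invented weights, the long-patient message selects RECIPIENT at the choice point, which corresponds to the recipient-first (long phrase later) order described for English. Flipping the signs of the 0Y1Y weights would give the opposite, Japanese-like preference.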

Japanese and English Input for the Model

• The message encoded the concept-role bindings in the event as well as event-semantic information like tense and aspect. The message was paired with an English and a Japanese sentence. Below is an example dative.
MESSAGE: A=GIVE X=GIRL,DEF Y=KITE,INDEF Z=BOY,DEF ES=PAST,SIMPLE,AA,XX,ZZ,YY
English: The father give -ed the boy a kite
Japanese: father wa boy ni kite wo give -ta

• Unlike previous models, the present model can handle messages with two propositions.
MESSAGE: 0A=SHOW 0X=BOY,DEF 0Y=ORANGE,DEF 0Z=DOG,DEF 0E=PRESENT,SIMPLE,AA,XX,ZZ,YY,0Z1Y 1A=MOVE 1Y=DOG,DEF 1E=PRESENT,PROGRESSIVE,AA,YY
English: a boy show -ss a dog that is move -ing the orange
Japanese: boy ga move -te iru dog ni orange wo show

• Differences between English and Japanese:

English                                               Japanese
Verb-initial                                          Verb-final
Head before relative clause                           Head after relative clause
Pronouns                                              Argument omission
Alternations involve changes in syntactic functions   Alternations involve scrambling of order of arguments
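The pairing of one message with an English and a Japanese sentence, and the verb-position contrast in the table above, can be illustrated with a toy linearizer. This is only a sketch under strong assumptions: it handles a single-proposition dative in one fixed order, ignores articles, tense, aspect, and the wa/ga distinction, and the function name `linearize` is hypothetical. It is not the input grammar used to train the model.

```python
def linearize(bindings, language):
    """Order the concepts of a simple dative message for one language."""
    verb, x, y, z = (bindings[slot].lower() for slot in ("A", "X", "Y", "Z"))
    if language == "english":
        # Verb before its objects: X verb Z Y (recipient-first order).
        return f"{x} {verb} {z} {y}"
    if language == "japanese":
        # Verb-final, with case particles on each argument.
        return f"{x} ga {z} ni {y} wo {verb}"
    raise ValueError(f"unsupported language: {language}")

# Concept bindings only; definiteness and event-semantics are omitted.
message = {"A": "GIVE", "X": "GIRL", "Y": "KITE", "Z": "BOY"}

print(linearize(message, "english"))
print(linearize(message, "japanese"))
```

The same bindings yield a verb-medial string for English and a verb-final string with particles for Japanese, which is the kind of cross-language difference the model has to learn from its input.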

[Figure panel labels: Heavy Recipient and Heavy Patient for each language, each showing links from Event-semantics to Roles (1ACTION, RECIPIENT).]

Why did the model learn these biases?

• The Japanese model depends on event-semantics more than the English model does.
• Event-semantics emphasizes a heavy-first bias.
• Properties of the language being learned contribute to the bias.
  • Japanese: Verb-final, No Articles, Relative Clause (RC) before head
  • English: Not Verb-final, Articles, Relative Clause after head
• Models were trained on languages that vary all three of these features. All three features interact with weight (RC position * weight, t = 6.49, p < 0.001; Article * weight, t = 6.57, p < 0.001; Verb position * weight, t = 6.33, p < 0.001).
• Verbs and articles provide structural cues. Moving the verb to the end or removing articles reduces their influence and increases the role of event-semantics.
• Placing the RC before the head means that the embedded clause information is more important at the choice point, and event-semantics has extra information about embedded clauses.
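The set of artificial languages implied by varying the three features can be enumerated as follows. This is a sketch that assumes a full 2 x 2 x 2 crossing; the poster says only that the languages "vary all three of these features", and the dictionary keys and value labels are hypothetical names chosen for this example.

```python
from itertools import product

# Two settings per feature; the first setting of each is the Japanese value.
FEATURES = {
    "verb_position": ("verb-final", "not verb-final"),
    "articles": ("no articles", "articles"),
    "rc_position": ("RC before head", "RC after head"),
}

# Assumption: every combination of settings defines one training language.
languages = [dict(zip(FEATURES, combo)) for combo in product(*FEATURES.values())]

for language in languages:
    print(language)
```

Under this assumption there are eight languages; the first has all three Japanese settings and the last has all three English settings, with six mixed languages in between.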

Accuracy of English and Japanese models over development

• Twenty training sets were created (each with 40000 message-sentence pairs) and used to create twenty model subjects. Each model was tested on 400 test sentences. Accuracy was assessed in terms of grammaticality and in terms of conveying the appropriate message.
• Although the languages differ greatly, the model can learn both to similar levels of accuracy.

• The feature manipulations show that shift direction is controlled by these types of language features.
• Language features --> Bias for Event-semantics --> Heavy NP shift direction

References

Chang, F., Dell, G. S., & Bock, J. K. (2006). Becoming syntactic. Psychological Review, 113(2), 234-272.
Hawkins, J. A. (2004). Efficiency and complexity in grammars. New York City: Oxford University Press.
Hakuta, K. (1981). Grammatical description versus configurational arrangement in language acquisition: The case of relative clauses in Japanese. Cognition, 9, 197-236.
Stallings, L. M., MacDonald, M. C., & O'Seaghdha, P. G. (1998). Phrasal ordering constraints in sentence production: Phrase length and verb disposition in heavy-NP shift. Journal of Memory and Language, 39(3), 392-417.
de Marneffe, M.-C., Grimm, S., Priva, U. C., Lestrade, S., Ouzbek, G., Schnoebelen, T., et al. (2007). A statistical model of grammatical choices in children's production of dative sentences. Formal Approaches to Variation in Syntax, University of York, England.
Yamashita, H., & Chang, F. (2001). "Long before short" preference in the production of a head-final language. Cognition, 81(2), B45-B55.

Conclusion

• A connectionist model of syntax acquisition was able to explain the different direction of heavy NP shift in English and Japanese.
• During language acquisition, the model learns how meaning and syntax influence decisions about word order at different positions in a sentence.
• The position where heavy NP shift is determined is different in the two languages, and the relative bias of meaning and syntax at this position yields the difference in the direction of the shift in each language.

Presented at the 33rd Boston University Conference on Language Development, Boston