The Institute of Electronics, Information and Communication Engineers (IEICE)

Technical Report of IEICE, TL2008-34 (2008-08)

Message Planning and Lexical Encoding in Sentence Production: A View from the Japanese Existential Construction

Hajime ONO† and Hiromu SAKAI‡

†Faculty of Foreign Languages, Kansai Gaidai University
Nakamiya Higashinocho 16-1, Hirakata-shi, Osaka, 573-1001 Japan
‡Graduate School of Education, Hiroshima University
Kagamiyama 1-1-1, Higashihiroshima-shi, Hiroshima, 739-8524 Japan
E-mail: †[email protected], ‡[email protected]

Abstract: Griffin & Bock (2000) monitored the eye movements of speakers as they described simple events and found that speakers gaze at a certain object before they decide what to say about it; they took this result to support Wundt's idea of incremental lexical encoding. Sentence formulation, however, involves other stages as well, such as message planning (including determination of the speaker's point of view), grammatical function assignment, and linearization of constituents (Bock & Levelt, 1994). To examine the time course of these multiple stages, we conducted an eye-tracking experiment in which participants were required to describe a picture with an existential construction in Japanese, where the elements are aligned in Location-Object word order. The results show that speakers initially gaze at the Object during the message planning stage, and then shift their gaze to the Location during the lexical encoding stage.

Key words: production, eye-tracking, gaze, sentence, Japanese, existential construction, word order, linearization, message

1. Introduction

As research on the mechanism of language production progresses, it has become obvious that language production involves extremely complex processes (Fromkin, 1968; Levelt, 1989). The visual world paradigm, which uses eye-tracking measurement, has become quite popular for investigating what information speakers integrate, and exactly when they integrate it, in the course of producing an utterance (Tanenhaus et al., 1995). Griffin & Bock (2000) found that speakers gaze at a certain object before they actually produce an utterance in a picture description task, and argued that the timing of the gaze reflects the process of sentence formulation rather than the apprehension of the pictures.

Sentence formulation, however, obviously involves multiple stages, such as determination of the speaker's point of view and focus structure, assignment of grammatical functions to nouns, ordering of the elements in the sentence, and so on (Bock & Levelt, 1994). In Griffin & Bock (2000), the participants only produced regular transitive sentences that start with a subject noun phrase (e.g., The mouse is squirting the turtle with water). It is well known that subject noun phrases are not only, in general, the first element in a transitive sentence, but also tend to attract the speaker's point of view and to be salient in the given discourse. It therefore seems fair to say that measuring the gaze timing of the subject noun phrase by itself does not give us a good grasp of which stage(s) of sentence formulation the gaze timing reflects.

We conducted an eye-tracking experiment (the visual world paradigm) in which the participants were required to describe a scene using an existential construction in Japanese, where the elements are aligned in a non-canonical "Location (Locative PP) / Locatum (Nominative NP) / Verb" order. We call that particular word order non-canonical in the sense that a locative PP precedes a nominative NP. We contrasted the results with gaze data obtained from the same pictures when the participants uttered another sentence type involving a canonical "Theme (Nominative NP) / Verb" order. The gaze onset of the first noun phrase in the two word order conditions gives us information about different stages of the speech production model.

2. Experiment

2.1 Participants, material, procedure

Students at Hiroshima University (N=20) participated in the study. Data from 3 participants were eliminated due to mechanical or other problems during the experiment; the analyses below are based on the remaining 17 participants. 24 pairs of simple line-drawn pictures were prepared as the target pictures. The pictures were initially selected from the International Picture Naming Project (UCSD) collection and further checked in a pilot naming task, so that five native speakers of Japanese labeled each picture with the same name. Among the 24 pairs, nine pairs depicted animate entities and the other pairs inanimate ones. The pictures were paired (e.g., an elephant and a fish) so that one member of each pair is in general heavier than the other. 48 pairs of filler pictures were also selected, mostly from the same collection.

Two task blocks (Simple and Complex) were prepared. In the Simple Task block, the participants were told to describe what they saw on the screen in the sentence form "[a heavy object] and [a light object] exist," using a coordinate structure in which the heavy object must be uttered first. For instance, when the participants saw a picture of a bucket and a paper clip, they had to utter a sentence such as (1), since in general a bucket is heavier than a paper clip.

(1) baketu to kurippu-ga arimasu
    bucket and clip-NOM exist

In the Complex Task block, on the other hand, the participants were asked to describe where the heavier entity is located on the screen. They were required to use a sentence form like the one shown in (2), in which they mention the location of the heavier entity (kurippu-no yoko 'next to a paper clip') before the locatum object itself (baketu-ga 'bucket-NOM').

(2) kurippu-no yoko-ni baketu-ga arimasu
    clip-GEN side-at bucket-NOM exist

In all of the target pairs, the two objects were placed side by side, while in some of the filler pictures the items were aligned vertically. The participants were told not to use the words hidari 'left' or migi 'right' and were encouraged to use yoko 'side', so that they would not be confused about left and right. Each participant performed both tasks, and the order of the two task blocks was counterbalanced. Using the 24 pairs of pictures, four lists were created by counterbalancing (a) where each object appears (left or right on the screen) and (b) in which task block the pair appears (Simple or Complex).
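To make the design concrete, the list construction can be pictured as a short script. This is only an illustrative reconstruction under the assumption of a Latin-square rotation; the pair labels, condition names, and function below are ours, not part of the original materials.

    # Illustrative Latin-square rotation producing the four lists (assumed scheme).
    from itertools import product

    pairs = ["pair%02d" % i for i in range(1, 25)]          # 24 target pairs
    conditions = list(product(["left", "right"],            # heavy object's side
                              ["Simple", "Complex"]))       # task block
    # four condition combinations in total

    def build_lists(pairs, conditions, n_lists=4):
        """Each pair appears once per list, rotating through the four
        side-by-block combinations across the four lists."""
        all_lists = []
        for shift in range(n_lists):
            trials = []
            for i, pair in enumerate(pairs):
                side, block = conditions[(i + shift) % len(conditions)]
                trials.append({"pair": pair, "heavy_side": side, "block": block})
            all_lists.append(trials)
        return all_lists

    lists = build_lists(pairs, conditions)
    # each of the 4 lists assigns 6 of the 24 pairs to every combination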

Stimuli were presented using E-Prime (Psychology Software Tools, Inc.). An experimental trial began when the participant pressed a button on the Serial Response Box (see below). After the button press, a crosshair appeared and remained for 500 ms; the pictures then appeared and remained on screen for 3500 ms. Participants were instructed to describe the scene as soon as possible, using the sentence form appropriate to the given task block. Each task session was preceded by six practice trials, and the experimenter made sure that the participants understood the whole procedure. The response latency of each trial was recorded by the voice key that is part of the Serial Response Box, working with E-Prime. The utterances were also recorded with an IC recorder so that the experimenter could check whether the participants followed the procedure and produced correct utterances.

Eye gaze during the experiment was recorded with a Tobii 1750 eye tracker (Tobii Technology: Stockholm, Sweden). Participants' gaze data were recorded while the target pictures were shown on the screen, at a sampling rate of 50 Hz; in general, 50 data points were recorded per second. Before the experiment began, the participants received an introduction to the eye-tracking system, and calibration was done with five points on the screen; the calibration information from each participant was used during recording. A short break was available between the tasks, and in general each task block took about 6 minutes. Each gaze sample was classified as (a) looking at the picture on the left, (b) looking at the picture on the right, or (c) looking at neither.
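The classification step can be pictured as follows. This is a minimal sketch under our own assumptions about the data format (normalized x-coordinates and hypothetical picture regions); it is not the authors' actual analysis code.

    # Binning 50 Hz gaze samples into left / right / neither (assumed format).
    LEFT_X = (0.05, 0.45)     # hypothetical horizontal extent of the left picture
    RIGHT_X = (0.55, 0.95)    # hypothetical horizontal extent of the right picture

    def classify_sample(x, valid=True):
        """Assign one gaze sample to the picture it falls on."""
        if not valid:                          # blink or tracking loss
            return "neither"
        if LEFT_X[0] <= x <= LEFT_X[1]:
            return "left"
        if RIGHT_X[0] <= x <= RIGHT_X[1]:
            return "right"
        return "neither"

    # 3500 ms of picture presentation at 50 Hz yields 175 samples per trial
    xs = [0.20, 0.22, 0.70, 0.50, 0.72]        # toy normalized x-coordinates
    print([classify_sample(x) for x in xs])
    # -> ['left', 'left', 'right', 'neither', 'right']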

2.2 Analysis and Results

First, we report the data other than the gaze measurement. The mean accuracy across participants (i.e., the rate of target trials in which a participant followed the instructions accurately) was 88.2% (SD = 10.1). Most errors came from trials in which the participant mentioned the heavy object first in the Complex Task block or the non-heavy object first in the Simple Task block (and therefore did not follow the task instructions), could not name an object correctly, or could not start naming while the picture was on screen.

Second, the latency data were submitted to an ANOVA with three factors: Task (Simple or Complex), Location (heavy object on the left or right side of the screen), and Task Block (first or second). There was a main effect of Task Block: latencies in the block performed in the first half were shorter than in the block performed in the second half (F1=4.84, p<.05; F2=6.14, p<.03). There was no interaction among the three factors. Table 1 shows the mean latencies, SD, and SE classified by Task Block. Note that the mean latency of whichever task was performed in the first half was about 100 ms shorter than that of the task performed in the second half.

Table 1: Mean latencies for each task block

    Task Block     Mean Latency    SD     SE
    First half     1616.5 ms       455    34
    Second half    1713.8 ms       463    34

Table 2 shows the mean latencies for each task condition. Though the difference was not statistically significant, the mean latency in the Complex Task condition was slightly longer than that in the Simple Task condition.

Table 2: Mean latencies for each task

    Task       Mean Latency    SD     SE
    Complex    1689.5 ms       455    34
    Simple     1642.5 ms       467    35

Next, we report the gaze data. As illustrated above, two line-drawn pictures were presented side by side on the target trials, and one of the objects was in general heavier than the other. We first calculated the gaze rate (whether the participants were looking at the heavy or the non-heavy object) at each time point in each task condition. Figure 1 illustrates the gaze data during the Simple Task, and Figure 2 those during the Complex Task.
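Concretely, the per-time-point gaze rate can be reconstructed along the following lines. This is a sketch under our assumptions about the data layout; the labels "heavy"/"nonheavy" presuppose that the left/right samples have already been mapped onto the heavy and non-heavy pictures of each trial.

    # Gaze rate per 20 ms sample: the proportion of trials whose gaze is on
    # the heavy (or non-heavy) object at that moment (assumed layout).
    import numpy as np

    N_SAMPLES = 150    # 3 s at 50 Hz

    def gaze_rates(trials):
        """trials: per-trial label sequences of length N_SAMPLES, each label
        'heavy', 'nonheavy', or 'neither'. Returns two per-sample rate arrays."""
        labels = np.array([t[:N_SAMPLES] for t in trials])
        heavy_rate = (labels == "heavy").mean(axis=0)
        nonheavy_rate = (labels == "nonheavy").mean(axis=0)
        return heavy_rate, nonheavy_rate       # need not sum to 1 ('neither')

    # toy data: three trials, two of which switch objects midway
    trials = [
        ["heavy"] * 75 + ["nonheavy"] * 75,
        ["heavy"] * 50 + ["neither"] * 25 + ["nonheavy"] * 75,
        ["nonheavy"] * 150,
    ]
    heavy_rate, nonheavy_rate = gaze_rates(trials)
    print(heavy_rate[0], nonheavy_rate[0])     # 0.666..., 0.333...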

The overall pattern in the Simple Task condition (in which the participants had to mention the heavy object first) showed that, at 200 ms after the onset of picture presentation, the gaze rate for the non-heavy object (dark line) was larger than that for the heavy object (gray line), though the difference was very small. Soon, however, the gaze rate for the heavy object increased, and that pattern remained for about 1000 ms. Then the pattern reversed: the gaze rate for the non-heavy object became larger than that for the heavy object, and remained so at least until 3000 ms after picture onset. As for the overall pattern in the Complex Task condition (in which the participants had to name the non-heavy object first), as seen in Figure 2, not only what the participants looked at but also the overall shape of the pattern was quite different from the Simple Task condition. At 200 ms after picture onset there was a major increase in the gaze rate for the heavy object for a short period of time. Then, around 800 ms after picture onset, the gaze rate for the non-heavy object exhibited a sharp increase. Finally, from about 1400 ms after picture onset, the participants went back to looking at the heavy object.

Figure 1: The mean gaze rate (vertical) at each time point (horizontal; 50 Hz sampling rate) during the Simple Task, until 3 s after the onset of picture presentation. The mean rate of gazing at the heavy object is shown as a thick line, that for the non-heavy object as a thin line. Since the participants may look at neither object, the two rates do not add up to 1.

Figure 2: The mean gaze rate (vertical) at each time point (horizontal; 50 Hz sampling rate) during the Complex Task, until 3 s after the onset of picture presentation.

Here we report the statistical results for the gaze patterns. The data were submitted to a repeated-measures ANOVA with Task (Simple vs. Complex) and Gaze (Heavy vs. Non-Heavy) as within-subject factors. Only significant or marginally significant results are reported (Table 3). The statistics were calculated over three time frames: (a) 0 to 1000 ms after picture onset, (b) 0 to 600 ms, and (c) 0 to 400 ms. In every time frame there was a significant or marginally significant main effect of Gaze, which seems to be due to the overall larger gaze rate for the heavy object compared with that for the non-heavy object.

Table 3: Statistical results for the gaze rates (F1, F2; * significant, # marginally significant; only effects reaching at least marginal significance are shown)

    Time Frame (ms)    Gaze F1, F2    Gaze p (F1, F2)       Task x Gaze F1, F2    Task x Gaze p (F1, F2)
    < 1000             13.32, 3.92    < 0.01 *, < 0.07 #    16.38, 18.90          < 0.01 *, < 0.01 *
    < 600              6.01, 4.75     < 0.03 *, < 0.05 *    -                     -
    < 400              3.57, 3.80     < 0.08 #, < 0.07 #    3.62, 3.94            < 0.08 #, < 0.06 #
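For the by-participant (F1) analysis, the repeated-measures ANOVA can be sketched as below; the same call over item means yields F2. This is our illustration with simulated data and assumed column names, not the original analysis script.

    # Repeated-measures ANOVA with Task and Gaze as within-subject factors
    # (illustrative; random numbers stand in for the real cell means).
    import numpy as np
    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    rng = np.random.default_rng(0)
    data = pd.DataFrame({
        "participant": [p for p in range(1, 18) for _ in range(4)],
        "task": ["Simple", "Simple", "Complex", "Complex"] * 17,
        "gaze": ["heavy", "nonheavy"] * 34,
        "rate": rng.uniform(0, 1, 17 * 4),   # mean gaze rate in a time frame
    })

    res = AnovaRM(data, depvar="rate", subject="participant",
                  within=["task", "gaze"]).fit()
    print(res)   # F and p for Task, Gaze, and the Task x Gaze interaction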

Next, we report the interaction between Task and Gaze. Interactions between the two factors were observed in (a) 0 to 1000 ms and (c) 0 to 400 ms after picture onset, but the two interactions were clearly qualitatively different. The interaction in the 0 to 1000 ms frame reflected a simple main effect of Gaze in the Simple Task, due to the large gaze rate for the heavy object (F1=31.02, p<.01; F2=14.98, p<.01); there was no simple main effect of Gaze in the Complex Task (Fs<1). The interaction in the 0 to 400 ms frame, by contrast, reflected a significant simple main effect of Gaze in the Complex Task, due to the larger gaze rate for the heavy object (F1=5.35, p<.04; F2=6.26, p<.03); there was no simple main effect in the Simple Task (Fs<1).

3. Discussion and Summary

Recall that the response latency, measured by the voice key, was shorter in the first half of the experiment than in the second half. Usually participants respond faster as an experiment progresses, since they get used to the procedure, but in the current experiment no such facilitation was observed. This effect also seems to have nothing to do with which task condition, Simple or Complex, the participants were engaged in. However, we must be careful about drawing any strong conclusion from this result. First, we did not eliminate trials in which the participants uttered disfluencies (e.g., "etto," "eee"). It is therefore not entirely clear whether the recorded response latencies accurately correspond to the onset of the first noun phrase the participants uttered. Obviously, we would like to overcome such technical limitations, but that is left for future studies. Furthermore, the participants were always asked to mention the heavy object first in the Simple Task condition and the non-heavy object first in the Complex Task condition; the counterbalancing between the task conditions was therefore not perfect. We plan to run an experiment in which the participants are asked to mention either the heavy or the non-heavy object first, depending on the task condition, so that we can obtain a much finer measurement of the response latencies. At this moment, then, it is quite hard to say whether, and to what extent, the response latency reflects any specific stage(s) of the production process in a fully meaningful way.

Next, we discuss the results of the gaze measurement. As mentioned in the Introduction, the production model argued for by Bock & Levelt (1994) specifies multiple levels in producing a sentence. For instance, it is claimed that there is a message planning stage in which the speaker decides the point of view or what is to be focused in the utterance. It is also argued that there is a stage in which grammatical relations and thematic role assignment to noun phrases are determined. Finally, there is a stage in which the speaker decides the word order among the elements in the sentence. We use their model in discussing the gaze data.

There is a common pattern in both tasks: a major gaze switch from the heavy to the non-heavy object in the Simple Task condition, and one from the non-heavy to the heavy object in the Complex Task condition. Such patterns can be interpreted quite naturally as major gaze shifts reflecting the task requirement: in the Simple Task condition the participants have to mention the heavy object first, and in the Complex Task condition the non-heavy object first. In other words, in the Simple Task condition the participants gaze at the heavy object for about 1000 ms and then turn to the non-heavy object; conversely, in the Complex Task condition they gaze at the non-heavy object for about 1000 ms and then switch to the heavy object in order to mention it. Following the model of Bock & Levelt (1994), this gaze pattern might reflect the process in which the elements of the sentence are linearized, or the process in which the linearized material is converted into the sounds to be pronounced. Considering the timing of the switch, it could also be speculated that it occurs 300-400 ms before the participants actually start to utter anything, if we take at face value the overall reaction latencies of about 1650-1700 ms after picture onset. But, as discussed above, the sensitivity of the reaction latencies was not satisfactory, so it is hard to draw a strong conclusion on this point.

It is quite noteworthy that the difference between the gaze patterns in the two task conditions was not simply due to the difference in which object had to be named first. In fact, the graph in Figure 1 cannot be produced just by swapping the heavy and non-heavy objects in Figure 2. To highlight that point, we calculated the mean difference gaze rates shown in Figure 3.

Figure 3: The mean difference gaze rate (vertical) at each time point (horizontal; 50 Hz sampling rate), until 3 s after the onset of picture presentation. The rate was calculated as [gaze rate for the heavy object] - [gaze rate for the non-heavy object] in the Complex Task condition. To allow a parallel comparison, the gaze rate in the Simple Task condition was multiplied by -1. The lightly shaded area marks the time frame of approximately 0 to 400 ms.
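The curves in Figure 3 can be reproduced from the two gaze-rate time courses as follows (a sketch under the same assumed data layout as above). After the sign flip, positive values in both conditions correspond to gaze at the object that is mentioned second.

    # Difference gaze rates as in Figure 3 (illustrative reconstruction).
    import numpy as np

    def difference_curve(heavy_rate, nonheavy_rate, task):
        """heavy - nonheavy for the Complex Task; sign-flipped for the
        Simple Task, so the two conditions are directly comparable."""
        diff = np.asarray(heavy_rate) - np.asarray(nonheavy_rate)
        return diff if task == "Complex" else -diff

    # toy curves: gaze moves from the first-named object to the other one
    heavy = np.array([0.7, 0.6, 0.4, 0.3])
    nonheavy = np.array([0.2, 0.3, 0.5, 0.6])
    print(difference_curve(heavy, nonheavy, "Simple"))    # [-0.5 -0.3  0.1  0.3]
    print(difference_curve(heavy, nonheavy, "Complex"))   # [ 0.5  0.3 -0.1 -0.3]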

Figure 3 lets us compare how the gaze patterns shifted in the two task conditions. In the Complex Task condition, within 50 ms after picture onset the gaze rate for the heavy object was much larger than that for the non-heavy object, suggesting a strong drive to look at the heavy object. No such major difference was found in the Simple Task condition. This observation was confirmed by the significant interaction found in the 0 to 400 ms time frame.

Let us consider how this finding fits into the speech production model of Bock & Levelt (1994). To perform the task, the participants must first determine which object is heavier; that in turn determines which object they have to mention first. In the Simple Task condition, the ordering of the objects is simply reflected in the order within the subject noun phrase, as in the sentence form in (3).

(3) [Heavy AND Non-Heavy]-NOM exist

The two objects are, however, treated equally with respect to the properties relevant at the message planning level, such as the speaker's point of view and focus, as well as the thematic role they bear in the sentence. In the Complex Task condition, on the other hand, the decision about which object is heavy has a major consequence for the grammatical properties of the sentence, namely which object becomes the subject noun phrase. The participants need to assign the subject role to the heavy object and the locative role to the non-heavy object in the sentence form in (4).

(4) [Non-Heavy]-SIDE-AT [Heavy]-NOM exist

Moreover, in the Complex Task condition it is the subject noun phrase, not the locative expression, that is focused, so that producing a sentence in this condition requires rather complicated operations at the message planning stage or at the stage where grammatical functions are determined. The gaze pattern for the heavy object observed in the early time frame can thus be considered a cost increase due to the complications at those stages.

Summarizing, the results of the present study show that eye-tracking measurement during the production of non-canonical sentence patterns, such as Japanese locative sentences, provides useful information for investigating the multiple stages assumed by current speech production models. Investigating whether the cost increase arises at the message planning stage or at the grammatical function assignment stage is left for future study.

4. References
[1] J. K. Bock and W. J. M. Levelt, "Language production: Grammatical encoding," in Handbook of Psycholinguistics, ed. M. A. Gernsbacher, pp. 945–984, Academic Press, San Diego, CA, 1994.
[2] V. Fromkin, "Speculations on performance models," Journal of Linguistics, 4, pp. 1–152, 1968.
[3] Z. Griffin and K. Bock, "What the eyes say about speaking," Psychological Science, 11(4), pp. 274–279, 2000.
[4] W. J. M. Levelt, Speaking: From Intention to Articulation, MIT Press, Cambridge, MA, 1989.
[5] M. K. Tanenhaus, M. J. Spivey-Knowlton, K. M. Eberhard, and J. C. Sedivy, "Integration of visual and linguistic information in spoken language comprehension," Science, 268, pp. 1632–1634, 1995.

Acknowledgments
This research was supported by a Grant-in-Aid for Scientific Research (B) "Neurocognitive basis for the processing of compound words in spoken language" (PI: Hiromu Sakai, #17320064) from the Japan Society for the Promotion of Science; by a Grant-in-Aid for Scientific Research on Priority Areas (System study on higher-order brain functions) "Cortical functions for recursive linguistic computation: An investigation through syntactic and lexical priming" (PI: Hiromu Sakai, #18020022) from the Ministry of Education, Culture, Sports, Science and Technology of Japan; and by a Grant-in-Aid for Scientific Research (B) "An integrated research on eventuality expressions based on Chinese-Japanese contrastive studies" (PI: Shen Li, #19320064) from the Japan Society for the Promotion of Science.

