Joanna Jendryczka-Wierszycka
Research on Learner Spoken Corpora
ZAJęK... i wszystko jasne!
IFA, 25.04.08

What is LINDSEI?
The Louvain International Database of Spoken English Interlanguage
● Corpus of spoken learner language
● 11 different mother tongue backgrounds: Bulgarian, Chinese, Dutch, French, German, Greek, Italian, Japanese, Polish, Spanish, Swedish
● LOCNEC (the Louvain Corpus of Native English Conversation), a comparable corpus of interviews with native speakers of English, has been compiled

Why do we need LINDSEI?
● to learn about the features of advanced Polglish spoken language
● to investigate possible L2 learner strategies of Polglish speakers
● to present the findings in comparison with other L2 background spoken corpus data
● to juxtapose with a comparable corpus of native English data
● to compare spoken and written language in a “like with like” manner

Polish research on LINDSEI
● vagueness tags in Polglish: presentation for PALC 2007, post-conference article in press
● vagueness tags in Polglish: presentation for PLM 2007
● negative adjective premodification: presentation for EUROCALL 2008
● discourse markers: paper for TALC 2008

LINDSEI webpages
● main LINDSEI webpage
  http://www.fltr.ucl.ac.be/FLTR/GERM/ETAN/CECL/Cecl-Projects/Lindsei/lindsei.htm
● Center for English Corpus Linguistics (CECL) at the University of Louvain
  http://www.fltr.ucl.ac.be/fltr/germ/etan/cecl/cecl.html
● Polish LINDSEI webpage (with this handout online)
  http://joanna.jendryczka.googlepages.com/plindsei
LINDSEI partner institutions & resources
CSLP (The Center for Speech and Language Processing)
  http://ifa.amu.edu.pl/cslp/
Annotation with CLAWS (the Constituent Likelihood Automatic Word-tagging System)
  http://ucrel.lancs.ac.uk/claws/
Open Source software
  SoundScriber: http://www-personal.umich.edu/~ebreck/sscriber.html
  OpenOffice: http://www.openoffice.org/

Selected LINDSEI bibliography

Brand, Christiane & Susanne Kämmerer (2006) “The Louvain International Database of Spoken English Interlanguage (LINDSEI): compiling the German component”. In Sabine Braun, Kurt Kohn & Joybrato Mukherjee (eds) Corpus Technology and Language Pedagogy: New Resources, New Tools, New Methods (pp. 127-140). Frankfurt am Main: Peter Lang.

Chen, Xiao (2004) “Personal referential strategies in Chinese EFL learners’ oral narratives”. In Anping He, Longyin Wang, Manfei Xu & Xiao Chen (eds) Application of Corpora to Foreign Language Education: Theory and Practice (pp. 351-365). Guangzhou: Guangdong Higher Education Press.

Cheng, Chunmei (2005) “Segmental errors in Chinese learners’ oral English”. In Anping He, Guangkeng He, Xiao Chen, Longyin Wang, Manfei Xu & Chunmei Cheng (eds) Research and Application of Corpus Linguistics (pp. 117-120). Changchun: Northeast Normal University Press.

De Cock, S. (1998) “A recurrent word combination approach to the study of formulae in the speech of native and non-native speakers of English”. International Journal of Corpus Linguistics 3(1): 59-80.

De Cock, S. (2000) “‘Repetitive phrasal chunkiness’ and advanced EFL speech and writing”. In C. Mair & M. Hundt (eds) Corpus Linguistics and Linguistic Theory. Papers from the Twentieth International Conference on English Language Research on Computerized Corpora (ICAME 20) – Freiburg im Breisgau 1999 (pp. 51-68). Amsterdam & Atlanta: Rodopi.

De Cock, S. (2002) “Pragmatic prefabs in learners’ dictionaries”. In A. Braasch & C. Povlsen (eds) Proceedings of the Tenth EURALEX International Congress, EURALEX 2002, Copenhagen, Denmark, August 13-17 2002, Volume II (pp. 471-481). Center for Sprogteknologi.

De Cock, S. (2003) Recurrent Sequences of Words in Native Speaker and Advanced Learner Spoken and Written English. A Corpus-driven Study. Unpublished PhD Thesis. Université catholique de Louvain.

De Cock, S., S. Granger, G. Leech & T. McEnery (1998) “An automated approach to the phrasicon of EFL learners”. In S. Granger (ed.) Learner English on Computer (pp. 67-79). London & New York: Addison Wesley Longman.

Gilquin, Gaëtanelle (2008) “Hesitation markers among EFL learners: pragmatic deficiency or difference?”. In Jesús Romero-Trillo (ed.) Pragmatics and Corpus Linguistics: A Mutualistic Entente. Berlin, Heidelberg & New York: Mouton de Gruyter. In press.

Hugon, Claire (2007) Disentangling Register and Transfer Effects in Second Language Acquisition. A Corpus-based Study of High-frequency Verbs in Native and Learner Speech and Writing. Unpublished M.A. Thesis. Université catholique de Louvain.

Jendryczka-Wierszycka, Joanna (2006) Lexical Bundles in Polish Learner Speech: A Study Based on the PLINDSEI Corpus of Spoken Learner English. Unpublished M.A. Thesis. Adam Mickiewicz University.

Jendryczka-Wierszycka, Joanna (in press) “Vagueness in Polglish Speech”. In Lewandowska-Tomaszczyk (ed.) PALC'07: Practical Applications in Language and Computers. Papers from the International Conference at the University of Łódź, 19-22 April 2007. Frankfurt am Main: Peter Lang Verlag.

Kämmerer, Susanne (2006) Interference in Advanced Spoken English Interlanguage: A Corpus-based Study. Unpublished Diploma Thesis. University of Giessen.

Kaneko, T. (2002) “Potential of learner corpora for SLA research and language teaching”. Gakuen 741: 1-15. Tokyo, Japan: Showa Women’s University.

Kaneko, T. (2004) “The use of past tense forms by Japanese learners of English”. In Junsaku Nakamura, Nagayuki Inoue & Tomoji Tabata (eds) English Corpora under Japanese Eyes (pp. 215-228). Amsterdam: Rodopi.

Kaneko, T. (2007) “Why so many article errors? Use of articles by Japanese learners of English”. Gakuen 798: 1-16. Tokyo, Japan: Showa Women’s University.

Kaneko, T., T. Kobayashi & M. Takami (2006) “The use of emotional expressions in English by non-native speakers: A corpus-based comparative study”. Gakuen 785: 45-55. Tokyo, Japan: Showa Women’s University.

Kobayashi, T. (2004) “How positive emotional expressions are used in EFL learners of English” (written in Japanese). English Corpus Studies 11: 37-47. Tokyo, Japan: Japan Association for English Corpus Studies.

Mukherjee, Joybrato & Jan-Marc Rohrbach (2006) “Rethinking applied corpus linguistics from a language-pedagogical perspective: new departures in learner corpus research”. In Bernhard Kettemann & Georg Marko (eds) Planing, Gluing and Painting Corpora: Inside the Applied Corpus Linguist's Workshop (pp. 205-232). Frankfurt am Main: Peter Lang.

Mukherjee, Joybrato (2007) “Exploring and annotating a spoken English learner corpus: a work-in-progress report”. In Sabine Volk-Birke & Julia Lippert (eds) Anglistentag 2006 Halle: Proceedings (pp. 365-375). Trier: WVT.

Mukherjee, Joybrato (2008) “The grammar of conversation in advanced spoken learner English: learner corpus data and language-pedagogical implications”. In Karin Aijmer (ed.) Corpora and Language Teaching. Amsterdam: John Benjamins. In press.

Pulcini, V. & A. Damascelli (2003) “A corpus-based study of the discourse marker okay”. In A. Bertacca (ed.) Historical Linguistic Studies of Spoken Discourse (pp. 231-243). Pisa: Edizioni Plus.

Pulcini, V. & C. Furiassi (2004) “Spoken interaction and discourse markers in a corpus of learner English”. In A. Partington, J. Morley & L. Haarman (eds) Corpora and Discourse (pp. 107-123). Bern: Peter Lang.

Ramírez, M.D. & J. Romero (2005) “The pragmatic function of intonation in L2 discourse: English tag questions used by Spanish speakers”. Intercultural Pragmatics 2: 151-168.

Romero, J. (2002) “The pragmatic fossilization of discourse markers in non-native speakers of English”. Journal of Pragmatics 34: 769-784.

Takami, M. (2002) “Japanese EFL learners’ use of coordinating conjunctions in spoken/written discourse”. Gakuen 741: 101-112. Tokyo, Japan: Showa Women’s University.

Vorherr, Annedore (2006) Collocations in Spoken English Learner Language: A Corpus-based Study. Unpublished Diploma Thesis. University of Giessen.

Zhang, Jinping (2005) “Small words used by English learners: A corpus-based comparative study”. In Anping He, Guangkeng He, Xiao Chen, Longyin Wang, Manfei Xu & Chunmei Cheng (eds) Research and Application of Corpus Linguistics (p. 185). Changchun: Northeast Normal University Press.