Problems in replicating studies that rely on lexical frequencies

Heather Swanson
Carleton University

In linguistics and other social sciences, replication is often overlooked as a means of validating research before new theories and knowledge are accepted, because replication carries little prestige and, as such, has no avenue for publication (Porte, 2013). Yet replication is a crucial component of the scientific method, and its scarcity in the social and behavioural sciences leaves the validity of that research without support (Porte, 2013). Importantly, when replication studies are carried out, it is often found that the original study reported a significant result while the replication does not (Maxwell, Lau, & Howard, 2015). The present study aims to help fill this gap in linguistics by replicating a morphological study by McCormick, Brysbaert, and Rastle (2009).

McCormick et al. (2009) was chosen for replication because it contributes to the decades-old morphological debate between word-based theoretical models such as Paradigm Function Morphology (PFM; Stump, 2001) and morpheme-based models such as Distributed Morphology (DM; Halle & Marantz, 1993). McCormick et al. (2009) aimed to test whether high-frequency complex words show evidence of masked priming in the same way as the low-frequency conditions in Rastle, Davis, and New (2004). The frequencies in the study were defined using CELEX (Baayen, Piepenbrock, & van Rijn, 1993). High-frequency words were found to show masked priming in the same way as the low-frequency conditions. These results support a complexity-based theory in which words are routinely decomposed regardless of their frequency. The replication of McCormick et al. (2009) had two parts: the first was to reproduce the stimuli and the second was to replicate the experiment’s results.
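Reproducing the stimuli means re-deriving each word's frequency band from corpus counts. The sketch below is illustrative only and is not the procedure used by McCormick et al. (2009): it assumes that raw token counts are normalised to occurrences per million and split into high and low bands at a single cut-off. The corpus sizes, the threshold, the placeholder words, and all function names are hypothetical.

```python
# Hypothetical cut-off (occurrences per million) separating high from low
# frequency; the original study's criterion is not stated in this abstract.
HIGH_FREQ_THRESHOLD = 50.0


def per_million(raw_count: int, corpus_size: int) -> float:
    """Normalise a raw token count to occurrences per million tokens."""
    return raw_count * 1_000_000 / corpus_size


def band(freq_pm: float, threshold: float = HIGH_FREQ_THRESHOLD) -> str:
    """Assign a frequency band from a per-million frequency."""
    return "high" if freq_pm >= threshold else "low"


def compare_bands(counts_a, size_a, counts_b, size_b):
    """Return the words whose band differs between two corpora.

    counts_a and counts_b map words to raw token counts; size_a and size_b
    are the corpus sizes in tokens. Only words present in both corpora are
    compared.
    """
    mismatches = {}
    for word in sorted(counts_a.keys() & counts_b.keys()):
        band_a = band(per_million(counts_a[word], size_a))
        band_b = band(per_million(counts_b[word], size_b))
        if band_a != band_b:
            mismatches[word] = (band_a, band_b)
    return mismatches


# Made-up counts for placeholder words, purely to exercise the functions.
EXAMPLE_A = {"word1": 1000, "word2": 100, "word3": 2000}
EXAMPLE_B = {"word1": 5000, "word2": 50000, "word3": 200000}
EXAMPLE_SIZE_A = 10_000_000    # hypothetical corpus size in tokens
EXAMPLE_SIZE_B = 500_000_000   # hypothetical corpus size in tokens

if __name__ == "__main__":
    result = compare_bands(EXAMPLE_A, EXAMPLE_SIZE_A, EXAMPLE_B, EXAMPLE_SIZE_B)
    for word, (band_a, band_b) in result.items():
        print(f"{word}: corpus A = {band_a}, corpus B = {band_b}")
```

Normalising to a per-million rate keeps corpora of very different sizes comparable; any word returned by `compare_bands` would need to be checked by hand before being reused as a stimulus.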
In attempting to recreate the materials, token frequencies were determined using the Corpus of Contemporary American English (COCA; Davies, 2008). The categorization of words as high or low frequency based on CELEX differed from the categorization of the same words based on COCA. In some cases, a word categorized as high frequency in CELEX was low frequency in COCA, and vice versa. There was also variation within the high and low categories themselves: some words that were more frequent than another word in CELEX were less frequent than it in COCA. The fact that the lexical frequencies of the tokens differed so dramatically between CELEX and COCA suggests that fully replicating McCormick et al. (2009) will be difficult. Furthermore, if lexical frequencies can differ among corpora, how can we be sure that we are representing lexical frequency accurately? Since the findings of many linguistics studies are based on lexical frequencies, it is extremely important that we are able to capture this information accurately.

References

Baayen, R. H., Piepenbrock, R., & van Rijn, H. (1993). The CELEX lexical database [CD-ROM]. Philadelphia: University of Pennsylvania, Linguistic Data Consortium.
Davies, M. (2008-). The Corpus of Contemporary American English: 520 million words, 1990-present. Available online at http://corpus.byu.edu/coca/.
Halle, M., & Marantz, A. (1993). Distributed Morphology and the pieces of inflection. In K. Hale & S. J. Keyser (Eds.), The view from Building 20: Essays in linguistics in honour of Sylvain Bromberger (pp. 111-176). Cambridge, MA: MIT Press.
Maxwell, S. E., Lau, M. Y., & Howard, G. S. (2015). Is psychology suffering from a replication crisis? What does “failure to replicate” really mean? American Psychologist, 70(6), 487-489.
McCormick, S. F., Brysbaert, M., & Rastle, K. (2009). Is morphological decomposition limited to low-frequency words? The Quarterly Journal of Experimental Psychology, 62(9), 1706-1715.
Porte, G. (2013). Who needs replication? CALICO Journal, 30(1), 10-15.
Rastle, K., Davis, M. H., & New, B. (2004). The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition. Psychonomic Bulletin & Review, 11, 1090-1098.
Stump, G. T. (2001). Inflectional morphology: A theory of paradigm structure. Cambridge: Cambridge University Press.