Journal of Research Design and Statistics in Linguistics and Communication Science, Vol 5, No 1-2 (2018)

The Discriminatory Power of Lexical Context for Alternations: An Information-theoretic Exploration

Stefan Th. Gries
Issued Date: 19 Dec 2019


This paper makes a very exploratory, tentative, thinking-aloud kind of suggestion for the corpus-based analysis of alternation data. I start from the observation that studies of alternations/choices, in particular in corpus linguistics, have become increasingly sophisticated in terms of the statistical methods they employ and the number of predictors they involve. While the predictors employed come from many different levels of linguistic analysis (phonology, morphosyntax, semantics, pragmatics/discourse, text, psycholinguistics, sociolinguistics, and others), they are usually contextual in nature, meaning they characterize the context of the choice the language user needs to make or has just made. However, one aspect of the context seems to be crucially underutilized when it comes to modeling speakers' choices: the lexical context. In this paper, I build on recent work in computational psycholinguistics to: (a) define a lexical-distribution prototype of each of the (typically, but not necessarily, two) alternants of an alternation; and (b) compute the degree to which each instance of the alternation in question diverges from each of the prototypes. Then, (c) the values that all choices score on the divergences from each of the prototypes are entered as predictors alongside all others in statistical models to, minimally, serve as a variable that controls for whatever information is contained in the lexical context of an instance of a speaker's choice. I exemplify the approach and its sometimes amazing predictive power on the basis of a choice between near synonyms, two morphosyntactic alternations (preposition stranding vs. pied-piping and of- vs. s-genitives), and a distinction between the functions of well.
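Steps (a) to (c) can be illustrated with a minimal sketch. The abstract does not say which divergence measure or which context window the paper uses, so everything here is an assumption made for illustration: the Kullback-Leibler divergence as the measure, add-alpha smoothing, a bag-of-words context, the invented toy data, and the hypothetical helper names `distribution` and `kl_divergence`.

```python
import math
from collections import Counter

def distribution(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed relative frequencies of `tokens` over `vocab`."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; p and q share the same keys."""
    return sum(p[w] * math.log2(p[w] / q[w]) for w in p if p[w] > 0)

# Invented toy data: (chosen alternant, words in the lexical context).
instances = [
    ("s", ["the", "president", "speech"]),
    ("s", ["the", "president", "decision"]),
    ("of", ["the", "roof", "building"]),
    ("of", ["the", "end", "building"]),
]
vocab = sorted({w for _, ctx in instances for w in ctx})

# (a) One lexical-distribution prototype per alternant: the pooled
#     context words of all instances in which that alternant was chosen.
prototypes = {}
for alt in {a for a, _ in instances}:
    pooled = [w for a, ctx in instances if a == alt for w in ctx]
    prototypes[alt] = distribution(pooled, vocab)

# (b) For every instance, its divergence from each prototype.
rows = []
for alt, ctx in instances:
    inst = distribution(ctx, vocab)
    row = {"choice": alt}
    for name, proto in sorted(prototypes.items()):
        row["div_" + name] = kl_divergence(inst, proto)
    rows.append(row)

# (c) The div_* columns would then be added as predictors, next to all
#     other predictors, in the statistical model of `choice`.
for row in rows:
    print(row)
```

In this sketch each instance also contributes to the prototype of its own alternant, which flatters the divergence scores; an actual analysis would probably want to build the prototypes without the instance being scored, or from separate data.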

DOI: 10.1558/jrds.38227





