Computational Simulations of Language Behavior
LIX015M05, 2011/12, semester IIa



The regular meetings for this course are only on Wednesdays at 15:00 in room 1312.0118. We will not meet on Tuesdays.


This course on the use of simulations for studying language behavior varies in focus from year to year: one year it may focus on the acquisition of (one of) phonology, morphology or syntax, another year on sentence processing and lexical access, or on the (social) diffusion of linguistic variation. The intention is that students familiarize themselves with the area under study in a given year, and thereby with the more general opportunities and limitations of simulations in scientific inquiry.

This year's focus will be on simulating language acquisition in general, with an excursion to simulating language change. We will particularly focus on early word learning and segmentation. Students are encouraged to suggest areas of language behavior they are interested in as topics of discussion during this course.


Çağrı Çöltekin & John Nerbonne


After completing this course, students should understand the use of simulations in the cognitive sciences, be familiar with modeling and simulation work in the area of language acquisition, and be able to read recent research in the field critically.


The course assumes familiarity with basic concepts of machine learning, but there will be time to review unfamiliar concepts as they come up. The course is aimed at students in research master's programs and therefore assumes serious motivation and scientific maturity.

Course plan (tentative)

Date | Subject | Reading | Speaker
Feb 15 | Introduction & organization | Çöltekin (2011, chapters 2 & 3) | Ç. Çöltekin (slides)
Feb 22 | Computational simulation of language acquisition | Çöltekin (2011, chapter 5) | Ç. Çöltekin (slides)
Feb 29 | Pronoun comprehension; Language change/diffusion | van Rij et al. (2009); Nerbonne (2010) | J. van Rij; J. Nerbonne
Mar 7 | Simulation of segmentation (1) | Christiansen et al. (1998); Goldwater et al. (2009) | C. Hoogwerf; J. Kurvers
Mar 14 | Simulation of segmentation (2) | Venkataraman (2001); Goldwater et al. (2009) | C. Tsoukala; J. Kurvers
Mar 20 | Language change/diffusion; Project discussion | Holman et al. (2007) | L. Szabó
Mar 21 | Simulation of sentence processing & | Brouwer et al. (2010); Johnson & Goldwater (2009) | H. Brouwer; M. Specken
Mar 27 | Learning grammar & meaning | Xu & Tenenbaum (2007); Klein & Manning (2004) | A. Znaor; H. Mol
Mar 28 | Project presentations | |


Students are expected to lead the discussion of one or two papers. Contributions to discussions also count toward the credit earned. Students also write a brief term paper (5-8 pp.), preferably describing and reporting the results of a simulation of a new model of human behavior, or of a variation of an existing computational model. A brief summary and critique of an existing model or paper is also a possible topic for the term paper.


The following is an extended bibliography relevant to this course. We will not be reading all of these works.

Batchelder, E. O. (2002). Bootstrapping the lexicon: A computational model of infant speech segmentation. Cognition, 83:167–206.

Börschinger, B. and Johnson, M. (2011). A particle filter algorithm for Bayesian word segmentation. In Proceedings of the Australasian Language Technology Association Workshop 2011, pages 10–18, Canberra, Australia.

Brent, M. R. (1999). An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning, 34(1-3):71–105.

Brent, M. R. and Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61:93–125.

Christiansen, M. H., Allen, J., and Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13(2):221–268.

Çöltekin, Ç. (2011). Catching Words in a Stream of Speech: Computational simulations of segmenting transcribed child-directed speech. PhD thesis, University of Groningen.

Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14:179–211.

Goldwater, S., Griffiths, T. L., and Johnson, M. (2009). A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112:21–54.

Hauser, M. D., Chomsky, N., and Fitch, W. T. (2002). The faculty of language: what is it, who has it, and how did it evolve? Science, 298(5598):1569–1579.

Johnson, M. and Goldwater, S. (2009). Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 317–325.

Klein, D. and Manning, C. D. (2004). Corpus-based induction of syntactic structure: Models of dependency and constituency. In Proceedings of the 42nd Annual Meeting of the ACL, pages 479–486.

Kuhl, P. K. (2004). Early language acquisition: cracking the speech code. Nature Reviews Neuroscience, 5(11):831–843.

Lappin, S. and Shieber, S. M. (2007). Machine learning theory and practice as a source of insight into universal grammar. Journal of Linguistics, 43(2):393–427.

Monaghan, P. and Christiansen, M. H. (2010). Words in puddles of sound: modelling psycholinguistic effects in speech segmentation. Journal of Child Language, 37(Special Issue 03):545–564.

Nerbonne, J. (2010). Measuring the diffusion of linguistic change. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1559):3821–3828.

Pullum, G. K. and Scholz, B. C. (2002). Empirical assessment of stimulus poverty arguments. Linguistic Review, 19(1/2):9.

Saffran, J. R., Aslin, R. N., and Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274(5294):1926–1928.

Siskind, J. M. (1996). A computational study of cross-situational techniques for learning word-to-meaning mappings. Cognition, 61:39–91.

Solan, Z., Horn, D., Ruppin, E., and Edelman, S. (2005). Unsupervised learning of natural languages. Proceedings of the National Academy of Sciences, 102(33):11629–11634.

Thompson, S. P. and Newport, E. L. (2007). Statistical learning of syntax: The role of transitional probability. Language Learning and Development, 3(1):1–42.

Venkataraman, A. (2001). A statistical model for word discovery in transcribed speech. Computational Linguistics, 27(3):351–372.

Xu, F. and Tenenbaum, J. B. (2007). Word learning as Bayesian inference. Psychological Review, 114(2):245–272.

Yao, X., Ma, J., Duarte, S., and Çöltekin, Ç. (2009). An inference-rules based categorial grammar learner for simulating language acquisition. In Proceedings of the 18th Annual Belgian-Dutch Conference on Machine Learning, Tilburg.


Çağrı Çöltekin
Office hours: TBA