Course page: Simulating Language Behavior

News

The regular meetings for this course are only on Wednesdays at 15:00 in room 1312.0118. We will not meet on Tuesdays.

This course on the use of simulations for studying language behavior may vary in focus from year to year, e.g. focusing one year on the acquisition of phonology, morphology or syntax (or one of these), another year on sentence processing and lexical access, or on the (social) diffusion of linguistic variation. The intention is that students familiarize themselves with the area under study in a given year, and thereby with the more general opportunities and limitations of simulations in scientific inquiry.

This year's focus will be on simulating language acquisition in general, with an excursion to simulating language change. We will particularly focus on early word learning and segmentation. Students are encouraged to suggest areas of language behavior they are interested in as topics of discussion during this course.

Instructors:

Çağrı Çöltekin & John Nerbonne

Objectives

After completing this course, students should understand the use of simulations in the cognitive sciences, be familiar with modeling and simulation work in the area of language acquisition, and be able to read recent research in the field critically.

Requirements

The course assumes familiarity with basic concepts of machine learning, but there will be time to review unfamiliar concepts as the need arises. The course is aimed at students in research master's programs, and therefore assumes serious motivation and scientific maturity.

Course plan (tentative)

Date	Subject	Reading	Speaker
Feb 15	Introduction & organization	Çöltekin (2011, chapters 2 & 3)	Ç. Çöltekin (slides)
Feb 22	Computational simulation of language acquisition	Çöltekin (2011, chapter 5)	Ç. Çöltekin (slides)
Feb 29	Pronoun comprehension; Language change/diffusion	van Rij et al. (2009); Nerbonne (2010)	J. van Rij; J. Nerbonne
Mar 7	Simulation of segmentation (1)	Christiansen et al. (1998); Goldwater et al. (2009)	C. Hoogwerf; J. Kurvers
Mar 14	Simulation of segmentation (2)	Venkataraman (2001); Goldwater et al. (2009)	C. Tsoukala; J. Kurvers
Mar 20	Language change/diffusion; Project discussion	Holman et al. (2007)	L. Szabó
Mar 21	Simulation of sentence processing & Segmentation	Brouwer et al. (2010); Johnson & Goldwater (2009)	H. Brouwer; M. Specken
Mar 27	Learning grammar & meaning	Xu & Tenenbaum (2007); Klein & Manning (2004)	A. Znaor; H. Mol
Mar 28	Project presentations

Assessment

Students are expected to lead discussion on one or two papers. Contributions to discussions will also be part of the credit earned. In addition, students write a brief term paper (5-8 pages), preferably describing and reporting the results of a simulation of a new model of human behavior, or of a variation of an existing computational model. A brief summary and critique of an existing model or paper is also a possible topic for the term paper.

Bibliography

The following is an extended bibliography relevant to this course. We will not be reading all of these works.

Batchelder, E. O. (2002). Bootstrapping the lexicon: A computational model of infant speech segmentation. Cognition, 83:167–206.

Börschinger, B. and Johnson, M. (2011). A particle filter algorithm for Bayesian word segmentation. In Proceedings of the Australasian Language Technology Association Workshop 2011, pages 10–18, Canberra, Australia.

Brent, M. R. (1999). An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning, 34(1-3):71–105.

Brent, M. R. and Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61:93–125.

Christiansen, M. H., Allen, J., and Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13(2):221–268.

Çöltekin, Ç. (2011). Catching Words in a Stream of Speech: Computational simulations of segmenting transcribed child-directed speech. PhD thesis, University of Groningen.

Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14:179–211.

Goldwater, S., Griffiths, T. L., and Johnson, M. (2009). A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112:21–54.

Hauser, M. D., Chomsky, N., and Fitch, W. T. (2002). The faculty of language: what is it, who has it, and how did it evolve? Science, 298(5598):1569–1579.

Johnson, M. and Goldwater, S. (2009). Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 317–325.

Klein, D. and Manning, C. D. (2004). Corpus-based induction of syntactic structure: Models of dependency and constituency. In Proceedings of the 42nd Annual Meeting of the ACL, pages 479–486.

Kuhl, P. K. (2004). Early language acquisition: cracking the speech code. Nature Reviews Neuroscience, 5(11):831–843.

Lappin, S. and Shieber, S. M. (2007). Machine learning theory and practice as a source of insight into universal grammar. Journal of Linguistics, 43(2):393–427.

Monaghan, P. and Christiansen, M. H. (2010). Words in puddles of sound: modelling psycholinguistic effects in speech segmentation. Journal of Child Language, 37(Special Issue 03):545–564.

Nerbonne, J. (2010). Measuring the diffusion of linguistic change. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1559):3821–3828.

Pullum, G. K. and Scholz, B. C. (2002). Empirical assessment of stimulus poverty arguments. Linguistic Review, 19(1/2):9.

Saffran, J. R., Aslin, R. N., and Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274(5294):1926–1928.

Siskind, J. M. (1996). A computational study of cross-situational techniques for learning word-to-meaning mappings. Cognition, 61:39–91.

Solan, Z., Horn, D., Ruppin, E., and Edelman, S. (2005). Unsupervised learning of natural languages. Proceedings of the National Academy of Sciences, 102(33):11629–11634.

Thompson, S. P. and Newport, E. L. (2007). Statistical learning of syntax: The role of transitional probability. Language Learning and Development, 3(1):1–42.

Venkataraman, A. (2001). A statistical model for word discovery in transcribed speech. Computational Linguistics, 27(3):351–372.

Xu, F. and Tenenbaum, J. B. (2007). Word learning as Bayesian inference. Psychological Review, 114(2):245–272.

Yao, X., Ma, J., Duarte, S., and Çöltekin, Ç. (2009). An inference-rules based categorial grammar learner for simulating language acquisition. In Proceedings of the 18th Annual Belgian-Dutch Conference on Machine Learning, Tilburg.

Contact

Çağrı Çöltekin
Office hours: TBA

Computational Simulations of Language Behavior LIX015M05, 2011/12, semester IIa