Statistical natural language processing
This course is an introduction to basic methods and applications in (statistical) natural language processing. It covers a wide range of topics in natural language processing, along with related techniques from machine learning and neighboring fields.
This page will contain up-to-date information on the course schedule and material. Please also subscribe to and follow the course's Moodle page.
The evaluation will be based on three assignments and a final exam. The course is worth 9 ECTS credits.
Announcements
- 2017-07-21: Assignment 3 is available.
- 2017-07-10: Assignment 2 is available.
- 2017-06-02: Assignment 1 is available. Deadline: June 30, 12:00.
- 2017-05-12: Example solutions to the exercises can be found here.
- 2017-04-19: Website is up.
Reading material
- Daniel Jurafsky and James H. Martin (2009), Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Pearson Prentice Hall, second edition (JM); chapters from the 3rd edition draft (JM3).
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag, second edition (HTF); available online.
Course outline (tentative!)
Week | Monday | Wednesday | Friday |
---|---|---|---|
01 | Apr 17 No class | Apr 19 Introduction / organization. Reading: JM Ch. 1. slides, handout (8up) | Apr 21 Python tutorial (1). exercises |
02 | Apr 24 Mathematical preliminaries. slides, handout | Apr 26 Probability theory. slides, handout | Apr 28 Python tutorial (2) |
03 | May 01 No class | May 03 Information theory. slides, handout | May 05 Exercises |
04 | May 08 Statistical models. Reading: HTF Ch. 1. slides, handout | May 10 N-gram language models (1). Reading: JM Ch. 4. slides, handout | May 12 Exercises, data |
05 | May 15 Machine learning intro (1). Reading: HTF Ch. 1, 3.2 & 3.4. slides, handout | May 17 N-gram language models (2) | May 19 Exercises |
06 | May 22 Exercises (cont.) | May 24 Machine learning intro (2). Reading: JM 6.6 (JM3 Ch. 7), HTF 4.4. slides, handout | May 26 N-gram language models (3) |
07 | May 29 Tokenization, normalization, segmentation. slides, handout | May 31 Machine learning intro (3). slides, handout | Jun 02 Assignment 1, data |
 | Jun 05 - Jun 09: No class | | |
08 | Jun 12 POS tagging. Reading: JM Ch. 5 (JM3 Ch. 10). slides, handout | Jun 14 Sequence learning. Reading: JM Ch. 6 (JM3 Ch. 9). slides, handout | Jun 16 Exercises, data |
09 | Jun 19 Neural networks (1). slides, handout | Jun 21 Neural networks (2) | Jun 23 Exercises (cont.) |
10 | Jun 26 Parsing: introduction. Reading: JM Ch. 13 (JM3 Ch. 12). slides, handout | Jun 28 Statistical constituency parsing. Reading: JM Ch. 14 (JM3 Ch. 13). slides, handout | Jun 30 Exercises (cont.) |
11 | Jul 03 Statistical dependency parsing. Reading: JM3 Ch. 14. slides, handout | Jul 05 Unsupervised learning. slides, handout | Jul 07 Exercises |
12 | Jul 10 Distributed representations. Reading: JM3 Ch. 15 & 16. slides, handout | Jul 12 Distributed representations (cont.) | Jul 14 Exercises |
13 | Jul 17 Text classification. slides, handout | Jul 19 Summary | Jul 21 Exercises |
14 | Jul 24 Summary | Jul 26 Exam | Jul 28 Exam discussion & exercises, data |
Contact
- Instructor: Çağrı Çöltekin <ccoltekin@sfs.uni-tuebingen.de>, Willemstr. 19, room 1.09. Office hours: Wednesday 10:00 - 12:00
- Tutor: Kuan Yu <kuan.yu@student.uni-tuebingen.de>