Statistical natural language processing
This course is an introduction to basic methods and applications in (statistical) natural language processing. It covers a wide range of topics in natural language processing, along with related techniques from machine learning and neighboring fields.
This page will contain up-to-date information on the course schedule and material. Please also subscribe to and follow the course's Moodle page.
The evaluation will be based on three assignments and a final exam. The course is worth 9 ECTS credits.
Announcements
- 2017-07-21: Assignment 3 is available.
- 2017-07-10: Assignment 2 is available.
- 2017-06-02: Assignment 1 is available. Deadline: June 30, 12:00.
- 2017-05-12: Example solutions to the exercises can be found here.
- 2017-04-19: Website is up.
Reading material
- Daniel Jurafsky and James H. Martin (2009), Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Pearson Prentice Hall, second edition (JM); chapters from the 3rd edition draft (JM3).
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag, second edition (HTF); available online.
Course outline (tentative!)
Week | Monday | Wednesday | Friday |
---|---|---|---|
01 | Apr 17 No class | Apr 19 Introduction / organization. Reading: JM Ch. 1. slides, handout (8up) | Apr 21 Python tutorial (1). exercises |
02 | Apr 24 Mathematical preliminaries. slides, handout | Apr 26 Probability theory. slides, handout | Apr 28 Python tutorial (2) |
03 | May 01 No class | May 03 Information theory. slides, handout | May 05 Exercises |
04 | May 08 Statistical models. Reading: HTF Ch. 1. slides, handout | May 10 N-gram language models (1). Reading: JM Ch. 4. slides, handout | May 12 Exercises, data |
05 | May 15 Machine learning intro (1). Reading: HTF Ch. 1, 3.2 & 3.4. slides, handout | May 17 N-gram language models (2) | May 19 Exercises |
06 | May 22 Exercises (cont.) | May 24 Machine learning intro (2). Reading: JM 6.6 (JM3 Ch. 7), HTF 4.4. slides, handout | May 26 N-gram language models (3) |
07 | May 29 Tokenization, normalization, segmentation. slides, handout | May 31 Machine learning intro (3). slides, handout | Jun 02 Assignment 1, data |
 | Jun 05 - Jun 09: No class | | |
08 | Jun 12 POS tagging. Reading: JM Ch. 5 (JM3 Ch. 10). slides, handout | Jun 14 Sequence learning. Reading: JM Ch. 6 (JM3 Ch. 9). slides, handout | Jun 16 Exercises, data |
09 | Jun 19 Neural networks (1). slides, handout | Jun 21 Neural networks (2) | Jun 23 Exercises (cont.) |
10 | Jun 26 Parsing: introduction. Reading: JM Ch. 13 (JM3 Ch. 12). slides, handout | Jun 28 Statistical constituency parsing. Reading: JM Ch. 14 (JM3 Ch. 13). slides, handout | Jun 30 Exercises (cont.) |
11 | Jul 03 Statistical dependency parsing. Reading: JM3 Ch. 14. slides, handout | Jul 05 Unsupervised learning. slides, handout | Jul 07 Exercises |
12 | Jul 10 Distributed representations. Reading: JM3 Ch. 15 & 16. slides, handout | Jul 12 Distributed representations (cont.) | Jul 14 Exercises |
13 | Jul 17 Text classification. slides, handout | Jul 19 Summary | Jul 21 Exercises |
14 | Jul 24 Summary | Jul 26 Exam | Jul 28 Exam discussion & exercises, data |
Contact
- Instructor: Çağrı Çöltekin <ccoltekin@sfs.uni-tuebingen.de>, Willemstr. 19, room 1.09. Office hours: Wednesday 10:00 - 12:00
- Tutor: Kuan Yu <kuan.yu@student.uni-tuebingen.de>