## Machine learning for computational linguists

Methods from machine learning are indispensable tools for
computational studies of language. This seminar covers some of the
important concepts and a number of prominent machine learning methods
ranging from early foundational methods to current state-of-the-art
techniques. The objectives of the course are twofold. First, the
knowledge gained during the course will aid the students in
understanding the literature on computational linguistics and related
fields where the majority of work includes applications of machine learning
methods. Second, after completing this course, students should be
able to choose the right machine learning techniques and apply them
correctly in their work.

The course assumes basic programming skills and the ability to process linguistic data (the 'Text Technology' course or equivalent coursework or experience is required). Although our focus will be on intuitive explanations and practical exercises, students should be prepared to digest some mathematical notation. Some of the foundational topics, such as probability theory and statistics, will be introduced during the first lectures.

The evaluation will be based on assignments during the semester and a term project with an associated term paper. The course is worth 9 ECTS credits.

## Announcements

- 2016-08-30: The deadline for the term papers is 2016-09-15.

## Course outline (**tentative!**)

Date | Subject | Reading |
---|---|---|
Apr 12 | Introduction [slides] [handout] | Hastie et al. 2009, Chapter 1 |
Apr 14 | Background: a refresher on linear algebra [slides] [handout] | A short reference by Ivan Savov |
Apr 19 | Background: probability and information theory [slides] [handout] | None |
Apr 21 | Probability and information theory (2) | None |
Apr 26 | Regression [slides] [handout] | James et al. (2013) §3.1 |
Apr 28 | Linear regression (2) [data] | James et al. (2013) §3.2 |
May 3 | Classification: introduction, logistic regression [slides] [handout] | James et al. (2013) §4.1-4.3 |
May 10 | Machine learning basics: bias, variance, over-/under-fitting, regularization, cross validation ... [slides] [handout] | James et al. (2013) §5.1 & 6.2 |
May 12 | Exercises [tips] | None |
May 24 | Unsupervised learning 1: clustering [slides] [handout] | James et al. (2013) §10.1-10.3 |
May 31 | Unsupervised learning 2: PCA | None |
Jun 2 | Exercises: clustering, PCA [Exercises] [tips] [data] | None |
Jun 7 | Neural Networks: Perceptron, MLP [slides] [handout] | MacKay 2003 §38 & 39 |
Jun 9 | Exercises (contd.) | |
Jun 14 | Distributed representations [slides] [handout] | |
Jun 16 | Exercises: Distributed representations [Exercises] | |
Jun 21 | Deep learning 1: introduction [slides] [handout] | |
Jun 23 | Exercises [tips] | |
Jun 28 | Convolutional neural networks [slides] [handout] | |
Jun 30 | Exercises | |
Jul 5 | Recurrent neural networks [slides] [handout] | |
Jul 7 | Exercises | |
Jul 12 | Autoencoders, deep learning summary [slides] [handout] | |
Jul 14 | Exercises | |
Jul 19/21 | Summary & term paper/project discussion [slides] [handout] | |