01-12-2015, 23:00

Note: This version of the exercises is being reorganized and rewritten. The older but more complete version can be found at http://coltekin.net/cagri/R.old/.

This is a hands-on tutorial on R, a powerful software environment for statistical analysis. The tutorial was prepared for the course Seminar in Methodology and Statistics, taught by John Nerbonne at the University of Groningen.

The exercises aim to give practical experience with statistical analysis procedures that are common in various branches of linguistics. The tutorial assumes that you are familiar with basic statistical concepts; no prior knowledge of R is assumed.

Any suggestions and/or corrections are welcome.

The HTML version of this tutorial makes use of MathML. To see the mathematical formulas correctly, you should use a MathML-capable browser. Recent versions of Firefox work out of the box; for other browsers you may need additional plugins. Alternatively, you can download and use the PDF version of the complete exercise set. The PDF version is also useful if you prefer to have a printed version of the exercises.

1 Starting R and finding your way around

1.1 Getting help

1.2 Doing simple calculations with R

1.3 Variables

1.4 Vectors in R

Exercises

2 Basic data exploration and inference

2.1 Summarizing and visualizing one-dimensional data

2.2 Summarizing and visualizing two-dimensional data

2.3 Simple inference

3 Linear regression: a first introduction

3.1 Some preliminaries

3.2 Some model diagnostics

4 Linear models with categorical predictors

4.1 Comparing two means

4.2 Checking assumptions of the t-test and ANOVA

4.3 Single ANOVA

4.4 Factorial ANOVA

4.5 If ANOVA assumptions are not met

4.6 T-test as a linear model

5 Repeated measures

5.1 Paired t-test

5.2 Repeated-measures ANOVA

6 Graphics

6.1 Basic graphics

6.2 Labels, axes, legends …

6.3 More than one graph on the same canvas

6.4 Writing your graphs to external files

6.5 Additional exercises

7 Regression again

7.1 Correlation

7.2 Least-squares linear regression

7.3 Model diagnostics

7.4 An example transformation

7.5 Predictions of a linear model

8 Multiple regression

8.1 Revisiting single regression (for the last time)

8.2 Multiple regression

8.3 Multicollinearity

8.4 $r^2$ and adjusted $r^2$

8.5 Model selection

9 General Linear Models

9.1 Categorical variables in regression

9.2 Other ways of coding categorical variables: contrasts

9.3 Mixing categorical and numeric predictors

10 Probability distributions

11 Logistic Regression

11.1 Regression and binomial response variables

11.2 Binomial data and generalized linear models

11.3 Binary data

11.4 Further exercises in model selection

11.5 More on logistic regression and GLMs

12 Multilevel / mixed-effect models

12.1 Background

12.2 Random intercepts

12.3 Random slopes

12.4 Random intercepts or random slopes

12.5 Where are my p-values?

12.6 Multiple fixed and random effects

12.7 Crossed random effects

12.8 Where to go from here?

References

A Answers

A.1 Starting R and finding your way around

A.2 Basic data exploration and inference

A.3 Linear regression: a first introduction

A.4 Linear models with categorical predictors

A.5 Repeated measures

A.6 Graphics

A.7 Regression again

A.8 Multiple Regression

A.9 Probability distributions

A.10 Logistic Regression

A.11 Multilevel / mixed-effect models

B Model formulas
