3 Single and Factorial ANOVA

For this exercise, we will use a related but different problem.

We are interested in how children’s language development is related to their parents’ socio-economic status (SES). We also want to see whether gender has an effect on early language development. SES is a well-known variable in sociolinguistics (and in sociology more generally); here it has three levels: ‘low’, ‘middle’ and ‘high’. We use a binary gender variable with two levels, ‘female’ and ‘male’.

We recruit ten 3-year-old children for each combination of these two factors. We record mother-child dialogs in comparable situations and calculate the mean length of utterance (MLU) for each child. A sample of the data looks like this:

subject  SES     gender  MLU
      1  low     male    1.81
     11  low     female  1.56
     21  middle  male    2.96
     31  middle  female  3.64
     41  high    male    3.02
     51  high    female  2.79

You can get the full dataset here.
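
To make the design concrete, the following minimal Python sketch lists the six SES by gender cells and the size of the full sample. It assumes that subjects are numbered consecutively, cell by cell, which is consistent with the sample rows above but is not stated in the exercise; the variable names are illustrative only.

```python
from itertools import product

# Factor levels as described in the text.
SES_LEVELS = ["low", "middle", "high"]
GENDER_LEVELS = ["male", "female"]
CHILDREN_PER_CELL = 10  # ten children per SES x gender combination

# Assumption: subjects are numbered consecutively, cell by cell,
# which is consistent with the sample rows (1, 11, 21, ...).
cells = list(product(SES_LEVELS, GENDER_LEVELS))
first_subject = {
    cell: index * CHILDREN_PER_CELL + 1 for index, cell in enumerate(cells)
}
total_children = len(cells) * CHILDREN_PER_CELL

for (ses, gender), subject in first_subject.items():
    print(f"{subject:>3}  {ses:<6}  {gender}")
print("total children:", total_children)
```

With three SES levels, two genders and ten children per cell, the full dataset should contain 60 rows.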

Note: the data (somewhat) matches results reported in the literature, but it was randomly generated for this demonstration. Your results will not necessarily match reality.

In all questions, assume an α-level of 0.05.

Exercise 3.1. Create two box plots, one describing MLU based on SES and the other by gender. Which differences do you expect to be statistically significant?
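
As a reminder of what a box plot displays, here is a minimal Python sketch that computes the five-number summary (minimum, first quartile, median, third quartile, maximum) per group. The MLU values are invented for illustration and are not the course data. Quartile conventions differ between packages, so the numbers a statistics program draws may differ slightly from the ones computed this way.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Hypothetical MLU values, invented for illustration (not the course data).
mlu_by_ses = {
    "low": [1.5, 1.6, 1.8, 2.0, 2.1],
    "middle": [2.6, 2.9, 3.0, 3.3, 3.6],
    "high": [2.5, 2.8, 3.0, 3.1, 3.4],
}

for level, values in mlu_by_ses.items():
    print(level, five_number_summary(values))
```

Groups whose boxes barely overlap are the natural candidates for a significant difference, although the box plot alone does not test anything.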

Exercise 3.2. Check normality of each group of SES using normal Q-Q plots. Do the distributions look normal?
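
The idea behind a normal Q-Q plot can be sketched in a few lines of Python: each sorted observation is paired with the quantile that a normal distribution with the same mean and standard deviation would predict at that rank. The data below is invented for illustration, and the plotting positions (i - 0.5) / n are one common convention among several, so this is not necessarily the exact computation your statistics package performs.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each sorted observation with its theoretical normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    reference = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n; other conventions exist.
    theoretical = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical MLU values for one SES group (not the course data).
low_ses_mlu = [1.8, 1.5, 2.1, 1.6, 2.0]
points = qq_points(low_ses_mlu)

for expected, observed in points:
    print(f"expected {expected:.2f}  observed {observed:.2f}")
```

If the group is roughly normal, the pairs fall close to a straight line; systematic curvature or isolated points far from the line suggest a departure from normality.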

Exercise 3.3. Perform an appropriate ANOVA to investigate the effect of SES on children’s MLU. Make sure to include the test for homogeneity of variances in the options dialog, and also include pairwise comparisons using Bonferroni correction.
  1. Is the homogeneity of variances assumption met? Which part of the output tells you this?
  2. Do you get a significant effect due to SES?
  3. Which levels (groups) of SES differ significantly from each other?
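
For reference, the F ratio that a one-way ANOVA reports can be computed by hand as the between-group mean square divided by the within-group mean square. The minimal Python sketch below does this for toy numbers chosen so that the arithmetic is easy to check (they are not the course data), and also shows the per-comparison α implied by a Bonferroni correction over all pairs of groups. It uses only the standard library, so it stops at the F ratio and does not compute a p-value.

```python
from itertools import combinations
from statistics import mean

ALPHA = 0.05  # alpha level given in the exercise


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2 for values in groups.values() for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Toy values, invented for illustration (not the course data).
groups = {
    "low": [1.0, 2.0, 3.0],
    "middle": [2.0, 3.0, 4.0],
    "high": [3.0, 4.0, 5.0],
}

f_ratio, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")

# Bonferroni: divide alpha by the number of pairwise comparisons.
pairs = list(combinations(groups, 2))
bonferroni_alpha = ALPHA / len(pairs)
print(f"{len(pairs)} comparisons, per-comparison alpha = {bonferroni_alpha:.4f}")
```

Some packages report Bonferroni-adjusted p-values instead (each p-value multiplied by the number of comparisons and capped at 1), which leads to the same decisions when compared against the original α.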

Exercise 3.4. Perform a two-way ANOVA using both factors, SES and gender. Make sure to include the test for homogeneity of variances, effect sizes, and the interaction plot in the SPSS output (TIP: interpretation is easier if you put gender on the x-axis and plot SES as separate lines).
  1. Do you see any interaction pattern between the two factors in the interaction graph?
  2. Which main effects are statistically significant?
  3. Can you interpret the main effects directly based on your finding about whether the interaction term is significant or not?
  4. How do you interpret the effect sizes for the significant effects you have found?
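
To see where the effect sizes come from, the following minimal Python sketch partitions the sums of squares for a balanced two-way design and computes partial eta squared, SS_effect / (SS_effect + SS_error), for each term. The data is a toy example with two observations per cell, invented so that the arithmetic is easy to follow (it is not the course data), and the formulas used here hold only when every cell has the same number of observations.

```python
from statistics import mean


def two_way_sums_of_squares(cells):
    """Sums of squares for a balanced SES x gender design.

    `cells` maps (ses, gender) -> list of observations, with the same
    number of observations in every cell.
    """
    ses_levels = sorted({ses for ses, _ in cells})
    gender_levels = sorted({gender for _, gender in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    grand = mean([v for values in cells.values() for v in values])

    def marginal_mean(position, level):
        # Mean of all observations sharing one level of one factor.
        return mean(
            [v for key, values in cells.items() if key[position] == level
             for v in values]
        )

    ss_ses = n * len(gender_levels) * sum(
        (marginal_mean(0, level) - grand) ** 2 for level in ses_levels
    )
    ss_gender = n * len(ses_levels) * sum(
        (marginal_mean(1, level) - grand) ** 2 for level in gender_levels
    )
    ss_cells = n * sum((mean(values) - grand) ** 2 for values in cells.values())
    ss_error = sum(
        (v - mean(values)) ** 2 for values in cells.values() for v in values
    )
    return {
        "SES": ss_ses,
        "gender": ss_gender,
        "interaction": ss_cells - ss_ses - ss_gender,
        "error": ss_error,
    }


def partial_eta_squared(ss_effect, ss_error):
    """Share of (effect + error) variation attributable to the effect."""
    return ss_effect / (ss_effect + ss_error)


# Toy values, invented for illustration (not the course data).
cells = {
    ("low", "male"): [1.0, 2.0],
    ("low", "female"): [2.0, 3.0],
    ("middle", "male"): [2.0, 3.0],
    ("middle", "female"): [3.0, 4.0],
    ("high", "male"): [3.0, 4.0],
    ("high", "female"): [4.0, 5.0],
}

ss = two_way_sums_of_squares(cells)
for term in ("SES", "gender", "interaction"):
    print(term, round(partial_eta_squared(ss[term], ss["error"]), 3))
```

In this toy example the cell means are perfectly additive, so the interaction sum of squares is zero and the lines in an interaction plot would be parallel. Non-parallel or crossing lines are the visual sign of an interaction, in which case the main effects should be interpreted with caution.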