3 Single and Factorial ANOVA

For this exercise, we will use a related but different problem.

We are interested in how children’s language development is related to their parents’ socio-economic status (SES). We also want to see whether gender has an effect on early language development. SES is a well-known variable in sociolinguistics (and in sociology in general), with the levels ‘low’, ‘middle’ and ‘high’. Not surprisingly, gender has two levels, ‘female’ and ‘male’.

We recruit ten 3-year-old kids for each combination of these two factors. We record mother-child dialogs in comparable situations, and calculate the mean length of utterance (MLU) for each kid. The data looks like the following:

subject  SES     gender  MLU
1        low     male    1.81
11       low     female  1.56
21       middle  male    2.96
31       middle  female  3.64
41       high    male    3.02
51       high    female  2.79
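
To make the layout of the design explicit, the short Python sketch below enumerates the cells formed by the two factors and counts the children. It is only an illustration: the variable names are ours, and the idea that subjects are numbered consecutively within each cell is an assumption inferred from the subject numbers in the excerpt above, not something stated by the dataset.

```python
from itertools import product

SES_LEVELS = ["low", "middle", "high"]   # levels named in the problem description
GENDERS = ["male", "female"]             # order follows the data excerpt
CHILDREN_PER_CELL = 10                   # ten kids per SES x gender combination

# Each (SES, gender) pair is one cell of the factorial design.
cells = list(product(SES_LEVELS, GENDERS))

# ASSUMPTION: subjects are numbered consecutively, cell by cell.
first_subject = {
    cell: 1 + index * CHILDREN_PER_CELL for index, cell in enumerate(cells)
}

total_children = len(cells) * CHILDREN_PER_CELL

for cell in cells:
    print(cell, "starts at subject", first_subject[cell])
print("total children:", total_children)
```

With three SES levels and two genders there are six cells, so the full dataset should contain 60 rows.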

You can get the full dataset here.

Note: the data (somewhat) matches the results in the literature, but it was randomly generated to demonstrate the analysis. Your results will not necessarily match reality.

In all questions, assume an α-level of 0.05.

Exercise 3.1. Create two box plots, one describing MLU based on SES and the other by gender. Which differences do you expect to be statistically significant?

Exercise 3.2. Check normality of each group of SES using normal Q-Q plots. Do the distributions look normal?
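
If you are unsure what a normal Q-Q plot actually displays, the Python sketch below computes its coordinates by hand: each sorted observation is paired with the standard normal quantile at its plotting position. This is a minimal illustration, not the SPSS procedure. The function name is ours, the plotting position (rank - 0.5) / n is just one common convention (SPSS may use a different one), and the sample values are placeholders rather than MLU scores from the dataset.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for rank, observed in enumerate(ordered, start=1):
        # ASSUMPTION: plotting position (rank - 0.5) / n, one common convention.
        probability = (rank - 0.5) / n
        points.append((standard_normal.inv_cdf(probability), observed))
    return points


# Placeholder values for illustration only, NOT real MLU scores.
sample = [2.0, 1.6, 2.3, 1.8, 2.1]
for theoretical, observed in normal_qq_points(sample):
    print(f"{theoretical:6.2f}  {observed:.2f}")
```

If the points fall roughly on a straight line when plotted, the sample is consistent with a normal distribution; systematic curvature suggests skew or heavy tails.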

Exercise 3.3. Perform an appropriate ANOVA to investigate the effect of SES on children’s MLU. Make sure to include the test for homogeneity of variances in the options dialog, and also include pairwise comparisons using the Bonferroni correction.
  1. Is the homogeneity of variances assumption met? Which part of the output tells you this?
  2. Do you get a significant effect due to SES?
  3. Which levels (groups) of the SES differ from each other significantly?
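
As a reminder of what the Bonferroni correction in Exercise 3.3 does, the Python sketch below counts the pairwise comparisons among the three SES levels and divides the α-level by that count. It is an illustration rather than the SPSS procedure, and the names are ours. Statistical packages commonly report adjusted p-values instead (each p multiplied by the number of comparisons, capped at 1), which are then compared with the original α; the decision is the same either way.

```python
from itertools import combinations

ALPHA = 0.05                             # alpha-level given in the exercise text
SES_LEVELS = ["low", "middle", "high"]   # SES levels from the problem description

# Every unordered pair of SES levels is one pairwise comparison.
pairs = list(combinations(SES_LEVELS, 2))

# Bonferroni: divide alpha by the number of comparisons.
alpha_per_comparison = ALPHA / len(pairs)


def bonferroni_adjust(p_value, n_comparisons):
    """Return the Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


for first, second in pairs:
    print(f"{first} vs {second}: test at alpha = {alpha_per_comparison:.4f}")
```

With three levels there are three comparisons, so each one is effectively tested at about 0.0167 instead of 0.05.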

Exercise 3.4. Perform a two-way ANOVA using both factors, SES and gender. Make sure to include the test for homogeneity, effect sizes, and the interaction plot in the SPSS output (TIP: interpretation is easier if you put gender on the x-axis, and plot SES as separate lines).
  1. Do you see any interaction patterns between the two factors in the interaction graph?
  2. Which main effects are statistically significant?
  3. Given the significance of the interaction term, can you interpret the main effects directly?
  4. How do you interpret the effect sizes for the significant effects you have found?
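
For the effect sizes in Exercise 3.4, SPSS reports partial eta squared, which is the sum of squares of an effect divided by the sum of that effect's and the error sums of squares. The Python sketch below only illustrates this ratio. The function name is ours, and the sums of squares in the usage example are made-up placeholders, not results from this dataset.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    if ss_effect < 0 or ss_error < 0:
        raise ValueError("sums of squares cannot be negative")
    total = ss_effect + ss_error
    if total == 0:
        raise ValueError("at least one sum of squares must be positive")
    return ss_effect / total


# Placeholder sums of squares, NOT taken from the MLU data.
example = partial_eta_squared(ss_effect=20.0, ss_error=60.0)
print(f"partial eta squared = {example:.2f}")
```

The value can be read as the proportion of variance attributable to the effect once the other effects in the model are set aside, so it always lies between 0 and 1.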