17 LAB VII: Inference for numerical data (>2 samples)

Learning objectives

When we have finished this Lab, we should be able to:

Apply hypothesis testing
Compare more than two independent samples
Interpret the results

17.1 Introduction

The one-way analysis of variance (one-way ANOVA) and the non-parametric Kruskal-Wallis test are used to detect whether there are any differences among more than two independent (unrelated) samples.

Although these tests can detect a difference between several groups, they do not tell us which groups differ from the others. At first sight, we might answer that question by comparing all groups in pairs with t-tests or Mann-Whitney U tests. However, that procedure may lead us to the wrong conclusions (this is known as the multiple comparisons problem).

Why is this procedure inappropriate? Quite simply, because each individual test carries its own risk of a false positive, and those risks accumulate. Each additional comparison increases the likelihood of committing at least one Type I error within the set of comparisons (the familywise Type I error rate).
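To make the accumulation concrete, the following minimal sketch computes the familywise error rate for all pairwise comparisons among k groups. It assumes the comparisons are independent, which is a simplification (pairwise tests on the same groups are correlated), so the numbers should be read as a rough illustration; the function name is hypothetical.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one Type I error across all pairwise tests.

    Assumes the pairwise tests are independent (a simplification).
    """
    n_comparisons = comb(n_groups, 2)  # k(k-1)/2 pairs of groups
    return 1 - (1 - alpha) ** n_comparisons


# Number of pairwise comparisons and approximate familywise error rate
for k in (3, 4, 5):
    print(k, comb(k, 2), round(familywise_error_rate(k), 3))
```

With four groups there are six pairwise comparisons, and under this assumption the familywise error rate is roughly 26% rather than the nominal 5%.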

This is the reason why, after an ANOVA or Kruskal-Wallis test has indicated a difference between groups, we should not simply compare all possible pairs of groups with t-tests or Mann-Whitney U tests. Instead, we perform statistical tests that take into account the number of comparisons (post hoc tests). Some of the more commonly used ones are the Tukey test, the Games-Howell test, and the Bonferroni correction.
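As an illustration of how a correction accounts for the number of comparisons, here is a minimal sketch of the Bonferroni correction, the simplest of the three: each raw p-value is multiplied by the number of comparisons and capped at 1. The p-values below are hypothetical and the function name is an assumption made for this example.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)  # number of comparisons performed
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three pairwise comparisons
raw = [0.01, 0.04, 0.50]
adjusted = bonferroni_adjust(raw)

for p_raw, p_adj in zip(raw, adjusted):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"raw p = {p_raw:.2f}, adjusted p = {p_adj:.2f}: {verdict}")
```

Note that the second comparison would look significant on its raw p-value but is no longer significant after adjustment.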

17.2 One-way Analysis of Variance (ANOVA)

One-way analysis of variance, usually referred to as one-way ANOVA, is a statistical test used when we want to compare several means. We may think of it as an extension of Student’s t-test to the case of more than two samples.

Opening the file

Open the dataset named “dataDWL” from the file tab in the menu:

The dataset “dataDWL” has 60 participants and includes two variables (Figure 17.1): the numeric WeightLoss variable and the categorical Diet variable (with levels A, B, C and D).

Research question

Consider an example in which weight loss is recorded for participants following four different types of diet (A, B, C, and D). The question that may be asked is: does the average weight loss (in kg) differ according to the diet?

Hypothesis Testing for the ANOVA test

Null hypothesis and alternative hypothesis

\(H_0\): all group means are equal (the means of weight loss in the four diets are equal: \(\mu_{A} = \mu_{B} = \mu_{C} = \mu_{D}\))
\(H_1\): at least one group mean differs from the others (there is at least one diet with mean weight loss different from the others)

Assumptions

Check if the following assumptions are satisfied

The dependent variable, WeightLoss, should be approximately normally distributed for all groups
The data in the groups have similar variances (also known as homogeneity of variance or homoscedasticity)

A. Explore the descriptive characteristics of distribution for each group and check for normality

The distributions can be explored visually with appropriate plots. Additionally, summary statistics and significance tests to check for normality (e.g., Shapiro-Wilk test) and for equality of variances (e.g., Levene’s test) can be used.

On the Jamovi top menu, navigate to Analyses > Exploration > Descriptives.

The Descriptives dialogue box opens. Drag the variable WeightLoss into the Variables field and split it by the Diet variable. Additionally, select Variables across rows, as shown below (Figure 17.2):

Figure 17.2: Split the variable WeightLoss by Diet group and select Variables across rows.

We can now select the relevant descriptive statistics such as Percentiles, Skewness, Kurtosis and the Shapiro-Wilk test from the Statistics section (Figure 17.3):

Figure 17.3: In the Statistics section select the descriptive statistics of interest.

Once we have selected our descriptive statistics, a table will appear in the output window on our right-hand side, as shown below (Figure 17.4):

Figure 17.4: Descriptive statistics of WeightLoss by Diet group (click on figure to zoom in).

In every group the mean is close to the median, which is consistent with a symmetric distribution, and the standard deviations are similar across groups. Additionally, both shape measures, skewness and (excess) kurtosis, have values in the acceptable range [-1, 1], which indicates approximately symmetric and mesokurtic distributions, respectively.
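The range check on the shape measures can be sketched as follows. This example uses the plain moment-based formulas for skewness and excess kurtosis; Jamovi is assumed to apply small-sample corrections, so its reported values may differ somewhat from these. The data and the function names are hypothetical.

```python
from statistics import mean


def shape_measures(values):
    """Return (skewness, excess kurtosis) from central moments.

    Plain moment-based formulas, without small-sample corrections.
    """
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


def within_acceptable_range(value, limit=1.0):
    """True when the shape measure lies in [-limit, limit]."""
    return -limit <= value <= limit


# Hypothetical weight-loss values (kg) for a single diet group
sample = [8.1, 9.4, 10.2, 10.9, 11.5, 12.3, 13.0]
skew, kurt = shape_measures(sample)
print(within_acceptable_range(skew), within_acceptable_range(kurt))
```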

The Shapiro-Wilk tests give no evidence against normality for WeightLoss in any of the groups (p > 0.05 \(\Rightarrow H_0\) is not rejected).

Remember: Hypothesis testing for Shapiro-Wilk test for normality

\(H_{0}\): the data come from a normally distributed population.

\(H_{1}\): the data do not come from a normally distributed population.

If p-value < 0.05, reject the null hypothesis, \(H_{0}\).
If p-value ≥ 0.05, do not reject the null hypothesis, \(H_{0}\).

Then we can tick the Density box under Histograms in the Plots section, as shown below (Figure 17.5):

Figure 17.5: In the Plots section select Density from Histograms.

A graph (Figure 17.6) is generated in the output window on our right-hand side:

Figure 17.6: The Density plot of WeightLoss for each Diet.

The above density plots show that the data are approximately symmetric, so the assumption of a normal distribution is reasonable for all diet groups.

B. Homogeneity of variance

The second assumption that should be satisfied is the homogeneity of variance. We observe in the summary table of Figure 17.4 that the standard deviations are similar (see also Levene’s test for equality of variances in Figure 17.9 below).

Run the ANOVA test

Perform ANOVA in Jamovi

We will perform ANOVA to test the null hypothesis that the mean WeightLoss is the same for all Diet groups.

On the Jamovi top menu, navigate to Analyses > ANOVA > One-Way ANOVA.

The One-Way ANOVA dialogue box opens. Drag and drop WeightLoss into the Dependent Variables field and Diet into the Grouping Variable field, as shown below (Figure 17.7):

Figure 17.7: One-Way ANOVA dialogue box. Drag WeightLoss into the Dependent Variables field and Diet into the Grouping Variable field.

We observe that we can select between the following two Tests: Welch’s test (the default), or Fisher’s test. At the moment, we keep the default choice. Moreover, from Additional Statistics check the Descriptive and Descriptive plots boxes. Finally, from Assumption Checks tick the Homogeneity test box. We will end up with the following screen:

Figure 17.8: Additional statistics and tests.

First, we look at the table of Levene’s test for equality of variances (Figure 17.9):

Remember: Hypothesis testing for Levene’s test for equality of variances

\(H_{0}\): the variances of WeightLoss in all groups are equal (\(\sigma^2_A=\sigma^2_B=\sigma^2_C=\sigma^2_D\))

\(H_{1}\): the variances of WeightLoss differ between at least two groups (\(\sigma^2_i\neq \sigma^2_j\) for at least one pair \(i,j \in \{A, B, C, D\}\) with \(i\neq j\))

If p-value < 0.05, reject the null hypothesis, \(H_{0}\).
If p-value ≥ 0.05, do not reject the null hypothesis, \(H_{0}\).

Since p = 0.583 > 0.05, the \(H_0\) of Levene’s test is not rejected, so the population variances of WeightLoss are assumed equal in all groups. We therefore perform Fisher’s test, which assumes equal variances (Figure 17.10). So, let’s tick the Assume equal (Fisher’s) box.

ANOVA test (Welch’s option)

If the assumption of equal variances is not satisfied (Levene’s test gives p < 0.05, so \(H_0\) is rejected), Welch’s option of ANOVA should be used from the available Tests in Jamovi (Figure 17.10).
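The choice between the two options, together with the matching post-hoc test, can be summarised as a small decision rule. The sketch below is only an illustration of the logic described in this Lab; the function name and the returned labels are hypothetical and not part of Jamovi.

```python
def choose_anova_variant(levene_p: float, alpha: float = 0.05) -> dict:
    """Map the Levene's test p-value to an ANOVA option and post-hoc test."""
    if levene_p < alpha:
        # Equal variances rejected: use the tests that do not assume them
        return {"test": "Welch's", "post_hoc": "Games-Howell"}
    # Equal variances not rejected: use the tests that assume them
    return {"test": "Fisher's", "post_hoc": "Tukey"}


# Levene's test p-value reported for the diet example in this Lab
print(choose_anova_variant(0.583))
```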

Next, we can inspect the results in the group descriptives table (Figure 17.11) and the corresponding plot (Figure 17.12):

Figure 17.11: Group descriptive statistics.

Figure 17.12: Plot of mean (95% CI) of WeightLoss by Diet.

From Figure 17.12 we observe that the participants following diet C have, on average, the highest weight loss.

Finally, we present the results of the Fisher’s ANOVA test in the table of Figure 17.13:

Figure 17.13: The results of the Fisher’s ANOVA test.

In Figure 17.13, F = 6.12 is the F-statistic:

\[F= \frac{\text{variation between sample means}}{\text{variation within the samples}}\]

Note that we compare this value to an F-distribution (F-test). The degrees of freedom in the numerator (df1) and the denominator (df2) are 3 and 56, respectively.
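The ratio and its degrees of freedom can be sketched in a few lines. With k groups and N observations in total, df1 = k - 1 and df2 = N - k, which gives 3 and 56 for four diets and 60 participants. The example below uses a tiny hypothetical dataset rather than the dataDWL values, and the function name is an assumption made for this illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df1, df2) for a one-way ANOVA with equal variances."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation between sample means (between-groups sum of squares)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation within the samples (within-groups sum of squares)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df1, df2 = k - 1, n_total - k
    f_stat = (ss_between / df1) / (ss_within / df2)
    return f_stat, df1, df2


# Hypothetical data: three small groups
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df1, df2 = one_way_anova_f(groups)
print(f"F = {f_stat:.2f}, df1 = {df1}, df2 = {df2}")
```

The p-value is then the area to the right of the observed F under the F-distribution with df1 and df2 degrees of freedom, which Jamovi reports directly.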

The p-value = 0.001 is less than 0.05, so we reject the \(H_0\) of the ANOVA test. There is at least one diet whose mean weight loss differs from the others.

Run post-hoc tests

Perform post-hoc tests

A significant one-way ANOVA is generally followed by post-hoc tests to perform multiple pairwise comparisons between groups. From the One-Way ANOVA dialogue box, click on the Post-Hoc Tests section. We have the following two options:

Games-Howell (unequal variances)
Tukey (equal variances)

Based on the result of Levene’s test (p = 0.583 > 0.05, so the \(H_0\) is not rejected) (Figure 17.9), we should select the Tukey (equal variances) post-hoc test. Additionally, tick the Flag significant comparisons box, as shown below (Figure 17.14):

Figure 17.14: Select the appropriate post-hoc test. For equal variances, select the Tukey post-hoc test.

Once we have selected our post-hoc test, a table will appear in the output window on our right-hand side, as shown below (Figure 17.15):

Figure 17.15: Table with the results of the Tukey post-hoc test.

Interpretation

Pairwise comparisons were carried out using the method of Tukey and the adjusted p-values were calculated. The mean weight loss with diet C is significantly larger than with diet A (mean difference = 2.93 kg, p = 0.005 < 0.05) or diet B (mean difference = 3.21 kg, p = 0.002 < 0.05).

17.3 Kruskal-Wallis test

The Kruskal-Wallis test is a rank-based non-parametric alternative to the one-way ANOVA and an extension of the Mann-Whitney U test to allow the comparison of more than two independent groups. It’s usually recommended when the normality assumption of one-way ANOVA test is not met (non-normal distributions).
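To illustrate what "rank-based" means here, the following minimal sketch computes the Kruskal-Wallis H statistic: all observations are pooled and ranked (tied values share their average rank), and the statistic measures how far the rank sums of the groups deviate from what equal distributions would produce. The data are hypothetical, the function name is an assumption, and the sketch does not handle the degenerate case where every value is identical.

```python
def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic."""
    # Pool all observations, remembering which group each came from
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)

    # Assign ranks 1..n, giving tied values their average rank
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[t] = avg_rank
        size = j - i + 1
        tie_term += size ** 3 - size
        i = j + 1

    # Sum of ranks within each group
    rank_sums = [0.0] * len(groups)
    for (_, gi), r in zip(pooled, ranks):
        rank_sums[gi] += r

    h = 12 / (n * (n + 1)) * sum(
        rs ** 2 / len(g) for rs, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)  # equals 1 when there are no ties
    return h / correction


# Hypothetical data: three small groups with no overlap
print(round(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), 2))
```

For moderately large samples, H is usually compared to a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups.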

Opening the file

Open the dataset named “dataVO2” from the file tab in the menu:

The dataset “dataVO2” includes 30 participants and has two variables: the numeric VO2max variable and the categorical sport variable (with levels rowers, runners, and triathletes).

Research question

We want to compare the VO2max in three different sports (runners, rowers, and triathletes).

Hypothesis Testing for the Kruskal-Wallis test

Null hypothesis and alternative hypothesis

\(H_0\): the distribution of \(VO2_{max}\) is the same in all groups (the medians of \(VO2_{max}\) in the three sports are the same)
\(H_1\): there is at least one group with \(VO2_{max}\) distribution different from the others (there is at least one sport with median \(VO2_{max}\) different from the others)

Explore the descriptive characteristics of distribution for each group and check for normality

The distributions can be explored visually with appropriate plots. Additionally, summary statistics and significance tests to check for normality (e.g., Shapiro-Wilk test) can be used.

On the Jamovi top menu, navigate to Analyses > Exploration > Descriptives.

The Descriptives dialogue box opens. Drag the variable VO2max into the Variables field and split it by the sport variable. Additionally, select Variables across rows, as shown below (Figure 17.17):

Figure 17.17: The Descriptives dialogue box. Split the variable VO2max by sport group and select Variables across rows.

We can now select the relevant descriptive statistics such as Percentiles, Skewness, Kurtosis and the Shapiro-Wilk test from the Statistics section (Figure 17.3):

Select the descriptive statistics of interest from the Statistics section.

Once we have selected our descriptive statistics, a table will appear in the output window on our right-hand side, as shown below (Figure 17.18):

Figure 17.18: Descriptive statistics of VO2max by sport group (click on the image to zoom in).

The sample size is relatively small. Moreover, the skewness (1.51) and the kurtosis (1.6) for the triathletes fall outside the range [-1, 1], indicating a non-normal distribution. The Shapiro-Wilk test also indicates that the data for the triathletes are not normally distributed (p = 0.023 < 0.05).

Next, in the Plots section we can visualize the distribution of VO2max for the three groups. From Box Plots choices tick the boxes of Box Plot, Violin, Data and Mean, as shown below (Figure 17.19):

Figure 17.19: In the Plots section, tick the boxes under Box Plots.

A graph (Figure 17.20) is generated in the output window on our right-hand side:

The above figure shows that the data in triathletes group have some outliers. Additionally, we can observe that the runners group seems to have the largest VO2max.

Considering all of the information together (small samples, graphs, normality test), the overall decision is against normality.

Run the Kruskal-Wallis test

Perform Kruskal-Wallis test in Jamovi

We will perform the Kruskal-Wallis test to test the null hypothesis that the median \(VO2_{max}\) is the same for all sports.

On the Jamovi top menu, navigate to Analyses > ANOVA > One-Way ANOVA (Kruskal-Wallis), listed under Non-Parametric.

The One-Way ANOVA (Non-parametric) dialogue box opens. Drag and drop the numeric variable VO2max into Dependent Variables and the independent variable sport into Grouping Variable, as shown below (Figure 17.21):

Figure 17.21: One-Way ANOVA (Non-parametric) dialogue box. Drag and drop the VO2max into Dependent Variables and the sport into Grouping Variable.

The results of the Kruskal-Wallis test are presented in the table of the Figure 17.22:

Figure 17.22: The results of the Kruskal-Wallis test.

The p-value (<0.001) is lower than 0.05. There is at least one sport in which the VO2max is different from the others.

Run post-hoc tests

Perform post-hoc tests

A significant Kruskal-Wallis test is generally followed by post-hoc tests to perform multiple pairwise comparisons between groups. Jamovi provides a pertinent test named DSCF pairwise comparisons:

Figure 17.23: Check the DSCF post-hoc test.

A table appears in the output window on our right-hand side, as shown below (Figure 17.24):

Figure 17.24: Table with the results of the post-hoc tests.

Interpretation

Pairwise comparisons were carried out using the method of Dwass-Steel-Critchlow-Fligner and adjusted p-values were calculated. The runners’ VO2max (median = 77.2, IQR = [74.2, 79.5] mL/kg/min) is significantly higher than that of rowers (69.6 [67.8, 73.1] mL/kg/min) (p = 0.005 < 0.05) and triathletes (65.4 [64.0, 67.6] mL/kg/min) (p = 0.003 < 0.05).