The Kruskal-Wallis test is a non-parametric test for comparing more than 2 samples, on data that may be means, frequencies, or variances.

## Introduction

The Kruskal-Wallis test is the generalization of the Wilcoxon–Mann-Whitney test to more than 2 samples. It was developed in the 1950s [1], initially as an alternative to the ANOVA when the normality assumption is not acceptable. It is used to test whether k samples come from the same population, or from populations with identical characteristics, in the sense of a location parameter (the location parameter is conceptually close to the median, but the Kruskal-Wallis test takes into account more information than the median alone).

## The principle

Like any non-parametric test, the Kruskal-Wallis test compares ranks of data. It can thus compare means, frequencies, or even variances, which are processed as ranks.

## Step 1: Assumptions

As each sample is translated into classes and frequencies, the distributions are compared to determine whether one or more of them differ. The following hypotheses are posed:

• H0: The distributions are equal
• H1: At least one distribution differs from the others

## Step 2: Calculate the sum of the ranks per sample

Like the Wilcoxon–Mann-Whitney test, the Kruskal-Wallis statistic uses the sum of ranks. A new variable is introduced: the sum of the ranks of each sample. This has 2 consequences:

• The distribution of the data necessarily becomes symmetrical regardless of the initial distribution. Through this rank transformation, we approach a normal law.
• The impact of outliers is reduced, or even eliminated.

### 2.1 Identify the rank of each value

The rank of each value is given in relation to the set of values of all the samples pooled together. The difficulty lies in the case of ties. For these, the mid-rank method is used: tied values are given the average of the ranks they would occupy.

For example:

• If 2 equal values occupy the 8th and 9th places, they are both given the rank 8.5.
• If 3 equal values occupy the 10th, 11th and 12th places, they are all given the rank 11.
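The mid-rank rule can be sketched in a few lines of Python (a pure-Python illustration, not taken from the original article):

```python
def midranks(values):
    """Rank values; tied values receive the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # Extend j over the run of values equal to values[order[i]]
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mid
        i = j + 1
    return ranks

# Two values tied for the 8th and 9th places both receive rank 8.5
print(midranks([1, 2, 3, 4, 5, 6, 7, 8, 8]))
```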

### 2.2 Calculating the sum of the ranks for each sample

SRk = sum of the ranks of the values in sample k
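As a sketch with hypothetical data, the rank sums SRk can be computed by pooling the samples, ranking the pooled values, and summing per sample; `scipy.stats.rankdata` applies the mid-rank rule by default:

```python
from scipy.stats import rankdata

# Hypothetical samples; pool them, rank the pooled values, then sum per sample
samples = [[6.1, 5.9, 6.3], [5.2, 5.4, 5.1], [6.0, 5.5]]
pooled = [v for s in samples for v in s]
ranks = rankdata(pooled)  # mid-ranks over all N values

rank_sums, i = [], 0
for s in samples:
    rank_sums.append(ranks[i:i + len(s)].sum())
    i += len(s)

# The rank sums always total N(N + 1) / 2
print(rank_sums)
```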

## Step 3: Practical value

In calculating the practical value, we see similarities with the ANOVA calculation. Below is the simplified formula; the original formula calculates the inter-class variability, i.e. the dispersion of the means of the samples around the global mean:

H = 12 / (N(N + 1)) × Σ (SRk² / nk) − 3(N + 1)

With:

• SRk: sum of the ranks of the individuals in sample k
• nk: size of sample k
• N: total number of individuals across all samples
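A minimal sketch of the simplified formula, using hypothetical data (the rank sums SRk computed as in step 2.2):

```python
from scipy.stats import rankdata

samples = [[6.1, 5.9, 6.3], [5.2, 5.4, 5.1], [6.0, 5.5]]  # hypothetical data
pooled = [v for s in samples for v in s]
ranks = rankdata(pooled)
N = len(pooled)

# Rank sums SRk per sample
sr, i = [], 0
for s in samples:
    sr.append(ranks[i:i + len(s)].sum())
    i += len(s)

# H = 12 / (N(N + 1)) * sum(SRk^2 / nk) - 3(N + 1)
H = 12 / (N * (N + 1)) * sum(srk**2 / len(s) for srk, s in zip(sr, samples)) - 3 * (N + 1)
print(H)
```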

### Taking into account duplicates

In the event that we have ties, whether within one sample or across several, the practical value must be adjusted to take them into account. The correction factor is:

C = 1 − Σ (tg³ − tg) / (N³ − N)

and the corrected statistic is H / C. With:

• N: total number of individuals across all samples
• tg: the number of observations tied at the value in question. If, for example, we have 2 values equal to 6, then tg is 2.
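A sketch of the tie correction on hypothetical pooled data (the corrected statistic is then obtained by dividing H by C):

```python
from collections import Counter

# Hypothetical pooled values: 6 appears twice, 8 three times
pooled = [5, 6, 6, 7, 8, 8, 8, 9]
N = len(pooled)

# C = 1 - sum(tg^3 - tg) / (N^3 - N), summed over each group of tied values
C = 1 - sum(t**3 - t for t in Counter(pooled).values() if t > 1) / (N**3 - N)
print(C)  # the corrected statistic is H / C
```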

## Step 4: Critical value

### Case 1: Small number of samples and/or few individuals per sample

We choose from the table the case that interests us: the number of samples (3, 4 or 5), then the combination of the numbers of individuals per sample.

This allows us to identify the line concerning us.

Then we choose the column based on the α value that we have chosen.

For example, if we have 3 samples of respective sizes 5, 4 and 3, and we have chosen a risk of 5%, we get a critical value of 5.656.

### Case 2: Number of samples > 5 and/or number of individuals per sample > 5

The statistic H follows a χ² law with k − 1 degrees of freedom (k being the number of samples). Since we have more than 2 samples, the notion of one-sidedness cannot be applied in any meaningful way; the critical value is therefore computed only two-sided. In Excel, the formula is as follows:

Critical value = CHIINV(α; k − 1)

With k the number of samples
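Outside Excel, the same critical value can be obtained from the χ² quantile function, for example with SciPy:

```python
from scipy.stats import chi2

alpha, k = 0.05, 3  # chosen risk and number of samples
# Equivalent to Excel's CHIINV(alpha; k - 1): right-tail inverse of chi-square
critical = chi2.ppf(1 - alpha, k - 1)
print(critical)
```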

## Step 5: p-Value

The significance of the result is tested by calculating the p-value. In our case, the practical value follows a χ² law with k − 1 degrees of freedom (k being the number of samples). In Excel, the formula is as follows:

p-value = CHIDIST(practical value; k − 1)
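Excel's CHIDIST corresponds to the χ² survival function; with SciPy one can compute the p-value by hand and cross-check it against `scipy.stats.kruskal` (hypothetical data):

```python
from scipy.stats import chi2, kruskal

samples = [[6.1, 5.9, 6.3], [5.2, 5.4, 5.1], [6.0, 5.5]]  # hypothetical data
H, p = kruskal(*samples)

k = len(samples)
p_by_hand = chi2.sf(H, k - 1)  # CHIDIST(practical value; k - 1)

alpha = 0.05
reject_H0 = p <= alpha  # decision rule used in step 6
print(H, p, reject_H0)
```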

## Step 6: Interpretation

| Result | Statistical conclusion | Practical conclusion |
|---|---|---|
| Practical value < critical value | We retain H0 | There is no significant difference between the samples, with a risk α of being wrong |
| Practical value > critical value | We reject H0 | There is a significant difference between the samples, with a risk α of being wrong |

| Result | Statistical conclusion | Practical conclusion |
|---|---|---|
| p-value > α | We retain H0 | There is no difference between the samples, with a risk of being wrong of p-value % |
| p-value ≤ α | We reject H0 | There is a difference between the samples, with a risk of being wrong of p-value % |

## Step 7: Identify the group that differs

In the case where the null hypothesis has been rejected and the p-value is less than the risk α, we conclude that at least one group of measurements differs from the others. The question is then: which group differs?

If we just want a rough idea and the number of groups is small (less than 5), a simple pairwise comparison is enough: the difference is calculated for each combination of groups. If, on the contrary, we have many groups and correctly identifying the group that differs matters, we use a post-hoc test.
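One common pairwise follow-up (not the article's own formula, which is not reproduced here) is a Mann-Whitney test on each pair of groups with a Bonferroni-adjusted risk; a sketch with hypothetical data:

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

samples = {"A": [6.1, 5.9, 6.3], "B": [5.2, 5.4, 5.1], "C": [6.0, 5.5]}  # hypothetical
pairs = list(combinations(samples, 2))
alpha_adj = 0.05 / len(pairs)  # Bonferroni-adjusted risk per comparison

results = {}
for a, b in pairs:
    results[(a, b)] = mannwhitneyu(samples[a], samples[b], alternative="two-sided").pvalue

for pair, p in results.items():
    print(pair, round(p, 4), "significant" if p <= alpha_adj else "not significant")
```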

## Step 8: Calculate the difference level

Finally, the last step of the Kruskal-Wallis test is to quantify the level of difference between the group or groups that differ and the remaining groups.

## Sources

1 - W. H. Kruskal, W. A. Wallis (1952) – Use of ranks in one-criterion variance analysis

P. Sprent (1992) – Practice of non-parametric statistics

P. Capéraà, B. Van Cutsem (1988) – Methods and models in non-parametric statistics

S. Champely (2004) – Statistics really applied to sport