Introduction
This test, very popular because of its simplicity, was published in 1965 by Samuel Shapiro and Martin Wilk (a Canadian statistician) [1]. It is particularly effective for samples with fewer than 50 observations [2].
The principle
The test statistic can be read as the square of a correlation coefficient between the series of quantiles generated from the normal distribution and the empirical quantiles obtained from the data. Therefore, the closer the statistic is to 1, the more closely our data follow a normal distribution.
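The principle can be illustrated with a minimal sketch that correlates the sorted data with theoretical normal quantiles and squares the result. This is not the exact Shapiro-Wilk statistic, whose tabulated coefficients also account for the dependence between ordered observations; it is only the simpler quantile-correlation idea. The function name `quantile_correlation_squared` and the plotting positions (i - 0.375) / (n + 0.25) are assumptions made for this illustration.

```python
from statistics import NormalDist


def quantile_correlation_squared(data):
    """Squared correlation between sorted data and normal quantiles.

    Illustrative approximation only, not the exact Shapiro-Wilk W.
    """
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # Theoretical normal quantiles at assumed plotting positions
    # (i - 0.375) / (n + 0.25), for i = 1..n.
    std_normal = NormalDist()
    q = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

    mean_x = sum(x) / n
    mean_q = sum(q) / n
    s_xq = sum((xi - mean_x) * (qi - mean_q) for xi, qi in zip(x, q))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_qq = sum((qi - mean_q) ** 2 for qi in q)
    if s_xx == 0:
        raise ValueError("all observations are identical")

    return s_xq ** 2 / (s_xx * s_qq)


# Made-up samples: one roughly symmetric, one with a strong outlier.
symmetric = [4.1, 4.8, 5.0, 5.2, 5.9]
skewed = [1, 1, 1, 2, 2, 3, 50]
print(quantile_correlation_squared(symmetric))
print(quantile_correlation_squared(skewed))
```

The symmetric sample should give a value much closer to 1 than the sample with the outlier, which is the behaviour described above.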
Step 1: Hypotheses
We state the following hypotheses:
- H0: Our data follow a normal distribution
- H1: Our data do not follow a normal distribution
Step 2: Calculate the practical value
The practical value (the observed statistic W) is calculated in several steps, described below.
- Sort the n observations in increasing order: $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$
- Calculate the differences between symmetric pairs of ordered values, $d_i = x_{(n-i+1)} - x_{(i)}$, for $i = 1, \dots, \lfloor n/2 \rfloor$
- Read the coefficient $a_i$ corresponding to each difference in the specific Shapiro-Wilk table.
- Then calculate the numerator $b^2 = \left(\sum_i a_i d_i\right)^2$
- Then calculate the denominator $Z^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$, where $\bar{x}$ is the sample mean
- Finally, calculate the practical value $W = b^2 / Z^2$
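The steps above can be sketched in a few lines of Python. The function name `shapiro_wilk_w` is hypothetical, and the coefficients are passed in by the caller because they must come from a Shapiro-Wilk table. The usage example relies on the case n = 3, where the single coefficient is 1/√2 ≈ 0.7071; the data values are made up.

```python
import math


def shapiro_wilk_w(data, coeffs):
    """Compute W = b^2 / Z^2 from the data and tabulated coefficients.

    coeffs holds a_1 .. a_k with k = n // 2, read from a Shapiro-Wilk table.
    """
    x = sorted(data)              # order the observations
    n = len(x)
    k = n // 2
    if len(coeffs) != k:
        raise ValueError(f"expected {k} coefficients for n = {n}")

    # d_i = x_(n-i+1) - x_(i); with 0-based indexing this is x[n-1-i] - x[i]
    diffs = [x[n - 1 - i] - x[i] for i in range(k)]

    # numerator b^2 = (sum of a_i * d_i)^2
    b2 = sum(a * d for a, d in zip(coeffs, diffs)) ** 2

    # denominator Z^2 = sum of squared deviations from the mean
    mean = sum(x) / n
    z2 = sum((xi - mean) ** 2 for xi in x)
    if z2 == 0:
        raise ValueError("all observations are identical")

    return b2 / z2


# n = 3: the only coefficient is a_1 = 1/sqrt(2)
a_n3 = [1 / math.sqrt(2)]
print(round(shapiro_wilk_w([2.0, 1.0, 6.0], a_n3), 4))  # prints 0.8929
```

Here the sorted data are 1, 2, 6, so d_1 = 5, b² = 12.5, Z² = 14 and W = 12.5 / 14. For larger samples the same function applies once the appropriate coefficients have been read from the table.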
Step 3: Calculating the critical value
The critical value is read from the Shapiro-Wilk tables for the chosen risk level α and the number of observations n in our sample.
Step 4: Interpretation
Given the hypotheses stated in Step 1, the test is interpreted as follows:
Result | Statistical conclusion | Practical conclusion |
---|---|---|
Practical value ≥ Critical value | We retain H0 | Our data follow the normal distribution at the chosen risk level α. |
Practical value < Critical value | We reject H0 | Our data do not follow the normal distribution at the chosen risk level α. |
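The decision rule in the table can be written as a small helper. Note the direction: a W below the critical value leads to rejection. The function name `interpret` is hypothetical, and both numbers in the usage line are placeholders, not values taken from a Shapiro-Wilk table.

```python
def interpret(w, w_crit):
    """Apply the decision rule: retain H0 when W >= critical value."""
    if w >= w_crit:
        return "retain H0"   # data compatible with a normal distribution
    return "reject H0"       # data not compatible with a normal distribution


# Placeholder numbers for illustration only.
print(interpret(w=0.95, w_crit=0.90))  # prints retain H0
```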
Source
1 – S. Shapiro, M. Wilk (1965) – An analysis of variance test for normality
2 – R. Rafiq (2011) – Test of normality
Standard NF X 06-050 (1995) – Study of the normality of a distribution