The normal probability plot is a graphical analysis used to check whether a data set is distributed in a "normal" way.
From our original data, we build a series of "theoretical" data using the normal distribution. By drawing a graph with our original values in X and our theoretical values in Y, we should get a straight line if the data are normal.
The value of the tool lies in its simplicity. It is probably less accurate than hypothesis testing, but its main advantage is that in practice it is effective enough to guide the necessary investigations and reach a conclusion.
Note simply that this method is effective only when the number of points n is greater than 10 [1].
1 - Calculate the centered and reduced (standardized) value of our data
To do this, the following formula is applied to each value: (X – X̄)/σ, where X̄ is the mean of the data and σ its standard deviation.
We then draw a scatter plot with:
- In X: the value of our initial data
- In Y: the centered and reduced value of our data
If our data follow a normal distribution, the scatter plot must form a straight line.
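As an illustration of step 1, here is a minimal Python sketch of the centering and reduction. The sample values and the names `standardize`, `data`, and `points` are hypothetical, and the use of the sample standard deviation (n – 1 denominator) is an assumption, since the text does not say which estimator to use.

```python
from statistics import mean, stdev


def standardize(values):
    """Return the centered and reduced value (X - mean) / sigma for each X."""
    x_bar = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation (n - 1)
    return [(x - x_bar) / sigma for x in values]


# Hypothetical measurements (n = 11), used only to show the calculation.
data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9, 5.4]

z = standardize(data)

# (X, Y) pairs for the scatter plot: X = initial value, Y = standardized value.
points = list(zip(data, z))
```

After this transformation the standardized values have a mean of 0 and a standard deviation of 1, which is what "centered and reduced" means.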
2 - Calculate the confidence interval
We then calculate the confidence interval around the normal distribution function fitted to our data. This interval forms two hyperbolas, one below and one above the line that represents our data. To do this:
- Sort the data from the smallest to the largest
- For each value, a theoretical value (cumulative frequency) is calculated from the following distribution function [2]: F = (i – 0.375)/(n + 0.25), where i is the rank of the value in the sorted data and n is the number of points
- The successive quantiles (Z values) are then calculated from the standard (centered and reduced) normal distribution [3] (function NORM.S.INV in Excel)
- Finally, we square each of the Z values obtained and add them up.
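The four steps above can be sketched in Python as follows. This is only an illustration: `normal_scores` and the sample data are hypothetical names and values, and `statistics.NormalDist().inv_cdf` from the standard library is used here as a stand-in for Excel's NORM.S.INV.

```python
from statistics import NormalDist


def normal_scores(values):
    """Sort the data, then compute cumulative frequencies, Z quantiles and sum of Z squared."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal distribution: mean 0, sigma 1

    # Cumulative frequency of the i-th sorted value: F = (i - 0.375) / (n + 0.25)
    freqs = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]

    # Quantile (Z value) for each frequency, as NORM.S.INV would return.
    quantiles = [std_normal.inv_cdf(f) for f in freqs]

    # Sum of the squared Z values.
    sum_sq = sum(q * q for q in quantiles)
    return ordered, freqs, quantiles, sum_sq


# Hypothetical measurements (n = 11), used only to show the calculation.
data = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9, 5.4]
ordered, freqs, quantiles, sum_sq = normal_scores(data)
```

With an odd number of points the middle frequency is exactly 0.5, so the middle Z value is 0 and the other quantiles are symmetric around it.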
All that remains is to calculate the confidence interval using the following formula:
- n: the number of points
- σ: the standard deviation of the source data
- Zi: the quantile of our data according to the normal distribution function
- z: the quantile of the normal distribution for a risk of α/2, with n – 2 degrees of freedom. Most often we take a confidence level of 95%, which gives a z value of 1.96.
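The 1.96 value quoted for a 95% confidence level can be recovered with a short Python sketch. The helper name `two_sided_z` is hypothetical, and the sketch uses the standard normal distribution only: it ignores the n – 2 degrees of freedom mentioned above, because the standard library has no Student quantile function.

```python
from statistics import NormalDist


def two_sided_z(confidence):
    """Return the standard normal quantile leaving a risk of alpha/2 in each tail."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


z_95 = two_sided_z(0.95)
print(round(z_95, 2))  # prints 1.96
```

Other confidence levels can be obtained the same way, for example `two_sided_z(0.99)`.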
If our data follow a normal distribution, the points must lie on a straight line, and that straight line must stay between the two curves representing the confidence level of our study.
|Observation|Interpretation|Action|
|---|---|---|
|Some points fall off the imaginary straight line.|These are probably outliers.|Check whether a measurement error is the cause. If so, redo the measurements or remove those points from the study.|
|Several data points are shifted to the right.|The data distribution is non-normal and shifted to the right. It may also be a truncated normal distribution.|Remove the data that are too far away and treat the rest as following a normal distribution.|
|Several data points are shifted to the left.|Same as the previous case, except that this time our data are shifted to the left.| |
|One or both ends are below or above the imaginary straight line.|The distribution is "normal" but "heavy-tailed", that is, our data have a normal distribution but slightly flattened.| |
|There is ONE inflection point relative to the imaginary line.|We probably have 2 groups of data with either the same variance or the same mean, both following a normal distribution.|Look for the 2 groups of values.|
If the case we are facing does not match any of the above, we conclude that our data do not follow a normal distribution.
1 – Standard NF X 06-050 – Study of normality of a distribution
2 – G. Saporta (2006) – Data analysis and statistics
3 – R. Sneyers (1974) – On tests of normality
R. Rafiq (2011) – Normality Tests