The normal law is one of the pillars of statistics. Its name comes from the Latin “norma”, meaning “square”: at that time, the carpenter's square was used to check the rightness of a construction, and in everyday language normality took on the meaning of “following the rule, the habits ...”. Well known, it is used in every field that relies on statistics: finance, psychology, anatomy ... It approximates many observed statistical distributions (the more data we have, the more other statistical laws tend towards it: binomial, hypergeometric, Student ...).
“Everyone believes in it, however, because the experimenters imagine that it is a theorem of mathematics, and the mathematicians that it is an experimental fact.” Remark by G. Lippmann (renowned physicist) reported by Henri Poincaré (renowned mathematician) in 1896 about the normal law.
It turns out that the shape of this diagram will look a lot like the bell curve above. There is an average value for the height of these children (some are taller, others shorter) and a good proportion of them have a height that does not stray too far from the average.
Its origin lies in the calculus of chances, also called the law of chance, applied to the game of heads or tails around 1730 (note that, well before, Pascal had already published on this law in 1654, and the binomial law was introduced by Jacob Bernoulli in a posthumous publication of 1713). We are interested in whether or not a given result occurs. We play heads or tails a hundred times. As everyone knows, with a well-balanced coin you get pretty much the same number of tails as heads, sometimes a few more heads, sometimes a few more tails.
But how can we demonstrate that? What gap between tails and heads can be considered reasonable?
Pascal asked the question in 1654, and others had done so before him, for example the Chinese mathematician Yang Hui (around 1261). But it was Laplace who, in 1786, discovered this curve and developed the normal law.
The torsos of the Scots
In the 1820s, Quételet, a recognized mathematician, managed to convince the administration of the Kingdom of the Netherlands to build an astronomical observatory in Brussels. To prepare this project, he spent a few months at the Observatoire de Paris in 1823 to meet the astronomers and mathematicians Alexis Bouvard, François Arago, Pierre-Simon de Laplace, Joseph Fourier and Siméon Denis Poisson. This stay was decisive for his career: it introduced him to the use that astronomers made of the calculus of probabilities to control measurement errors.
Quételet wondered whether human and social phenomena might show the same regularities in their distribution as natural phenomena. By taking the measurements of French conscripts and analyzing those of 5,000 Scottish soldiers published in 1817 in the Edinburgh Medical Journal, Quételet found that human biometric data, such as weight, height and chest circumference, were distributed along a normal curve. This is why he is considered one of the founders of anthropometry and biostatistics. He found that these data fluctuated around average values and that these average values tended to be constant. Quételet thus became one of the first to use the normal curve for something other than the distribution of errors in astronomy and physics (one of the first only, because Laplace, Fourier, Poisson, von Bortkiewicz and others had already applied the normal law to socio-economic data). He then extended these notions to all physical characteristics by creating the notion of the average man, which he presented in 1835 in his book On Man and the Development of His Faculties, or Essay on Social Physics.
The centered reduced normal law
The centered reduced normal law (also called the standard normal law) is a special case of the normal law. A change of variable transforms any normal curve into this single standard curve, whose characteristics are easy to measure, and this in turn lets us make predictions about the numerical values of the initial normal curves.
This technique consists in subtracting the average of the values from our value and dividing the result by the standard deviation. This centers the distribution of our data on the vertical axis (mean 0) and gives it a standard deviation of 1, which makes it easier to read and analyze. The formula is the following: Z = (X - Xbar) / σ
This Z score represents how many standard deviations an X value lies from the mean.
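As an illustration of this change of variable, here is a minimal Python sketch. The function name z_score and the numbers (a height of 130 cm in a population with mean 120 cm and standard deviation 5 cm) are hypothetical and chosen only for the example. NormalDist comes from Python's standard statistics module.

```python
from statistics import NormalDist

def z_score(x: float, mean: float, sigma: float) -> float:
    """Number of standard deviations separating x from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mean) / sigma

# Hypothetical example: height 130 cm, population mean 120 cm, sigma 5 cm
z = z_score(130.0, 120.0, 5.0)

# Share of the population below that height, read on the standard curve
share_below = NormalDist().cdf(z)

print(f"Z = {z}, share below = {share_below:.4f}")
```

Once the value is expressed as a Z score, a single standard curve is enough to read the probability, whatever the original mean and standard deviation were.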
The central limit theorem
This fundamental theorem of probability theory explains why, in many concrete situations, the diagram describing the distribution of a very general random phenomenon converges towards a normal law.
Let's take the game of heads or tails: we throw the coin n times and count the number of times it lands on tails. The theorem asserts that the diagram representing the probabilities of landing on tails k times in a game of n throws approaches a bell shape when the number n of throws goes to infinity.
This is a central theorem on passing to the limit in probability theory (« der zentrale Grenzwertsatz der Wahrscheinlichkeitsrechnung »). This expression of Pólya (1920) became, oddly, « Central Limit Theorem » in English and, in French, « théorème limite central », « théorème central limite » or « théorème de la limite centrale ». This theorem about the game of heads or tails can be generalized.
In its most general form, the theorem tells us that, under certain conditions, the distribution of a sum or a difference of independent random variables tends towards the normal law (as the binomial and Student laws do).
Download our file “Illustration of the central limit theorem” to see this phenomenon for yourself.
This is why we meet these bells everywhere. As soon as a phenomenon is the sum of a large number of independent random causes, a bell appears. And this holds regardless of the nature of the individual random causes, which may well follow another law of probability: the heads-or-tails law, for example, is binomial when n is small but tends towards the normal law when n is large. This is one of the most striking examples of universality in mathematics: by adding up a large number of random effects about which we know nothing, we obtain a sum whose distribution follows a normal law.
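The heads-or-tails case can be checked with a small simulation. The sketch below is only an illustration: the game sizes (100 throws, 5,000 games), the fixed seed and the helper name count_tails are our own assumptions. It compares the simulated counts of tails with the normal law that has the same mean and standard deviation.

```python
import random
from statistics import NormalDist, mean, pstdev

def count_tails(n_throws: int, rng: random.Random) -> int:
    """Throw a fair coin n_throws times and count the tails."""
    return sum(rng.random() < 0.5 for _ in range(n_throws))

rng = random.Random(0)          # fixed seed so the run is repeatable
n_throws, n_games = 100, 5000   # illustrative sizes (assumption)
counts = [count_tails(n_throws, rng) for _ in range(n_games)]

# Theory for a fair coin: mean n/2, standard deviation sqrt(n)/2
mu, sigma = n_throws / 2, n_throws ** 0.5 / 2

# Share of games within two standard deviations of the mean ...
observed = sum(abs(c - mu) <= 2 * sigma for c in counts) / n_games
# ... compared with the normal curve (0.5 added: counts are whole numbers)
approx = NormalDist(mu, sigma)
predicted = approx.cdf(mu + 2 * sigma + 0.5) - approx.cdf(mu - 2 * sigma - 0.5)

print(f"simulated mean {mean(counts):.2f}, std dev {pstdev(counts):.2f}")
print(f"within two std devs: observed {observed:.3f}, normal {predicted:.3f}")
```

With these sizes the observed share and the share predicted by the normal curve should be close. Increasing n_throws brings the histogram of counts closer to the bell shape.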
This theorem is essential in the theory of errors, and that is what Gauss was primarily interested in. If I measure the length of a table a large number of times with my ruler, the results will tend to follow a normal distribution, and about 95% of them will lie within two standard deviations of the average. This confidence interval of two standard deviations on either side is what physicists call “the uncertainty of measurement”.
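The 95% figure can be checked directly on the standard normal curve. A minimal sketch using Python's standard statistics module (the variable names are ours):

```python
from statistics import NormalDist

std = NormalDist()  # standard normal law: mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean
within = {k: std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)}

# Half-width, in standard deviations, of the interval holding exactly 95%
k95 = std.inv_cdf(0.975)

for k, p in within.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
print(f"exact 95% interval: +/- {k95:.3f} standard deviations")
```

Two standard deviations actually cover slightly more than 95% of the results. The exact 95% interval is a little narrower, which is why “two standard deviations” is a convenient rounding.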
Below we represent three centered normal laws, all having the same average but different standard deviations:
In red: σ = 1
In blue: σ = 2
In green: σ = 0.5
Thus, the higher the dispersion, the flatter the curve, and vice versa.
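To put numbers on this flattening, the sketch below computes the height of each of the three curves at its peak. It assumes a common mean of 0 and reuses the colours and standard deviations listed above.

```python
from statistics import NormalDist

# Same mean (0, an assumption), different standard deviations
sigmas = {"green": 0.5, "red": 1.0, "blue": 2.0}

# Height of each curve at its peak, i.e. the density at the mean
peaks = {colour: NormalDist(0.0, s).pdf(0.0) for colour, s in sigmas.items()}

for colour, peak in peaks.items():
    print(f"{colour}: sigma = {sigmas[colour]}, peak height = {peak:.4f}")
```

Doubling the standard deviation halves the peak height, since the area under each curve must remain equal to 1.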
The Galton board
Example of use
We are a pharmaceutical company and we produce syrup bottles of different capacities (100 ml, 200 ml ...). The filler of our packaging line never fills exactly 100 or 200 ml. There is a standard deviation that we do not know how to improve without changing the machine, an option that is not being considered for the moment.
We wish, according to quality standards, that at least 95% of our syrup bottles contain more than 99ml for 100ml bottles and more than 198ml for 200ml bottles. The question is this:
To what capacity setting should we adjust our machine to ensure this condition?
1. Data collection
First, we identify the statistical law that our equipment follows. For this, we fill 100 vials of 100ml with a setting of 100ml, and we calculate the standard deviation. We find that our data follow a normal distribution with a standard deviation of 1.5ml.
2. Formalization of the problem
Given the statement, we look for the minimum setting value Xbar of our machine that gives us:
- For bottles of 100ml : P(X ≥ 99ml) ≥ 0.95.
- For bottles of 200ml : P(X ≥ 198ml) ≥ 0.95
3. Resolution of problem
We identify our setting value using the formula of the centered reduced normal law. This leads us to solve the following equation: Z = (X - Xbar) / σ
- Z: the standardized value corresponding to our probability. In our case, since we want a 95% probability of being above our limit value, 5% of the curve must lie below it, which gives the value -1.64485 (one-sided test, since we only have a minimum limit to meet).
- X: our minimum value, 99 or 198 depending on the type of bottle.
- Xbar: the value of the setting we want to identify.
- σ: the standard deviation of our equipment, here 1.5 (the same value is used for both bottle sizes)
From where we get the following equations :
- For bottles of 100ml: Xbar = X - Z * σ = 99 - (-1.64485 * 1.5) = 101.47
- For bottles of 200ml: Xbar = X - Z * σ = 198 - (-1.64485 * 1.5) = 200.47
For our 100ml bottles, if we want to be 95% sure that our bottles are filled to more than 99ml, we must set our equipment to 101.47ml.
Similarly, for our 200ml bottles, if we want to be 95% sure that our bottles are filled to more than 198ml, we must set our equipment to 200.47ml.
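The resolution above can be reproduced in a few lines of Python. This is a sketch under the assumptions of the example (normal distribution, standard deviation of 1.5 ml for both bottle sizes). The function name minimum_setting and its parameters are hypothetical.

```python
from statistics import NormalDist

def minimum_setting(limit_ml: float, sigma_ml: float,
                    coverage: float = 0.95) -> float:
    """Smallest mean fill such that P(X >= limit_ml) >= coverage."""
    # Z leaving (1 - coverage) of the curve below it: about -1.64485 for 95%
    z = NormalDist().inv_cdf(1.0 - coverage)
    return limit_ml - z * sigma_ml  # Xbar = X - Z * sigma

sigma = 1.5  # standard deviation of the equipment, in ml
for limit in (99.0, 198.0):
    setting = minimum_setting(limit, sigma)
    # Check: probability of exceeding the limit at that setting
    p_above = 1.0 - NormalDist(setting, sigma).cdf(limit)
    print(f"limit {limit} ml: setting {setting:.2f} ml, "
          f"P(X >= limit) = {p_above:.3f}")
```

The final check recomputes the probability of exceeding the limit at the chosen setting, which should come back to 95% for both bottle sizes.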