Because inferential statistics deduces the behavior of a population from a sample, it always incorporates a notion of confidence in its measurements.
Introduction
The crucial question of inferential statistics is: to what extent can we rely on values estimated from a mere sample? How sure can I be of my conclusions?
The answer is obtained by calculating confidence intervals. Let us state immediately what a confidence interval is not: it is not the interval in which the true value of the parameter lies with certainty. In fact, the random variable can theoretically take any value within the limits of the laws of physics. The confidence interval actually represents the range in which the true (and forever unknown) value of the parameter studied in the population "most likely" lies, with a probability that one chooses.
In practice, an interval is based on a confidence threshold, a margin of error and a margin coefficient. These elements depend on:
- The variability of the characteristics being measured
- The size of the sample: the bigger the sample, the greater the precision
- The sampling method chosen
Indicators used to calculate a confidence interval
The confidence threshold S
Also called confidence level or confidence rate, it represents the level of confidence that one wishes to guarantee for the measurement. For example, a confidence threshold of 90% means accepting a 10% risk of being wrong. In general, the usual practice is to choose a confidence threshold of 95%.
As a consequence, the higher the confidence threshold (and hence the margin coefficient, see below), the larger the sample size needed to reach the same precision.
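The link between confidence threshold and sample size can be illustrated with a minimal Python sketch. It assumes the normal law applies and that the standard deviation is known, and it uses the usual relation margin of error = margin coefficient * standard deviation / sqrt(n), solved for n. The function name `required_sample_size` and the figures (standard deviation 30, margin of error 5) are illustrative assumptions, not values from the text.

```python
import math
from statistics import NormalDist


def required_sample_size(confidence, std_dev, margin_of_error):
    """Smallest n whose margin of error is at most `margin_of_error`.

    Assumes the normal law applies and that `std_dev` is known
    (both are simplifying assumptions for this sketch).
    """
    # Two-sided margin coefficient for the chosen confidence threshold.
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((z * std_dev / margin_of_error) ** 2)


for confidence in (0.90, 0.95, 0.99):
    n = required_sample_size(confidence, std_dev=30, margin_of_error=5)
    print(f"{confidence:.0%} -> n = {n}")
```

With everything else held fixed, the required n grows as the confidence threshold rises.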
The margin coefficient
The margin coefficient is derived directly from the confidence threshold, using the table of the normal law (if n > 30) or the table of Student's law (if n < 30). The table below gives the coefficient for the most common confidence thresholds.
Confidence rate S | Margin coefficient if n > 30 |
80% | 1.28 |
85% | 1.44 |
90% | 1.645 |
95% | 1.96 |
96% | 2.05 |
98% | 2.33 |
99% | 2.575 |
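For confidence rates that are not in the table, the normal-law coefficient can be computed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `margin_coefficient` is an assumption made for illustration. Its results should agree with the table up to rounding in the last digit.

```python
from statistics import NormalDist


def margin_coefficient(confidence):
    """Two-sided margin coefficient of the normal law for a confidence rate."""
    # A confidence rate S leaves (1 - S) / 2 in each tail of the distribution.
    return NormalDist().inv_cdf((1 + confidence) / 2)


for rate in (0.80, 0.85, 0.90, 0.95, 0.96, 0.98, 0.99):
    print(f"{rate:.0%} | {margin_coefficient(rate):.3f}")
```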
Confidence interval for a mean
If we want to estimate the mean of a population from a sample, we compute a confidence interval. It tells us how confident we can be that the population mean lies within the interval calculated from the sample. The calculation of the confidence interval depends on the size of the sample and on the law that the variable follows. In principle, the formula is as follows, where the standard error of the mean is the standard deviation divided by the square root of the sample size:
- the lower limit of the interval = sample mean - margin coefficient * standard error of the mean
- the upper limit of the interval = sample mean + margin coefficient * standard error of the mean
The value of the margin coefficient depends on the size of the sample:
- n > 30: margin coefficient of the normal law (called Z)
- n < 30: margin coefficient of Student's law (called T) with n - 1 degrees of freedom
Example
A bulb manufacturer wants to study the lifetime of its products. To do so, it burns out 25 bulbs (n < 30, so the margin coefficient is taken from Student's law) and observes a normal distribution with a mean of 860 hours and a standard deviation of 30 hours. The standard error of the mean is 30 / sqrt(25) = 6 hours, and the Student coefficient for 24 degrees of freedom is about 2.064. At a confidence level of 95%, the following interval is deduced:
- Lower limit: 847.6 hours
- Upper limit: 872.4 hours
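The bulb example can be reproduced with a short Python sketch. The function name `mean_confidence_interval` is an assumption made for illustration, and the Student coefficient 2.064 (95%, 24 degrees of freedom) is hard-coded from a standard table rather than computed, since the Python standard library has no Student distribution.

```python
import math


def mean_confidence_interval(sample_mean, sample_std, n, coefficient):
    """Return (lower, upper) limits for the population mean."""
    standard_error = sample_std / math.sqrt(n)
    margin = coefficient * standard_error
    return sample_mean - margin, sample_mean + margin


# 2.064 is the Student coefficient for 95% and 24 degrees of freedom,
# read from a standard table (hard-coded here as an assumption).
lower, upper = mean_confidence_interval(860, 30, 25, coefficient=2.064)
print(f"Lower limit: {lower:.1f}")
print(f"Upper limit: {upper:.1f}")
```

For a sample of more than 30 items, the same function can be called with the normal-law coefficient instead, for example 1.96 at 95%.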
Confidence interval for a proportion
We want to estimate the proportion of defective parts in a production run from a sample. To do so, we compute a confidence interval from the sample values. In principle, the formula is as follows, where p is the frequency observed in the sample:
- the lower limit of the interval = observed frequency p - margin coefficient * standard error of a proportion
- the upper limit of the interval = observed frequency p + margin coefficient * standard error of a proportion
The value of the margin coefficient depends on the size of the sample:
- n * p > 5: we take the margin coefficient of the normal law (called Z)
- n * p < 5: we take the margin coefficient of the Poisson law (called μ)
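For the normal-law case (n * p > 5), the calculation can be sketched in Python as follows. The sketch assumes the usual standard error of a proportion, sqrt(p * (1 - p) / n). The function name `proportion_confidence_interval` and the sample (40 defective parts out of 500) are hypothetical and chosen only for illustration. The Poisson case is not covered here.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(defective, n, confidence=0.95):
    """Normal-law interval for a proportion (intended for n * p > 5)."""
    p = defective / n
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin


# Hypothetical sample: 40 defective parts out of 500, so n * p = 40 > 5.
lower, upper = proportion_confidence_interval(defective=40, n=500)
print(f"{lower:.3f} to {upper:.3f}")
```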
Confidence interval for a standard deviation
We want to estimate the standard deviation of the machining diameter of a production run from a sample. To do so, we compute a confidence interval from the sample values. In principle, the formula is as follows:
- the lower limit of the interval = standard deviation of the sample - margin coefficient * standard error of the standard deviation
- the upper limit of the interval = standard deviation of the sample + margin coefficient * standard error of the standard deviation
The value of the margin coefficient depends on the size of the sample:
- n > 30: we take the margin coefficient of the normal law (called Z)
- n < 30: we take the margin coefficient of Student's law (called T)
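For the large-sample case (n > 30), the calculation can be sketched in Python as follows. The text does not give a formula for the standard error of the standard deviation, so the sketch assumes the common large-sample approximation s / sqrt(2n), which holds for roughly normal data. The function name `std_dev_confidence_interval` and the sample (50 parts, standard deviation 0.2 mm) are hypothetical.

```python
import math
from statistics import NormalDist


def std_dev_confidence_interval(sample_std, n, confidence=0.95):
    """Approximate interval for a standard deviation (large n, normal data)."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # Large-sample approximation of the standard error of a standard deviation.
    standard_error = sample_std / math.sqrt(2 * n)
    margin = z * standard_error
    return sample_std - margin, sample_std + margin


# Hypothetical sample: 50 machined parts with a standard deviation of 0.2 mm.
lower, upper = std_dev_confidence_interval(sample_std=0.2, n=50)
print(f"{lower:.4f} mm to {upper:.4f} mm")
```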