The McNemar test is used to compare 2 paired populations whose values can only be 0 or 1: 0 indicating the absence of a characteristic, 1 its presence.

Introduction

The test was created by Dr. Quinn McNemar in 1947. At the time, he developed it in the context of genetic tests on transmission disequilibrium (the study of non-random associations of alleles at two or more loci on the same chromosome)1.

The McNemar test is a non-parametric test whose purpose is to compare 2 populations that can only take the values 0 or 1: 0 indicating the absence of a characteristic, 1 its presence. The 2 populations being compared are made up of the same individuals (a test on paired data). The McNemar test is particularly used for "before – after" paired data2.

The principle

Suppose we want to compare the occurrence of an event at two different times in the same population of n individuals:

  1. In a first step, we measure the number of occurrences of the event of interest.
  2. In a second step, we repeat this measurement on the same individuals in order to compare the results.
                        After
                  0           1           Total
Before   0        A           B           A + B
         1        C           D           C + D
Total             A + C       B + D       n
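
As an illustration, here is a minimal Python sketch (with made-up, purely hypothetical before/after vectors) of how the four cells A, B, C and D of this table can be tallied:

# Hypothetical paired binary data: 1 = event present, 0 = event absent
before = [0, 1, 1, 0, 1, 0, 1, 1]
after  = [0, 0, 1, 1, 0, 0, 0, 1]

pairs = list(zip(before, after))
A = sum(1 for b, a in pairs if b == 0 and a == 0)   # absent both times
B = sum(1 for b, a in pairs if b == 0 and a == 1)   # appeared in the second measurement
C = sum(1 for b, a in pairs if b == 1 and a == 0)   # disappeared in the second measurement
D = sum(1 for b, a in pairs if b == 1 and a == 1)   # present both times
n = A + B + C + D
print(A, B, C, D, n)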

Step 1 – Hypotheses

Let π1 be the probability of occurrence of the event before, and π2 the probability after. Since this test is only two-sided, the test hypotheses are:

H0: π1 = π2 : The probabilities of the event are the same

H1: π1 ≠ π2 : The probabilities of the event are different

Step 2 – Collect the data

The test is based on a contingency table.

Example:

We produce cough syrups. We currently have a major problem with wrinkled labels. After investigation, we know that a certain defect in the shape of the vial is the origin of the problem. Not knowing how to make production sufficiently reliable at the vial level, we want to study the possibility of improving our equipment so that it absorbs the defect and eradicates the problem.

We have selected 40 vials that were manufactured in "problem" moulds. We set up 2 tests:

  • A first test with the current settings of the equipment.
  • A second test with the same vials, which we de-labelled and fed back into the equipment after it had been improved.

The following contingency table is built:

                              After
                   Without defect   With defect   Total
Before  Without defect    4              0            4
        With defect      35              1           36
Total                    39              1           40

The table reads as follows:

  • During the first test, 36 vials had the defect.
  • Only 1 of these 36 defective vials still has the defect in the second test.
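
This reading can be checked in Python by encoding the table above (rows = before, columns = after, in the order without defect / with defect):

# Contingency table from the example
table = [[4, 0],
         [35, 1]]

defective_before = sum(table[1])                   # 35 + 1 = 36 defective vials in the first test
still_defective  = table[1][1]                     # 1 vial still defective in the second test
n                = sum(table[0]) + sum(table[1])   # 40 vials in total
print(defective_before, still_defective, n)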

Step 3 – Practical value

The test statistic compares the number of occurrences of the event between the before and after situations. The statistic is as follows:

χ² = (B − C)² / (B + C)

Note that if B + C is < 30, the Yates correction3 is applied to improve the approximation. In this case the statistic is as follows:

χ² = (|B − C| − 1)² / (B + C)

In our case, the calculation is as follows:

χ² = (0 − 35)² / (0 + 35) = 35
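
A short Python sketch of this calculation, applying the Yates correction only when B + C < 30 as indicated above:

def mcnemar_statistic(B, C):
    """McNemar chi-squared statistic, Yates-corrected when B + C < 30."""
    if B + C < 30:
        return (abs(B - C) - 1) ** 2 / (B + C)
    return (B - C) ** 2 / (B + C)

# Example from the text: B = 0 (defect appeared), C = 35 (defect disappeared)
print(mcnemar_statistic(0, 35))   # 35.0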

Step 4 – Critical value

The practical value is compared to the critical value, which is read from the χ² distribution with 1 degree of freedom.

It is determined either by looking it up directly in the χ² table, or in an Excel spreadsheet with the function CHIINV(risk α; dof).

In our case, for a risk α of 5%, the critical value is 3.8415.
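
Outside Excel, the same critical value can be obtained, for example, with SciPy (assuming a risk α of 5%):

from scipy.stats import chi2

alpha = 0.05
critical_value = chi2.ppf(1 - alpha, df=1)   # quantile of the chi-squared law with 1 dof
print(round(critical_value, 4))              # 3.8415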

Step 5 – The P-Value

The p-value of the test allows us to conclude definitively. It follows a χ² distribution and is calculated in Excel using the formula:

CHIDIST(practical value; 1)

In our case, this value is 0.0000000033.

This value tells us that we have only a 0.00000033% chance of making a Type I error (the risk of seeing a difference when there is none).
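
Equivalently, the p-value can be computed from the practical value with SciPy:

from scipy.stats import chi2

practical_value = 35
p_value = chi2.sf(practical_value, df=1)   # upper tail of the chi-squared law with 1 dof
print(p_value)                             # about 3.3e-09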

Step 6 – Interpretation

Result                               Statistical conclusion   Practical conclusion
Practical value ≥ Critical value     We reject H0             Our 2 sets of values are statistically different at the given level of risk α.
Practical value < Critical value     We retain H0             Our 2 sets of values are statistically identical or close at the given level of risk α.

Result          Statistical conclusion   Practical conclusion
p-value > α     We retain H0             Our data series are identical or close, with a risk of being wrong of p-value %.
p-value < α     We reject H0             Our data series are statistically different, with a risk of being wrong of p-value %.
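
These decision rules can be cross-checked in Python; as a sketch, the ready-made implementation in statsmodels (here using the uncorrected chi-squared version of the test) reproduces the statistic and p-value of the example:

from statsmodels.stats.contingency_tables import mcnemar

table = [[4, 0],
         [35, 1]]
alpha = 0.05

result = mcnemar(table, exact=False, correction=False)   # chi-squared version, no Yates correction
print(result.statistic, result.pvalue)                   # 35.0 and about 3.3e-09

if result.pvalue < alpha:
    print("We reject H0: the two series are statistically different.")
else:
    print("We retain H0: the two series are statistically identical or close.")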

Source

1 – Q. McNemar (1947) – Note on the sampling error of the difference between correlated proportions or percentages.

2 – J. Rice (1995) – Mathematical Statistics and Data Analysis.

3 – S. Siegel, N. J. Castellan (1988) – Nonparametric Statistics for the Behavioral Sciences.
