
## Introduction

Kendall's Tau, also called the Kendall rank correlation coefficient and denoted τ, is a non-parametric correlation measure. It is used to determine whether there is a relationship between two sets of data.

Maurice Kendall, an English statistician, developed this correlation coefficient in 1938, building on the work of G. Fechner (a German philosopher and physicist). The challenge was to define a correlation calculation for data that do not meet parametric assumptions.

## The principle

The test first sorts the values of the variable X1 in ascending order. We then count the number of times the corresponding values of the variable X2 also increase, or not. If the X2 values also increase, the correlation is positive. If the X2 values all decrease, the correlation is negative. Finally, if the X2 values neither increase nor decrease consistently, there is no correlation.

## Step 1: Hypotheses

Kendall's Tau can be run as a bilateral (two-sided) or unilateral (one-sided) test. The hypotheses are:

For a bilateral case:

• H0: X and Y are mutually independent; there is no correlation.
• H1: X and Y are dependent; there is a correlation.

For a right unilateral case:

• H0: X and Y are mutually independent; there is no correlation.
• H1: X and Y are dependent; there is a positive correlation.

For a left unilateral case:

• H0: X and Y are mutually independent; there is no correlation.
• H1: X and Y are dependent; there is a negative correlation.

## Step 2: Compute the concordant and discordant pairs

The numbers of concordant and discordant pairs of values are counted. To do this, the values of one of the two variables are sorted in ascending order, each one staying associated with its value of the second variable.

In the example, the variable X1 is in ascending order, and the data of the variable X2 are ordered according to X1. So when X1 = 1, X2 = 31.

To calculate the number of concordant pairs, we count the pairs of values that do not need to be re-sorted for the X2 values to be in increasing (or decreasing) order.

For example, for the couple 11/11, the other 9 couples (12/12…) are already in order.

Discordant couples are counted the other way round: they are the pairs that would need to be re-sorted.
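The counting rule can be sketched in Python as follows. This is a minimal illustration, not the article's own worksheet: the function name `count_pairs` and the sample data are hypothetical, and pairs that are tied on either variable are counted separately rather than as concordant or discordant.

```python
from itertools import combinations

def count_pairs(x, y):
    """Return (concordant, discordant, tied) counts over all pairs of observations."""
    concordant = discordant = tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        # Same sign of the two differences: both variables move in the same direction.
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        else:
            tied += 1  # at least one of the two variables has equal values
    return concordant, discordant, tied

# Hypothetical data, not taken from the article's example
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(count_pairs(x, y))  # (8, 2, 0)
```

Comparing every pair directly avoids the explicit sorting step; the counts are the same, because sorting on the first variable only changes the order in which the pairs are visited.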

## Step 3: Calculation of Kendall’s Tau

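The formula for Kendall's Tau depends on whether our variables contain duplicate (tied) values. Depending on the case, we have:

In case there is no duplicate:

$$\tau = \frac{CC - CD}{\frac{1}{2} N (N - 1)}$$

In case there are duplicates:

$$\tau = \frac{CC - CD}{\sqrt{\left(\frac{1}{2} N (N - 1) - n_1\right)\left(\frac{1}{2} N (N - 1) - n_2\right)}}, \qquad n_j = \sum_i \frac{k_i (k_i - 1)}{2}$$

With:

• CC: Total number of concordant couples
• CD: Total number of discordant couples
• N: Total number of pairs of values (observations)
• k_i: The number of times the value X_i appears in the variable considered
• n_1 and n_2: Adjustment coefficients for the ties (ex-aequo) in variable 1 and variable 2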

The closer the Tau is to 1 or -1, the stronger the correlation. We consider that between 0.7 and 1 we have a positive correlation, and that between -0.7 and -1 we have a negative correlation.
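As a hedged sketch of this step, the function below computes the Tau with the usual tie-adjusted form (often called tau-b), which reduces to the no-duplicate formula when both adjustment terms are zero. The function name `kendall_tau` and the sample data are hypothetical, and the tie adjustment n = Σ k(k - 1)/2 is assumed to be the one intended by the definitions above.

```python
import math
from collections import Counter
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's Tau with tie adjustment (assumed tau-b form)."""
    n = len(x)
    cc = cd = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x2 - x1) * (y2 - y1)
        if s > 0:
            cc += 1
        elif s < 0:
            cd += 1
    total = n * (n - 1) / 2  # number of possible pairs of observations
    # Tie adjustment: k is how many times each distinct value appears.
    n1 = sum(k * (k - 1) / 2 for k in Counter(x).values())
    n2 = sum(k * (k - 1) / 2 for k in Counter(y).values())
    # Raises ZeroDivisionError if one variable is constant.
    return (cc - cd) / math.sqrt((total - n1) * (total - n2))

# Hypothetical data: no duplicates, then one duplicate in each variable
print(kendall_tau([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
print(kendall_tau([1, 2, 2, 3], [1, 2, 3, 3]))
```

With the first data set, 8 concordant and 2 discordant pairs out of 10 give a Tau of 0.6; with the second, 4 concordant pairs and an adjusted denominator of 5 give 0.8.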

## Step 4: Calculating the practical value

Under H0, Kendall's Tau approximately follows a normal distribution with mean zero. The practical value is:

Practical value = τ/σ

With:

• τ: Kendall's Tau
• σ: Standard deviation of the distribution of Kendall's Tau, which is calculated in the following way:

$$\sigma = \sqrt{\frac{2 (2N + 5)}{9 N (N - 1)}}$$

where N is the total number of pairs of values.

The closer the Tau is to 1 or -1, the stronger the correlation. A Tau between 0.7 and 1 is considered a strong positive correlation, and a Tau between -0.7 and -1 a strong negative correlation.
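The practical value can be sketched as below. The sketch assumes the usual no-ties approximation for the standard deviation, σ = sqrt(2(2N + 5) / (9N(N - 1))), with N the number of pairs of values; the function name `practical_value` and the inputs are hypothetical.

```python
import math

def practical_value(tau, n):
    """Return tau / sigma, using the assumed no-ties standard deviation of tau under H0."""
    sigma = math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return tau / sigma

# Hypothetical inputs: tau = 0.6 observed on 12 pairs of values
print(practical_value(0.6, 12))
```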

## Step 5: The critical value

For samples of fewer than 10 pairs of values, Kendall's exact table is used. Beyond that, the approximation given by the normal distribution is sufficiently accurate. For this, we use the Excel function NORM.S.INV.

The probability passed to the function depends on the test direction:

• Bilateral: 1 - α/2
• Left unilateral: α
• Right unilateral: 1 - α
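Outside Excel, the same critical values can be obtained from the inverse of the standard normal distribution. The sketch below uses `statistics.NormalDist().inv_cdf` from the Python standard library as a stand-in for NORM.S.INV; the function name `critical_value` and the direction labels are hypothetical.

```python
from statistics import NormalDist

def critical_value(alpha, direction="bilateral"):
    """Standard normal quantile for the given risk level and test direction."""
    probabilities = {
        "bilateral": 1 - alpha / 2,
        "left": alpha,
        "right": 1 - alpha,
    }
    return NormalDist().inv_cdf(probabilities[direction])

for direction in ("bilateral", "left", "right"):
    print(direction, round(critical_value(0.05, direction), 3))
```

For α = 0.05 this gives about 1.960 for the bilateral test, -1.645 for the left unilateral test and 1.645 for the right unilateral test.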

## Step 6: p-Value

The p-value is used to evaluate the risk level of the test. Since the practical value approximately follows the standard normal distribution under H0, the p-value for a bilateral test is obtained in Excel via the formula:

P-Value = 2 * (1 - NORM.S.DIST(ABS(practical value), TRUE))
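The same calculation can be sketched in Python, using the cumulative standard normal distribution from the standard library. The function name `two_sided_p_value` is hypothetical, and the formula covers the bilateral case only.

```python
from statistics import NormalDist

def two_sided_p_value(practical_value):
    """Bilateral p-value from the standard normal approximation."""
    return 2 * (1 - NormalDist().cdf(abs(practical_value)))

# Hypothetical practical value
print(round(two_sided_p_value(1.96), 3))  # 0.05
```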

## Step 7: Interpretation

| Test direction | Result | Statistical conclusion | Practical conclusion |
|---|---|---|---|
| Bilateral | Practical value ≤ Critical value and Practical value ≥ -Critical value | We retain H0 | There is no correlation between the 2 samples |
| Bilateral | Practical value > Critical value or Practical value < -Critical value | We reject H0 | There is a correlation between the 2 samples |
| Unilateral right | Practical value ≤ Critical value | We retain H0 | There is no positive correlation |
| Unilateral right | Practical value > Critical value | We reject H0 | There is a positive correlation between the 2 samples |
| Unilateral left | Practical value ≥ Critical value | We retain H0 | There is no negative correlation |
| Unilateral left | Practical value < Critical value | We reject H0 | There is a negative correlation between the 2 samples |

| Result | Statistical conclusion | Practical conclusion |
|---|---|---|
| p-value > α | We retain H0 | We conclude that our 2 data series have no correlation, with a risk of being wrong of p-value% |
| p-value < α | We reject H0 | Our 2 data series have a correlation, with a risk of being wrong of p-value% |
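The decision rule based on the critical value can be summarised in a short Python sketch. It assumes the normal approximation (10 or more pairs of values); the function name `reject_h0` and the direction labels are hypothetical.

```python
from statistics import NormalDist

def reject_h0(practical_value, alpha=0.05, direction="bilateral"):
    """Return True when H0 (no correlation) is rejected at risk level alpha."""
    z = NormalDist()
    if direction == "bilateral":
        return abs(practical_value) > z.inv_cdf(1 - alpha / 2)
    if direction == "right":
        return practical_value > z.inv_cdf(1 - alpha)
    if direction == "left":
        # The left critical value is negative.
        return practical_value < z.inv_cdf(alpha)
    raise ValueError("direction must be 'bilateral', 'right' or 'left'")

# Hypothetical practical values
print(reject_h0(2.5))                      # True
print(reject_h0(1.0))                      # False
print(reject_h0(-2.0, direction="left"))   # True
```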