Simple linear regression is used to identify a correlation between two variables and to quantify that relationship.

Introduction

Simple linear regression predicts the behavior of one variable from another. It is a powerful tool for analyzing cause-and-effect relationships.

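The goal is to fit a function y = f(x) of the form y = Ax + B, where:

  • A is the slope: $A = \sigma_{xy} / \sigma_x^2$, the covariance of x and y divided by the variance of x
  • B is the intercept: $B = \bar{y} - A \bar{x}$, where $\bar{x}$ and $\bar{y}$ are the sample means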
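As an illustration of the two definitions above, here is a minimal Python sketch that computes the slope and the intercept from the covariance and the variance. The function name `fit_line` and the data are assumptions made for the example; they do not come from the article.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = A*x + B."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Covariance of x and y, and variance of x. The 1/n factors cancel in
    # the ratio, so using 1/(n - 1) instead would give the same slope.
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - x_bar) ** 2 for xi in x) / n
    slope = cov_xy / var_x
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Illustrative data, not taken from the article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
print(f"y = {slope:.2f} * x + {intercept:.2f}")  # prints: y = 0.60 * x + 2.20
```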

Validate the significance of the result

To validate the significance of the result, several hypothesis tests and checks are carried out. They are detailed below.

Perform the test on the correlation coefficient

We perform a Student's t-test on the correlation coefficient R obtained when calculating the regression.

We state the hypotheses:

  • H0: R = 0
  • H1: R ≠ 0

The objective is to reject the H0 hypothesis, and thus to validate the fact that there is indeed a correlation.

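The practical value (the test statistic) and the theoretical value (the critical value) are calculated as follows, using the standard form of this test for n observations and a risk level α:

  • Practical value: $t = |R| \sqrt{n - 2} / \sqrt{1 - R^2}$
  • Critical value: $t_{\alpha/2,\, n-2}$, read from the Student table with n − 2 degrees of freedom

If the practical value is greater than the critical value, the result is significant and therefore unlikely to be due to chance. We reject H0 and retain H1.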
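The decision rule for the correlation coefficient can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `correlation_t_test` and the data are hypothetical, the statistic is the usual t = |R|·√(n − 2)/√(1 − R²), and the critical value is passed in by hand from a Student table rather than computed.

```python
import math


def correlation_t_test(x, y, t_critical):
    """Return (r, t, significant) for the test of H0: R = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    r = s_xy / math.sqrt(s_xx * s_yy)
    # The statistic is undefined when |r| == 1 (perfect linear relationship).
    t = abs(r) * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return r, t, t > t_critical


# Illustrative data, not taken from the article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
# 3.182 is the two-sided Student critical value for alpha = 0.05
# and n - 2 = 3 degrees of freedom, read from a table.
r, t, significant = correlation_t_test(x, y, t_critical=3.182)
print(f"r = {r:.4f}, t = {t:.4f}, reject H0: {significant}")
```

With only five points, a correlation of about 0.77 is not enough to reject H0 at the 5% level, which shows why the test matters for small samples.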

Perform a test on the slope of the regression line

Following the same principle, a Student's t-test checks whether the slope of the regression line is significantly different from zero.

We state the hypotheses:

  • H0: A = 0
  • H1: A ≠ 0

The objective is to reject the H0 hypothesis, and thus to validate the significance of the slope.

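The practical value and the theoretical (critical) value are calculated as follows, using the standard form of this test:

  • Practical value: $t = |A| / s_A$, where $s_A = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2 / (n - 2)}{\sum (x_i - \bar{x})^2}}$ is the standard error of the slope and $\hat{y}_i = A x_i + B$
  • Critical value: $t_{\alpha/2,\, n-2}$, read from the Student table with n − 2 degrees of freedom

If the practical value is greater than the critical value, the result is significant and unlikely to be due to chance. H0 is rejected and H1 is retained.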
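A minimal sketch of the slope test, assuming the usual standard error of the slope (residual variance with n − 2 degrees of freedom, divided by the sum of squared deviations of x). The function name `slope_t_test` and the data are hypothetical, and the critical value is again read from a table.

```python
import math


def slope_t_test(x, y, t_critical):
    """Return (slope, std_err, t, significant) for the test of H0: A = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    # Sum of squared residuals around the fitted line.
    sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / s_xx)
    t = abs(slope) / std_err
    return slope, std_err, t, t > t_critical


# Illustrative data, not taken from the article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
# 3.182: two-sided critical value for alpha = 0.05 and 3 degrees of freedom.
slope, std_err, t, significant = slope_t_test(x, y, t_critical=3.182)
print(f"A = {slope:.3f}, s_A = {std_err:.4f}, t = {t:.4f}, reject H0: {significant}")
```

In simple linear regression this statistic equals the one obtained from the test on the correlation coefficient, so both tests lead to the same decision.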

Calculate the confidence interval of the slope

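We calculate a confidence interval for the slope. This interval gives the range of plausible values for the slope, which helps to refine the equation and to build scenarios. In its standard form, it is calculated as follows: $A \pm t_{\alpha/2,\, n-2} \cdot s_A$, where $s_A$ is the standard error of the slope.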
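As a sketch, the interval is the slope plus or minus the critical value times the standard error of the slope. The function name `slope_confidence_interval` and the input values below are illustrative assumptions, not figures from the article.

```python
def slope_confidence_interval(slope, std_err, t_critical):
    """Return (low, high) bounds of the confidence interval for the slope."""
    margin = t_critical * std_err
    return slope - margin, slope + margin


# Illustrative values: slope 0.6, standard error 0.2828, and the two-sided
# critical value 3.182 (alpha = 0.05, 3 degrees of freedom, from a table).
low, high = slope_confidence_interval(slope=0.6, std_err=0.2828, t_critical=3.182)
print(f"95% confidence interval for the slope: [{low:.3f}, {high:.3f}]")
```

When the interval contains zero, as it does with these values, the slope is not significantly different from zero at the chosen risk level.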

Calculate the p-value

We calculate the p-value, which gives the probability of obtaining a result at least as extreme as the one observed if H0 were true. It is compared to the chosen risk level α and is always interpreted in the same way:

  • p-value < α: strong significance, H0 is rejected
  • p-value > α: weak significance, H0 is not rejected
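For a Student test, the two-sided p-value is the area in both tails of the t distribution beyond the observed statistic. The sketch below approximates it with the standard library only, by integrating the t density with Simpson's rule. The function names and the number of steps are assumptions for the example; in practice a statistics library would normally be used.

```python
import math


def t_pdf(u, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)


def two_sided_p_value(t, df, steps=2000):
    """Approximate P(|T| >= |t|) using Simpson's rule (steps must be even)."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total  # P(-t < T < t), using symmetry
    return max(0.0, 1.0 - central)


alpha = 0.05
# Illustrative statistic: t = 2.1213 with 3 degrees of freedom.
p = two_sided_p_value(2.1213, df=3)
print(f"p-value = {p:.3f}, significant at alpha = {alpha}: {p < alpha}")
```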

Calculate a partial correlation coefficient

A statistically significant result can still lead to a false conclusion. Other parameters, not considered in the study, may be at work without being detected. For example, there is a correlation between ice cream sales and fan sales. Yet there is no direct link between them: both depend on a third factor, heat.

To check this, a partial correlation coefficient is used. We choose another parameter z that we suspect is involved, calculate the pairwise correlation coefficients, and then calculate the partial correlation coefficient of x and y given z, using the standard formula: $r_{xy.z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}}$

It is interpreted as follows:

  • $r_{xy.z}$ close to $r_{xy}$: the third variable has no influence on the relationship between the first two
  • $r_{xy.z}$ different from $r_{xy}$: the third variable plays a part in the correlation
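The ice cream and fan example can be sketched in Python with the usual first-order partial correlation formula. The data are invented for the illustration and built so that heat explains the whole relationship; the function names are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def partial_correlation(x, y, z):
    """Correlation between x and y once the effect of z is removed."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented data: both sales series follow the temperature.
heat = [30, 30, 20, 20]
ice_cream = [130, 110, 90, 70]
fans = [65, 55, 35, 45]

r_xy = pearson_r(ice_cream, fans)
r_partial = partial_correlation(ice_cream, fans, heat)
print(f"r(ice cream, fans) = {r_xy:.2f}")
print(f"r(ice cream, fans | heat) = {r_partial:.2f}")
```

Here the raw correlation between the two sales series is 0.8, but it drops to essentially zero once heat is held constant, which is the signature of a relationship driven by a third factor.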

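As with the correlation coefficient, a Student's t-test is performed, this time on the partial correlation coefficient.

We state the hypotheses:

  • H0: $r_{xy.z}$ = 0
  • H1: $r_{xy.z}$ ≠ 0

Retaining H0 means that x and y are no longer correlated once the third variable is held constant, so the original correlation is explained by that third factor. Rejecting H0 means that the correlation between x and y persists after the third variable is taken into account.

The practical value and the critical value are calculated as follows, using the standard form of this test:

  • Practical value: $t = |r_{xy.z}| \sqrt{n - 3} / \sqrt{1 - r_{xy.z}^2}$
  • Critical value: $t_{\alpha/2,\, n-3}$, read from the Student table with n − 3 degrees of freedom

Interpretation

  • Practical value > critical value: H0 is rejected. The correlation between x and y remains significant once the third factor is taken into account, so it is not explained by that factor alone.
  • Practical value < critical value: H0 is retained. There is no evidence of a correlation between x and y once the third factor is taken into account, so the observed correlation is likely due to that factor.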
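A minimal sketch of the test on a partial correlation coefficient, assuming the usual statistic t = |r|·√(n − 3)/√(1 − r²) with n − 3 degrees of freedom, since one extra variable is controlled. The function name, the coefficient, and the sample size are illustrative assumptions.

```python
import math


def partial_correlation_t_test(r_partial, n, t_critical):
    """Return (t, significant) for H0: the partial correlation is zero."""
    # One degree of freedom is lost for the controlled variable: df = n - 3.
    t = abs(r_partial) * math.sqrt(n - 3) / math.sqrt(1 - r_partial ** 2)
    return t, t > t_critical


# Illustrative values: partial correlation 0.5 measured on 13 observations.
# 2.228 is the two-sided critical value for alpha = 0.05 and 10 degrees
# of freedom, read from a Student table.
t, significant = partial_correlation_t_test(r_partial=0.5, n=13, t_critical=2.228)
print(f"t = {t:.4f}, reject H0: {significant}")
```

In this example H0 is not rejected, so once the third variable is controlled there is no evidence that the first two are still correlated.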

Source

D. N. Gujarati (2004)-Econometrics

R. Rafiq (2012)-Correlation analysis

J. Perch (2012)-Correlation and simple linear regression

R. Vuillet, J. J. Daudin, S. Robin (2001)-Inferential statistics

S. Robin (2007)-Simple linear regression
