Design of Experiments (DOE) is used to structure the collection of information in order to assess the links between a variable to be explained, Y, and one or more explanatory variables, X.

## Introduction

Design of Experiments structures the collection of information in order to evaluate the links between a variable to be explained, Y, called the response, and one or more explanatory variables, X, called factors (or predictor variables, according to the ISO 3534-3 standard). These links may ultimately reveal a causal relationship, or an optimum that helps design a better product.

Mathematically speaking, this is nothing more than regression. What sets Design of Experiments apart is that several factors are varied simultaneously, which makes it possible to:

• reduce the number of trials needed to make a decision, which saves time and money, or simply makes it feasible to identify the links between responses and factors.
• take into account the interactions between the different factors when building the prediction model.
• identify the influential factors and quantify their effects on the response.

It should also be noted that a design of experiments consists of N experiments that are all:

• determined a priori.
• necessarily feasible.
• independent of each other.

## History

• In the 17th century: the basic principles were stated by Francis Bacon. For example, he steeped wheat grains in different concoctions to study their effect on the germination rate.
• In the 18th century: the scientists of the time explored the subject further. Among them were Antoine-Laurent de Lavoisier, Arthur Young and Johann Georg Von Zimmermann, from whose text the following is taken: « An experiment differs from a mere observation in that the knowledge an observation gives us seems to present itself of its own accord, whereas the knowledge an experiment provides is the fruit of some attempt made with the aim of seeing whether a thing is or is not » [1].
• In the 19th century: statistics were introduced into the planning of experiments and the analysis of results.

The starting point of modern experimentation is usually dated to the 1920s, with the work of the British mathematician Sir Ronald Fisher. His work, developed in the field of agronomy (he wanted to increase agricultural yields by combining various types of fertilizer, land, varieties and methods of cultivation…), was then refined and deployed throughout the century, especially after the Second World War. Many publications appeared: statisticians such as Yates, Cochran, Plackett and Burman enriched the methods [2, 3, 4].

But at the time, these complex methods could only be applied by specialist statisticians. From the 1950s onward, as part of the drive to improve quality, the Japanese gave the field new momentum. Taguchi and Masuyama developed tables that simplify the implementation of designs of experiments and suit the majority of industrial problems [5].

These different ways of planning experiments were proposed at various times, in different countries and in different disciplines. Statisticians speak of Latin squares and factorial designs. Chemists more readily use Plackett and Burman designs. Quality practitioners and Six Sigma specialists swear by Taguchi tables.

Today, with the widespread availability of computer tools, designs of experiments are increasingly used and new methods continue to appear.

## Step 1: Defining the study problem

The first step is to define the subject of the study. It is expressed as a study problem, which can be very diverse:

• Malfunction of a product or service.
• Identification of the settings to obtain the best performance of a machine.
• Definition of the best mix to get a product.

As for the variable to be explained, the response can be of any type (a speed, a quantity, a percentage…), but it must be defined precisely:

• Who is doing the measurement?
• What do we measure?
• Where do we measure it?
• When do we measure it?
• How do we measure it?
• How many times do we measure it, and in which unit?

## Step 2: Defining the factors

### 2.1-List of all factors

We need to identify, a priori, the parameters that may be responsible for variations in the response. This exhaustive inventory should be drawn up as a group, using tools such as brainstorming, 5M or FMEA.

### 2.2-The choice of factors

The number of experiments and the quality of the model depend directly on the number of factors taken into account. Factors without influence should therefore be removed from the outset. Several techniques are possible:

• Remove "a priori" the factors that seem unlikely: for example, in a 5M analysis of the causes of a problem, this means removing the unlikely causes, or those we are certain cannot influence the problem.
• Use the traditional approach: vary one factor at a time, the others being held fixed, simply to see whether a factor has any influence on the response.
• Remove "uncontrollable" factors: if a potential factor is difficult to measure, or if in practice we will not be able to set its level, there is no point in including it. We simply record that, for these trials, the factor was at (or about) a given level. This is called a control variable.
• Use a selection grid: the factors can be prioritized according to criteria such as the assumed level of influence, the feasibility, the need for information…

#### How to integrate a qualitative variable

As in any statistical study, most of the complexity lies in the treatment of qualitative variables. Several solutions exist:

• Transform the qualitative factor into a quantitative one: color is the typical example. A factor with 3 modalities (blue, yellow, violet) can be made quantitative by expressing it as wavelengths (446-520 nm, 565-590 nm and 380-446 nm, respectively).
• Treat the qualitative factor as a "parameter": it does not enter the mathematical model, but it defines groups of tests whose results can then be compared.
• Use only the screening technique: described below, it considers only 2 modalities, which is already enough to "clear the ground".

### 2.3-Choosing interactions

Depending on prior estimates, tests and experience, we decide whether or not to include interactions in the model. The fewer interactions there are to study, the smaller the design of experiments will be. But what is an interaction?

There is an interaction between two factors if the average effect of one is not the same depending on whether the other is at its low level or at its high level.

For example:

| Experiment number | X1 | X2 | Response Y |
|---|---|---|---|
| 1 | -1 | -1 | 60 |
| 2 | +1 | -1 | 85 |
| 3 | -1 | +1 | 75 |
| 4 | +1 | +1 | 90 |

When factor X1 is at +1, the effect of X2 is 2.5 ((90-85)/2). When X1 is at -1, the effect of X2 is 7.5 ((75-60)/2). These two numbers are different, so the two factors interact. We denote this interaction x1x2.

An interaction between 2 factors is said to be of order 2, an interaction between 3 factors of order 3, and so on.

In what follows, the interaction between two factors x1 and x2 is treated as a new factor, denoted x1x2.
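The comparison above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `responses` and `effect_of_x2` are assumptions made for the example, and the final line uses the usual convention that the interaction coefficient is half the difference between the two conditional effects.

```python
# Responses of the four runs of the example table, keyed by coded levels (x1, x2).
responses = {(-1, -1): 60, (+1, -1): 85, (-1, +1): 75, (+1, +1): 90}


def effect_of_x2(x1_level):
    """Effect of X2 at a fixed X1 level: half the change in Y from X2 = -1 to X2 = +1."""
    return (responses[(x1_level, +1)] - responses[(x1_level, -1)]) / 2


effect_high = effect_of_x2(+1)  # (90 - 85) / 2
effect_low = effect_of_x2(-1)   # (75 - 60) / 2

# Half the difference between the two conditional effects estimates
# the coefficient of the x1x2 interaction (assumed convention).
interaction = (effect_high - effect_low) / 2

print(effect_high, effect_low, interaction)  # 2.5 7.5 -2.5
```

A non-zero value of `interaction` is what signals that the effect of one factor depends on the level of the other.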

## Step 3: The choice of the experimental field

Each factor must be defined with different "levels". This is also an essential input for calculating the number of experiments: the more levels the factors have (for example 1, 5, 10), the more experiments are needed. For a complete design, the number of experiments is:

N = (number of levels of factor 1) × (number of levels of factor 2) × …
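As a quick check of this formula, the sketch below multiplies the level counts and, for comparison, enumerates every combination explicitly. The function name `full_factorial_size` and the two level lists are assumptions chosen for illustration.

```python
from itertools import product
from math import prod


def full_factorial_size(levels_per_factor):
    """Number of runs of a complete design: the product of the level counts."""
    return prod(levels_per_factor)


print(full_factorial_size([2, 2, 2]))  # 8 runs for 3 factors at 2 levels
print(full_factorial_size([3, 2, 4]))  # 24 runs for factors with 3, 2 and 4 levels

# Enumerating the combinations gives the same count.
all_runs = list(product(range(3), range(2), range(4)))
print(len(all_runs))  # 24
```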

### 3.1-Definition of factor levels

For 3 factors at 2 levels (one level is a treatment), a complete design requires 2³ experiments, i.e. 8 experiments. For 3 factors with 3, 2 and 4 levels respectively, we have 24 experiments. The number of levels depends on the precision we want, but above all on the type of data:

• Quantitative factor: generally, two levels are chosen, corresponding to the lower and upper bounds. If we want more precision, or have many doubts, we take more levels.
• Qualitative factor: the number of levels equals the number of modalities the factor can take. We should nevertheless try to limit them, because the number of experiments can increase very quickly.

The levels are chosen according to:

• The realities of the system: there is no point in taking values that we know cannot be reached in practice.
• Knowledge of the system: based on prior knowledge of the system, the field of study can be defined according to the desired changes in the responses.

This set of values is called the experimental field, and is presented as a table:

| Level | Factor 1 | Factor 2 |
|---|---|---|
| -1 | 2 bar | 50 °C |
| +1 | 10 bar | 80 °C |

### 3.2-Setting parameters

Parameters play the same role as factors, but they are not taken into account when defining the design of experiments, and therefore do not appear in the mathematical expressions of the models. Parameters are variables that take a finite number of distinct values. Often they simply serve as a "reminder" that the tests were carried out under a particular condition (ambient temperature…).

In practice, parameters make it possible to include qualitative factors. For example, suppose the factor is whether or not a part is fitted to a machine, which may influence the machine's performance. We define the different settings of the machine, then run 2 sets of tests: one with the part, the other without it. We then compare the 2 sets of results.

When more than one parameter is defined in the same study, we must consider the combinations of parameters, that is, all possible combinations of the values of each parameter. Their number can quickly become large. Parameters should therefore be used with care, because they are extremely costly: in typical use, the calculations for the design of experiments are carried out for each of these combinations.

### 3.3-Special constraints

Two types of constraints affect the construction of a design of experiments:

• Prohibited trials: physical incompatibilities, or any other reason, may prevent a test from being carried out.
• Mandatory trials: if we know that a test is highly representative, it must be planned. Likewise, tests that have already been carried out can be integrated, or their results compared with those of the new tests.

These constraints are handled directly only when they simply restrict the experimental field. Otherwise, design techniques based on optimality criteria are required. The most common criteria are D-optimality, A-optimality, E-optimality, G-optimality and J-optimality.

## Step 4: Choose the model

The choice of the type of design depends on the objective and on how well we know the system being studied. Note, however, that for any design that is not complete, the saving in the number of experiments comes at a price:

• It is no longer possible to calculate all the interactions between all the factors. This is not a fundamental problem if the only objective is to determine the relative influence of the factors on the response, without considering the interactions between them.
• The calculated effects are, in most cases, aliased. This means that they do not directly reflect the effect of a factor taken individually, but that of a set of factors and interactions. It is therefore sometimes impossible to conclude reliably on the effect of a factor, since any term of an alias may turn out to be influential. However, it is commonly accepted that the higher the order of an interaction, the less likely it is to be influential. It is therefore best to alias main effects with high-order interactions first, which leads to the concept of resolution.

### Step 4.1: "Clearing the ground" and validating the influence of factors

The screening technique (also called a screening design) is used here. The factors necessarily have 2 levels, and the choice of the type of design depends on the interactions we want to take into account. Note that for this type of design, the factors can be quantitative or qualitative.

| DOE type* | Nb of trials | Interactions taken into account | Accuracy of the result | Ease of construction | Ease of interpretation |
|---|---|---|---|---|---|
| Full factorial design | 0 | All | ++ | ++ | Very easy |
| Fractional factorial design | + | Depending on the chosen resolution | + | 0 | Requires knowledge of aliases |
| Taguchi | + | Only order 2 | + | + | |
| Plackett Burman | ++ | None | 0 | 0 | Easy |

* All of these designs are based on Hadamard matrices.

Mixtures are a special case: when the factors are the components of a mixture (i.e. the variables depend on each other), specific "mixture" designs based on Scheffé lattices must be used.

More broadly, screening ranks the factors against each other according to their influence. It helps us understand the system and retain only the factors of interest. When using this technique, it is best to include as many factors as possible, to avoid forgetting one.

### Step 4.2: Optimizing the Model

Generally used after screening, response surface methods serve to optimize the model. Only the factors found influential in the screening phase are kept, and they are evaluated with more levels and with quadratic effects (i.e. the non-proportional relationships that may exist between a factor and a response).

It should be noted that for this optimization phase, the factors are necessarily quantitative.

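| DOE type | Nb of trials | Interactions taken into account | Accuracy of the result | Ease of construction | Ease of interpretation | Nb of levels per factor |
|---|---|---|---|---|---|---|
| Doehlert | + | Yes | 0 | 0 | 0 | X |
| Central composite | - | Yes | + | + | + | Multiple of 3 or 5 |
| Box Behnken | + | Yes | 0 | 0 | 0 | Multiple of 3 |
| Taguchi | + | Only order 2 | + | + | + | Up to 5 |
| Full factorial | - | All | ++ | ++ | ++ | X |
| Fractional factorial | + | None | + | 0 | 0 | 2 |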

## Step 5: Build the Matrix of experiments

We can now build the matrix of experiments and responses. This table lists the experiments to be carried out and shows how the factors vary in each of them. It depends, of course, on all the choices made upstream. For example:

| Experiment number | X1 | X2 | Response Y |
|---|---|---|---|
| 1 | -1 | -1 | 60 |
| 2 | +1 | -1 | 85 |
| 3 | -1 | +1 | 75 |
| 4 | +1 | +1 | 90 |
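A matrix of this kind can be generated rather than typed by hand. The sketch below builds the coded runs of a complete two-level design in the same order as the table, and translates them into real settings. The factor names `pressure_bar` and `temperature_c` and the helper `decode` are assumptions for illustration, reusing the example experimental field of Step 3.

```python
from itertools import product

# Experimental field (assumed names): coded level -> real setting.
field = {
    "pressure_bar": {-1: 2, +1: 10},
    "temperature_c": {-1: 50, +1: 80},
}
factors = list(field)

# product() varies the last position fastest; reversing each tuple makes
# the first factor alternate fastest, as in the table above.
coded_runs = [
    tuple(reversed(levels)) for levels in product((-1, +1), repeat=len(factors))
]


def decode(run):
    """Translate one coded run into the real settings of each factor."""
    return {name: field[name][level] for name, level in zip(factors, run)}


for number, run in enumerate(coded_runs, start=1):
    print(number, run, decode(run))
```

The response column is left out on purpose: it is only filled in once the tests of Step 6 have been carried out.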

## Step 6: Perform the tests

### 6.1 The sample size

For each test, we define the number of samples to be measured. In mass production, it is generally accepted that 30 samples are used per test, and the response is the average of these 30 samples.

### 6.2 The tests

The tests must be carried out in accordance with the planned experimental conditions. All parameters outside the design must be kept under control and recorded (ambient temperature…), and the personnel must be trained in the instructions.

In principle, the order of the tests is random. Indeed, since the conditions are supposed to be "controlled", the same results must be obtained whatever the order of the tests. This is why software proposes a random test order.

In practice, however, for reasons of feasibility, the experiments are often carried out in whatever order is most convenient. Typically, if changing factor x1 requires 2 hours of tuning while changing x2 takes 5 minutes, we will run all the experiments with x1 = -1 one after the other, so that x1 only has to be changed once. To verify that the tests are carried out under controlled conditions, one test can be repeated to check that its result matches the first one. If it does not, the design is not under control, and we must investigate, because the process is not robust.
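The two ordering strategies can be compared with a short sketch. The seed value, the run list and the helper `count_changes` are assumptions made for the example; x1 is taken to be the hard-to-change factor.

```python
import random

# Coded runs (x1, x2) of a complete two-level design, in standard order.
runs = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

# Random order; the fixed seed is an arbitrary choice that makes it reproducible.
rng = random.Random(42)
random_order = rng.sample(runs, k=len(runs))

# Practical order: group the runs by the hard-to-change factor x1.
grouped_order = sorted(runs, key=lambda run: run[0])


def count_changes(order, factor_index):
    """Number of times the given factor has to be re-set between consecutive runs."""
    return sum(
        1 for a, b in zip(order, order[1:]) if a[factor_index] != b[factor_index]
    )


print(grouped_order)                    # the two x1 = -1 runs come first
print(count_changes(runs, 0))           # 3 changes of x1 in standard order
print(count_changes(grouped_order, 0))  # 1 change of x1 once grouped
```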

## Step 7: Calculate the effects

The calculations rely on the same mathematical principles as multiple regression, with a model that depends on the relationship chosen initially. The models are:

| Phase | Model | Equation |
|---|---|---|
| Screening | With interactions (complete, Taguchi or fractional) | $Y = a + a_1 x_1 + a_2 x_2 + a_3 x_1 x_2 \dots$ |
| Screening | Without interactions (Plackett Burman) | $Y = a + a_1 x_1 + a_2 x_2 \dots$ |
| Optimization | With interactions | $Y = a + a_1 x_1 + a_2 x_1^2 + a_3 x_1^3 + a_4 x_2 + a_5 x_1 x_2 \dots$ |
| Optimization | Without interactions | $Y = a + a_1 x_1 + a_2 x_1^2 + a_3 x_1^3 + a_4 x_2 \dots$ |

### Step 7.1: Set up the effects matrix

Starting from the matrix of experiments built previously, the effects matrix is constructed. Its construction depends on the type of design chosen:

1. Add the first column of + 1 to the left of the matrix to take into account the constant a.
2. Add or not the columns of the interactions, according to a protocol identified by the different plans of experiments.
3. Follow the construction process proposed by each of the methods.

### Step 7.2: Calculating the effects

The calculation is the same as for a multiple regression. A simplified version of the classical, more complex matrix calculation is given just below; the results are the same:

1. The effects matrix is transposed.
2. The transposed matrix is multiplied by the vector of responses Y.
3. The result is divided by the number of tests n.

Even more simply, to "do it by hand", each coefficient estimate equals the sum of the responses Y, each given the sign found in the corresponding column of the effects matrix, divided by the number of experiments.

For example:

| Test | a | a1·x1 | a2·x2 | a12·x1x2 | Response Y |
|---|---|---|---|---|---|
| 1 | +1 | -1 | -1 | +1 | 60 |
| 2 | +1 | +1 | -1 | -1 | 65 |
| 3 | +1 | -1 | +1 | -1 | 75 |
| 4 | +1 | +1 | +1 | +1 | 85 |

We obtain:

• a = (+60 + 65 + 75 + 85)/4 = 71.25
• a1 = (-60 + 65 - 75 + 85)/4 = 3.75
• a2 = (-60 - 65 + 75 + 85)/4 = 8.75
• a12 = (+60 - 65 - 75 + 85)/4 = 1.25
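The three steps (transpose, multiply by Y, divide by n) can be reproduced with plain Python lists. This is a minimal sketch on the example above; the variable names are assumptions, and the shortcut of dividing by n is only valid for orthogonal two-level designs such as this one.

```python
# Effects matrix of the example: columns are a, a1 (x1), a2 (x2), a12 (x1x2).
effects_matrix = [
    [+1, -1, -1, +1],
    [+1, +1, -1, -1],
    [+1, -1, +1, -1],
    [+1, +1, +1, +1],
]
responses = [60, 65, 75, 85]
names = ["a", "a1", "a2", "a12"]
n = len(responses)

# Step 1: transpose, so that each entry is one column of signs.
transposed = list(zip(*effects_matrix))

# Steps 2 and 3: signed sum of the responses, divided by the number of tests.
coefficients = {
    name: sum(sign * y for sign, y in zip(column, responses)) / n
    for name, column in zip(names, transposed)
}

print(coefficients)  # {'a': 71.25, 'a1': 3.75, 'a2': 8.75, 'a12': 1.25}
```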

## Step 8: Calculate the coefficient of determination R2

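In the same way as for regression, the coefficient of determination is calculated. It indicates the share of the variation in the responses that the model explains. Using the standard definition, it is calculated in the following way:

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

where $y_i$ are the observed responses, $\hat{y}_i$ the responses estimated by the model and $\bar{y}$ the average of the observed responses.

Note that if the number of coefficients equals the number of tests, the model is purely descriptive and R² will be equal to 100%. To guard against this, we calculate the adjusted R² [6], whose standard expression is:

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - p}$$

with $n$ the number of tests and $p$ the number of coefficients in the model, constant included.

## Step 9: Graphical reading of the results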

It is customary to represent the effect of a factor by a straight line segment whose slope equals this effect. On the x-axis, the positions -1 and +1 are marked. On the y-axis, the average of the responses is plotted when the factor is at -1 and at +1. We then draw the line connecting the 2 points.

In the example plot, the effect of the left-hand factor is greater than that of the right-hand factor, since its slope is steeper. Note that if the lines are parallel, there is no interaction between the factors. Conversely, if the lines are not parallel, or if they intersect, there is a strong interaction.
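The points needed for such plots can be computed before drawing anything. The sketch below reuses the four runs of the Step 7 example (an assumption, since it is not the data of the plot discussed above) and derives the end points and slope of each main-effect segment, then the slopes of the interaction lines.

```python
# Runs from the Step 7 example: (x1, x2, response Y).
runs = [(-1, -1, 60), (+1, -1, 65), (-1, +1, 75), (+1, +1, 85)]


def mean_response(factor_index, level):
    """Average response over the runs where the given factor is at `level`."""
    values = [run[2] for run in runs if run[factor_index] == level]
    return sum(values) / len(values)


# End points of each main-effect segment: (mean at -1, mean at +1).
main_effect_points = {
    name: (mean_response(index, -1), mean_response(index, +1))
    for index, name in enumerate(["x1", "x2"])
}

# The slope between x = -1 and x = +1 equals the effect of the factor.
slopes = {name: (high - low) / 2 for name, (low, high) in main_effect_points.items()}

# Interaction plot: response versus x1, with one line per level of x2.
response_at = {(x1, x2): y for x1, x2, y in runs}
interaction_slopes = {
    x2: (response_at[(+1, x2)] - response_at[(-1, x2)]) / 2 for x2 in (-1, +1)
}
lines_are_parallel = interaction_slopes[-1] == interaction_slopes[+1]

print(main_effect_points)  # {'x1': (67.5, 75.0), 'x2': (62.5, 80.0)}
print(slopes)              # {'x1': 3.75, 'x2': 8.75}
print(interaction_slopes, lines_are_parallel)  # {-1: 2.5, 1: 5.0} False
```

Here x2 has the steeper segment, and the two interaction lines are not parallel, which is consistent with the non-zero x1x2 coefficient found in Step 7.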

## Step 10: Significance of the coefficients

For each factor, an Anova is carried out to assess the importance of its influence in the prediction model. The particularity here is that the residual variation SSE is equal to the sum of the SST of the different interactions.

This step is essential. The analysis makes it possible to remove non-significant factors from the model, and thus to reduce the number of additional experiments and analyses, and therefore the costs.

### 10.1-Assumptions

In our case, the hypotheses are expressed as follows, where ax is the coefficient of factor x:

H0: ax = 0

H1: ax ≠ 0

### 10.2-Calculation of SST variances

For each of the main factors, the sum of squared deviations from the overall average is calculated:

$$SST_x = \sum_i n_i \left(\bar{Y}_i - \bar{Y}\right)^2$$

With:

• $n_i$: the number of times the factor takes the level i
• $\bar{Y}_i$: average of the responses when the factor is at level i
• $\bar{Y}$: average of all the responses

### 10.3-Calculation of the SSE residual variation

The residual variation is equal to the sum of the SST of the different interactions. We can either calculate the SST of every interaction and add them up, or obtain it by subtraction:

SSE = TSS – Σ SST

With:

• TSS: total sum of squares = Σ (yi – ȳ)²
• Σ SST: sum of the SST of all the main factors

### 10.4-Calculation of average squares

We continue the Anova and calculate the mean squares:

MSTx = SSTx / dofx

MSE = SSE / dofR

With:

• MSTx: mean square of each of the main factors
• dofx: degrees of freedom of each of the main factors, always equal to 1
• MSE: mean square of the residuals
• dofR: degrees of freedom of the residuals, equal to the number of interactions

### 10.5-Calculation of the practical value

For each of the main factors, a practical value (the observed F statistic) is calculated with the following formula:

Practical value = MSTx / MSE

The interpretation is as follows:

• Practical value > critical value: H0 is rejected, and we conclude that the factor has a strong influence within the model.
• Practical value < critical value: H0 is retained, and we conclude that the factor has no significant influence on the model.

### 10.6-Calculation of P-Value

We calculate the P-Value, which tells us how influential each factor is. It follows the Fisher (F) distribution. In Excel, it is calculated with the formula:

P-Value = FDIST(Practical value; dofSSTx; dofSSE)

The interpretation is as follows:

• P-Value < α: the factor is influential and must be kept in the model.
• P-Value > α: the factor has little influence and should be removed from the model.
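Sections 10.2 to 10.6 can be chained together on the Step 7 example. The sketch below is illustrative only: the variable names are assumptions, and the P-Value line relies on a closed form that holds only for an F distribution with (1, 1) degrees of freedom, which happens to be the case here (one degree of freedom per factor, one interaction as residual). For any other degrees of freedom, a statistical table or a function such as Excel's FDIST is needed.

```python
from math import atan, pi, sqrt

# Coded columns of the two main factors and the responses of the Step 7 example.
design = {"x1": [-1, +1, -1, +1], "x2": [-1, -1, +1, +1]}
responses = [60, 65, 75, 85]
n = len(responses)
grand_mean = sum(responses) / n


def sum_of_squares(column):
    """SST of a factor: sum over its levels of n_i * (mean at level i - grand mean)^2."""
    total = 0.0
    for level in set(column):
        values = [y for x, y in zip(column, responses) if x == level]
        total += len(values) * (sum(values) / len(values) - grand_mean) ** 2
    return total


tss = sum((y - grand_mean) ** 2 for y in responses)
sst = {name: sum_of_squares(column) for name, column in design.items()}
sse = tss - sum(sst.values())  # residual = variation carried by the x1x2 interaction
dof_residual = 1               # a single interaction is left out of the model
mse = sse / dof_residual

results = {}
for name, value in sst.items():
    practical_value = (value / 1) / mse  # each two-level factor has 1 degree of freedom
    # Closed form valid ONLY for an F distribution with (1, 1) degrees of freedom.
    p_value = 1 - (2 / pi) * atan(sqrt(practical_value))
    results[name] = (value, practical_value, p_value)
    print(name, value, practical_value, round(p_value, 2))
```

On this data the practical values are 9 for x1 and 49 for x2, with P-Values of about 0.20 and 0.09. With a single residual degree of freedom the test has very little power, which is one reason to keep more runs than coefficients.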

## Step 11: Significance of the model

This test tells us whether the model is informative, i.e. whether the equation establishes a real relationship between the variation of the factors and the response, or whether the observed variation is merely due to random fluctuation. Here again, we carry out an Anova.

### 11.1-Assumptions

In our case, we express the hypotheses as:

H0: the model does not describe the variation in the test results

H1: the model describes the variation in the test results

### 11.2-Calculating the model deviations – SSTL

We add up all the squared differences between the values estimated by the model and the average of the responses.

$$SSTL = \sum_i \left(\hat{y}_i - \bar{y}\right)^2$$

With:

• $\hat{y}_i$: each of the responses estimated by the model
• $\bar{y}$: average of the observed responses

### 11.3-Calculating the model residuals – SSTR

The residuals represent the difference between the observed response and the prediction of the model we have just built. Their sum of squares is calculated with the following formula:

$$SSTR = \sum_i \left(\hat{y}_i - y_i\right)^2$$

The challenge is to minimize this value, which indicates that the model "represents" reality.

If the model includes all the factors and interactions, the test cannot be carried out, because all the degrees of freedom have been "used". The least influential factor(s) must then be removed from the model.

### 11.4-Calculating average squares

We continue the Anova and calculate the mean squares, using the following two formulas:

CMTL = SSTL / dofSSTL

CMTR = SSTR / dofSSTR

With:

• dofSSTL: degrees of freedom of the model, equal to the number of coefficients in the model (constant, factors and interactions) minus 1.
• dofSSTR: degrees of freedom of the residuals, equal to the number of tests minus the number of coefficients in the model.

### 11.5-Calculation of the practical value

The practical value is the ratio of the two mean squares:

Practical value = CMTL/CMTR

The practical value expresses the ratio of 2 variances. The larger this ratio, the larger the share of the overall variance that is due to the model, and therefore the more significant the model.
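As an illustration of sections 11.2 to 11.5, the sketch below evaluates a model without interaction on the Step 7 example. The coefficient values come from that example; everything else (variable names, the choice to drop the interaction so that one residual degree of freedom remains, and the convention that the model degrees of freedom are the number of coefficients minus 1) is an assumption made for the sketch.

```python
# Coded runs (x1, x2) and observed responses of the Step 7 example.
runs = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]
observed = [60, 65, 75, 85]

# Model without interaction: Y = a + a1*x1 + a2*x2 (coefficients from Step 7).
coefficients = {"a": 71.25, "a1": 3.75, "a2": 8.75}
predicted = [
    coefficients["a"] + coefficients["a1"] * x1 + coefficients["a2"] * x2
    for x1, x2 in runs
]

n = len(observed)      # number of tests
p = len(coefficients)  # number of coefficients, constant included
mean_observed = sum(observed) / n

sstl = sum((y_hat - mean_observed) ** 2 for y_hat in predicted)       # model
sstr = sum((y_hat - y) ** 2 for y_hat, y in zip(predicted, observed)) # residuals

dof_sstl = p - 1  # assumed convention: coefficients minus 1
dof_sstr = n - p  # tests minus coefficients

cmtl = sstl / dof_sstl
cmtr = sstr / dof_sstr
practical_value = cmtl / cmtr

# Share of the total variation explained by the model (coefficient of determination).
r_squared = sstl / (sstl + sstr)

print(sstl, sstr, practical_value)  # 362.5 6.25 29.0
print(round(r_squared, 3))          # 0.983
```

The practical value is then compared with the critical value of the F distribution for the same degrees of freedom.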

### 11.6-Critical value calculation

The practical value of the test is a random variable whose theoretical distribution follows a Fisher-Snedecor (F) law. We always perform a right-tailed test, because we want to know whether the variance explained by the model is greater than the residual variance. We therefore look up the value in Fisher's table for:

Critical value = F(1 – α; dofSSTL; dofSSTR)

This value can be read from the specific tables or obtained in Excel with the formula F.INV(1 – α; dofSSTL; dofSSTR) (INVERSE.LOI.F.N in the French version).

The interpretation is as follows:

• Practical value > critical value: H0 is rejected, and we conclude that the model correctly describes the variations of the responses.
• Practical value < critical value: H0 is retained, and we conclude that the model does not correctly describe the variations of the responses.

### 11.7-Calculating the P-Value

Here too, we can calculate the P-Value, which also follows Fisher's law. In Excel, we can calculate it using the formula:

P-Value = FDIST(Practical value; dofSSTL; dofSSTR)

The interpretation is as follows:

• P-Value < α: the test is significant, so the result is not due to chance. The conclusion of the previous step is valid at risk α.
• P-Value > α: the test is not significant, so the result may be due to chance. The conclusion of the previous step is not valid at risk α.

## Step 12: Concluding on the model

The conclusions to draw depend on the objectives pursued. The different cases are detailed below.

You used the screening technique to identify influential factors:

We can stop there. By reading the coefficient of each factor, we establish the Pareto chart of the influential factors. For this use, it is not necessary to seek a good coefficient of determination, nor necessarily to obtain good results in the Anova test.

You used the screening technique to validate a prediction model:

• Case 1: the coefficient of determination and the Anova test are satisfactory. The model is considered satisfactory.
• Case 2: the coefficient of determination and the Anova test are not satisfactory. A factor influencing the response may be missing from your model, or the relationship between factor and response may not be linear. You will then have to use response surface designs to model your situation.

### Final Validation

In any case, it will be essential to conduct validation tests of the model outside the experimental field and compare the results with the prediction model.

The difference between reality and the model is called the "experimental error"; it is due to variables that were not put under control.

To reduce this error, we can use "blocks". The idea is to form "blocks" in which every treatment is present at least once (randomized complete block design) or not (randomized incomplete block design).

Example: a farmer has five parcels of land on which he wants to grow maize. He has 2 varieties of maize and 2 types of fertilizer available. He knows that his land is heterogeneous with respect to sunshine. Sunshine is therefore a source of error, but he cannot put this variable under control. We create 5 "blocks", one per parcel, each of which receives all possible combinations:

| Block | Treatment 1 | Treatment 2 | Treatment 3 | Treatment 4 |
|---|---|---|---|---|
| Parcel 1 | Corn 2 with Fertilizer 2 | Corn 2 with Fertilizer 1 | Corn 1 with Fertilizer 1 | Corn 1 with Fertilizer 2 |
| Parcel 2 | Corn 2 with Fertilizer 1 | Corn 1 with Fertilizer 2 | Corn 1 with Fertilizer 1 | Corn 2 with Fertilizer 2 |
| Parcel 3 | Corn 1 with Fertilizer 1 | Corn 1 with Fertilizer 2 | Corn 2 with Fertilizer 1 | Corn 2 with Fertilizer 2 |
| Parcel 4 | Corn 1 with Fertilizer 1 | Corn 2 with Fertilizer 1 | Corn 1 with Fertilizer 2 | Corn 2 with Fertilizer 2 |
| Parcel 5 | Corn 2 with Fertilizer 2 | Corn 2 with Fertilizer 1 | Corn 1 with Fertilizer 1 | Corn 1 with Fertilizer 2 |

An Anova on the results of the treatments will then reveal the differences between them, with the errors due to sunshine removed.
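A layout of this kind can be drawn at random rather than by hand. The sketch below generates a randomized complete block layout for the farmer's example: every parcel receives all four treatments, in an independently shuffled order. The seed and the variable names are assumptions, and the layout produced will differ from the table above.

```python
import random
from itertools import product

corn = ["Corn 1", "Corn 2"]
fertilizer = ["Fertilizer 1", "Fertilizer 2"]

# Every combination of variety and fertilizer is one treatment.
treatments = [f"{c} with {f}" for c, f in product(corn, fertilizer)]

# Fixed seed: an arbitrary choice that keeps the layout reproducible.
rng = random.Random(0)

# Randomized complete blocks: each parcel gets all treatments in its own random order.
layout = {
    f"Parcel {block}": rng.sample(treatments, k=len(treatments))
    for block in range(1, 6)
}

for parcel, order in layout.items():
    print(parcel, order)
```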

## Source

1 – G. Von Zimmermann (1774) – Treatise on experience in general, and in particular in the art of healing

2 – Yates (1937) – The design and analysis of factorial experiments

3 – W. G. Cochran, G. M. Cox (1957) – Experimental Designs

4 – R. L. Plackett, J. P. Burman (1946) – The design of optimum multifactorial experiments

5 – G. Taguchi (1986) – Introduction to quality engineering

6 – R. Linder (2005) – Plans of experiments

7 – J. J. Droesbeke, J. Fine, G. Saporta (1997) – Plans of experiments

J. J. Sylvester (1867) – Thoughts on inverse orthogonal matrices, simultaneous sign-successions, and tessellated pavements in two or more colours, with applications to Newton's rule, ornamental tile-work, and the theory of numbers

J. J. Droesbeke, J. Fine, G. Saporta (1996) – Plans of experiments, application to the company

D. Benoist, Y. Tourbier, S. Germain-Tourbier (1994) – Plans of experiments: Construction and analysis

A. Muren (2012) – Methodology of experience plans

P. Schimmerling, J. C. Sisson, A. Zaidi (1998) – Practice of experience plans

S. Vivier (2009) – Method of experience plans

J. L. Gillo (1990) – Comparative study of various plans of experiments

J. P. Gauchi (2005) – Optimal experience plans: a didactic presentation

NF X06-080 standard – Experience plan