MATH3014-6027 Design (and Analysis) of
Experiments
Dave Woods
2022-02-28
Contents

Preface

1 Motivation, introduction and revision
1.1 Motivation
1.2 Aims of experimentation and some examples
1.3 Some definitions
1.4 Principles of experimentation
1.5 Revision on the linear model
1.6 Exercises

2 Completely randomised designs
2.1 A unit-treatment linear model
2.2 The partitioned linear model
2.3 Reduced normal equations for the CRD
2.4 Contrasts
2.5 Treatment contrast estimators in the CRD
2.6 Analysing CRDs in R
2.7 Multiple comparisons
2.8 Impact of design choices on estimation
2.9 Exercises

3 Blocking
3.1 Unit-block-treatment model
3.2 Normal equations
3.3 Analysis of variance
3.4 Randomised complete block designs
3.5 Orthogonal blocking
3.6 Balanced incomplete block designs
3.7 Exercises

4 Factorial experiments

5 Blocking in factorial designs

6 Fractional factorial designs

7 Response surface methodology

8 Optimal design of experiments
Preface
These are draft lecture notes for the modules MATH3014 and MATH6027 Design (and Analysis) of Experiments at the University of Southampton for academic year 2021-22. They are very much a work in progress.
Southampton prerequisites for this module are MATH2010 or MATH6174 and
STAT6123 (or equivalent modules on linear modelling).
Chapter 1
Motivation, introduction
and revision
Definition 1.1. An experiment is the process through which data are collected to answer a scientific question (physical science, social science, actuarial science, …) by deliberately varying some features of the process under study in order to understand the impact of these changes on measurable responses.
In this course we consider only intervention experiments, in which some aspects
of the process are under the experimenters’ control. We do not consider surveys
or observational studies.
Definition 1.2. Design of experiments is the topic in Statistics concerned
with the selection of settings of controllable variables or factors in an experiment
and their allocation to experimental units in order to maximise the effectiveness
of the experiment at achieving its aim.
People have been designing experiments for as long as they have been exploring the natural world. Collecting empirical evidence is key for scientific development, as described in terms of clinical trials by the webcomic xkcd.
Some notable milestones in the history of the design of experiments include:
• prior to the 20th century:
– Francis Bacon (17th century; pioneer of the experimental method)
– James Lind (18th century; experiments to eliminate scurvy)
– Charles Peirce (19th century; advocated randomised experiments and
randomisation-based inference)
• 1920s: agriculture (particularly at the Rothamsted Agricultural Research
Station)
• 1940s: clinical trials (Austin Bradford-Hill)
• 1950s: (manufacturing) industry (W. Edwards Deming; Genichi Taguchi)
• 1960s: psychology and economics (Vernon Smith)
• 1980s: in-silico (computer experiments)
• 2000s: online (A/B testing)
See Luca and Bazerman (2020) for further history, anecdotes and examples,
especially from psychology and technology.
Figure 1.1 shows the Broadbalk agricultural field experiment at Rothamsted, one of the longest continuously running experiments in the world, which is testing the impact of different manures and fertilisers on the growth of winter wheat.
Figure 1.1: The Broadbalk experiment, Rothamsted (photograph taken 2016)
1.1 Motivation
Example 1.1. Consider an experiment to compare two treatments (e.g. drugs, diets, fertilisers, …). We have $n$ subjects (people, mice, plots of land, …), each of which can be assigned one of the two treatments. A response (protein measurement, weight, yield, …) is then measured.
Question: How many subjects should be assigned to each treatment to gain the most precise (i.e. smallest variance) inference about the difference in response from the two treatments?
Consider a linear statistical model for the response (see MATH2010 or MATH6174/STAT6123); in this course, we will almost always start with a statistical model which we wish to use to answer our scientific question. Here,

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\,, \qquad i = 1, \ldots, n\,, \tag{1.1}$$

where $\varepsilon_i \sim N(0, \sigma^2)$ are independent and identically distributed errors and $\beta_0$, $\beta_1$ are unknown constants (parameters).
Let
$$x_i = \begin{cases} -1 & \text{if treatment 1 is applied to the } i\text{th subject,} \\ +1 & \text{if treatment 2 is applied to the } i\text{th subject,} \end{cases}$$
for $i = 1, \ldots, n$. Other codings can be used, e.g. 0, 1, and the choice of coding -1, +1 is discussed later in the module; it makes no difference for our current purpose.
The difference in expected response from treatments 1 and 2 is
$$\begin{aligned}
E[y_i \mid x_i = +1] - E[y_i \mid x_i = -1] &= (\beta_0 + \beta_1) - (\beta_0 - \beta_1) \\
&= 2\beta_1\,.
\end{aligned} \tag{1.2}$$
Therefore, we require the most precise estimator of $\beta_1$ possible. That is, we wish to make the variance of our estimator of $\beta_1$ as small as possible.

Parameters $\beta_0$ and $\beta_1$ can be estimated using least squares (see MATH2010 or MATH6174/STAT6123). For $i = 1, \ldots, n$, we can write the model in matrix form:
$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{bmatrix}\,.$$
Or, by defining some notation:
$$Y = X\beta + \varepsilon\,, \tag{1.3}$$
where

• $Y$ - $n \times 1$ vector of responses;
• $X$ - $n \times p$ model matrix;
• $\beta$ - $p \times 1$ vector of parameters;
• $\varepsilon$ - $n \times 1$ vector of errors.
The least squares estimators, $\hat{\beta}$, are chosen such that the quadratic form
$$(Y - X\beta)^{\mathrm{T}}(Y - X\beta)$$
is minimised (recall that $E(Y) = X\beta$). Therefore
$$\hat{\beta} = \operatorname{arg\,min}_{\beta}\left(Y^{\mathrm{T}}Y + \beta^{\mathrm{T}}X^{\mathrm{T}}X\beta - 2\beta^{\mathrm{T}}X^{\mathrm{T}}Y\right)\,.$$
If we differentiate with respect to $\beta$ (check the Matrix Cookbook for matrix calculus, amongst much else),
$$\frac{\partial}{\partial \beta} = 2X^{\mathrm{T}}X\beta - 2X^{\mathrm{T}}Y\,,$$
and equate to 0, we get the estimators
$$\hat{\beta} = (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}Y\,. \tag{1.4}$$
These are the least squares estimators.
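As a quick numerical sanity check (not part of the original notes: the module uses R, but a short Python/numpy sketch makes the linear algebra explicit), the closed form (1.4) can be verified against a generic least squares solver:

```python
import numpy as np

# Illustrative check: (X^T X)^{-1} X^T Y from equation (1.4) matches
# numpy's generic least squares solver on simulated data.
rng = np.random.default_rng(1)
n = 10
x = np.array([-1.0, 1.0] * (n // 2))        # +/-1 treatment coding
X = np.column_stack([np.ones(n), x])        # model matrix [1 | x]
Y = X @ np.array([2.0, 0.5]) + rng.normal(scale=0.1, size=n)

beta_hat = np.linalg.inv(X.T @ X) @ X.T @ Y          # equation (1.4)
beta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)      # library solver

assert np.allclose(beta_hat, beta_ls)
```

In practice one would use `np.linalg.solve` (or R's `lm`) rather than forming the inverse explicitly, but the explicit form mirrors the algebra above.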
For Example 1.1,
$$X = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}\,, \qquad X^{\mathrm{T}}X = \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}\,,$$
$$(X^{\mathrm{T}}X)^{-1} = \frac{1}{n\sum x_i^2 - \left(\sum x_i\right)^2}\begin{bmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & n \end{bmatrix}\,, \qquad X^{\mathrm{T}}Y = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}\,.$$
Then,
$$\hat{\beta} = \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{bmatrix} = \frac{1}{n\sum x_i^2 - \left(\sum x_i\right)^2}\begin{bmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & n \end{bmatrix}\begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}$$
$$= \frac{1}{n\sum x_i^2 - \left(\sum x_i\right)^2}\begin{bmatrix} \sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i \\ n\sum x_i y_i - \sum x_i \sum y_i \end{bmatrix}\,. \tag{1.5}$$
We don’t usually work through the algebra in such detail; the matrix form is often sufficient for theoretical calculations, and software, e.g. R, can be used for numerical work.
The precision of $\hat{\beta}$ is measured via the variance-covariance matrix, given by
$$\begin{aligned}
\operatorname{Var}(\hat{\beta}) &= \operatorname{Var}\left\{(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}Y\right\} && (1.6)\\
&= (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}\operatorname{Var}(Y)X(X^{\mathrm{T}}X)^{-1} && (1.7)\\
&= (X^{\mathrm{T}}X)^{-1}\sigma^2\,, && (1.8)
\end{aligned}$$
where $Y \sim N(X\beta, I_n\sigma^2)$, with $I_n$ the $n \times n$ identity matrix.
Hence, in our example,
$$\operatorname{Var}(\hat{\beta}) = \frac{\sigma^2}{n\sum x_i^2 - \left(\sum x_i\right)^2}\begin{bmatrix} \sum x_i^2 & -\sum x_i \\ -\sum x_i & n \end{bmatrix} = \begin{bmatrix} \operatorname{Var}(\hat{\beta}_0) & \operatorname{Cov}(\hat{\beta}_0, \hat{\beta}_1) \\ \operatorname{Cov}(\hat{\beta}_0, \hat{\beta}_1) & \operatorname{Var}(\hat{\beta}_1) \end{bmatrix}\,.$$
For estimating the difference between treatments, we are interested in
$$\operatorname{Var}(\hat{\beta}_1) = \frac{n\sigma^2}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{n\sigma^2}{n^2 - \left(\sum x_i\right)^2}\,,$$
as $x_i = \pm 1$, therefore $x_i^2 = 1$ for all $i = 1, \ldots, n$, and hence $\sum x_i^2 = n$.
To achieve the most precise estimator, we need to minimise $\operatorname{Var}(\hat{\beta}_1)$, or equivalently minimise $\left|\sum x_i\right|$. This goal can be achieved through the choice of $x_1, \ldots, x_n$:

• as each $x_i$ can only take one of two values, -1 or +1, this is equivalent to choosing the numbers of subjects assigned to treatment 1 and treatment 2;
• call these $n_1$ and $n_2$ respectively, with $n_1 + n_2 = n$.

As $\sum x_i = n_2 - n_1$, we have $\sum x_i = 0$ if and only if $n_1 = n_2$. Therefore, assuming $n$ is even, the optimal design has

• $n_1 = n/2$ subjects assigned to treatment 1 and
• $n_2 = n/2$ subjects assigned to treatment 2.

For $n$ odd, we choose $n_1 = (n+1)/2$, $n_2 = (n-1)/2$, or vice versa.
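The variance formula above can be checked directly by enumeration. The following is an illustrative Python/numpy-free sketch (the notes themselves use R; the choice of $n = 10$ and $\sigma^2 = 1$ here is arbitrary): it evaluates $\operatorname{Var}(\hat{\beta}_1) = n\sigma^2 / (n^2 - (\sum x_i)^2)$ for every allocation and confirms that the equal split minimises it.

```python
# Var(beta1_hat) = sigma^2 * n / (n^2 - (sum x_i)^2) for the +/-1 coding,
# evaluated over all allocations n1 = 1, ..., n-1 of n subjects.
n, sigma2 = 10, 1.0
variances = {}
for n1 in range(1, n):            # both treatments must be represented
    sum_x = (n - n1) - n1         # sum of x_i = n2 - n1
    variances[n1] = sigma2 * n / (n**2 - sum_x**2)

best = min(variances, key=variances.get)
assert best == n // 2             # equal allocation minimises the variance
```

With $n = 10$ the minimum variance is $\sigma^2/n = 0.1$, attained only at $n_1 = n_2 = 5$.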
Definition 1.3. We can assess different designs using their efficiency:
$$\text{Eff} = \frac{\operatorname{Var}(\hat{\beta}_1 \mid d^{*})}{\operatorname{Var}(\hat{\beta}_1 \mid d_1)}\,, \tag{1.9}$$
where $d_1$ is a design we want to assess and $d^{*}$ is the optimal design with smallest variance. Note that $0 \le \text{Eff} \le 1$.
In Figure 1.2 below, we plot this efficiency for Example 1.1, using different choices of $n_1$. The total number of runs is fixed at $n = 100$, and the function eff calculates the efficiency from Definition 1.3 for a design with $n_1$ subjects assigned to treatment 1. Clearly, efficiency of 1 is achieved when $n_1 = n_2$ (equal allocation of treatments 1 and 2). If $n_1 = 0$ or $n_1 = n$, the efficiency is zero; we cannot estimate the difference between two treatments if we only allocate subjects to one of them.
n <- 100
eff <- function(n1) 1 - ((2 * n1 - n) / n)^2
curve(eff, from = 0, to = n, ylab = "Eff", xlab = expression(n[1]))
Figure 1.2: Efficiencies for designs for Example 1.1 with different numbers, $n_1$, of subjects assigned to treatment 1 when the total number of subjects is $n = 100$.
1.2 Aims of experimentation and some examples
Some reasons experiments are performed:
1. Treatment comparison (Chapters 2 and 3)
• compare several treatments (and choose the best)
• e.g. clinical trial, agricultural field trial
2. Factor screening (Chapters 4, 5 and 6)
• many complex systems may involve a large number of (discrete) factors
(controllable features)
• which of these factors have a substantive impact?
• (relatively) small experiments
• e.g. industrial experiments on manufacturing processes
3. Response surface exploration (Chapter 7)
• detailed description of relationship between important (continuous) variables and response
• typically second order polynomial regression models
• larger experiments, often built up sequentially
• e.g. alcohol yields in a pharmaceutical experiment
4. Optimisation (Chapter 7)
• finding settings of variables that lead to maximum or minimum response
• typically use response surface methods and a sequential “hill climbing” strategy
1.3 Some definitions
Definition 1.4. The response is the outcome measured in an experiment; e.g. yield from a chemical process. The responses from the $n$ observations are denoted $y_1, \ldots, y_n$.
Definition 1.5. Factors (discrete) or variables (continuous) are features which can be set or controlled in an experiment; $k$ denotes the number of factors or variables under investigation. For discrete factors, we call the possible settings of the factor its levels. We denote by $x_{ij}$ the value taken by factor or variable $j$ in the $i$th run of the experiment ($i = 1, \ldots, n$; $j = 1, \ldots, k$).
Definition 1.6. The treatments or support points are the distinct combi-
nations of factor or variable values in the experiment.
Definition 1.7. An experimental unit is the basic element (material, animal,
person, time unit, …) to which a treatment can be applied to produce a response.
In Example 1.1 (comparing two treatments):
• Response: the measured outcome $y$, e.g. protein level or pain score in a clinical trial, yield in an agricultural field trial.
• Factor: the “treatment” applied.
• Levels: treatment 1 ($x = -1$) and treatment 2 ($x = +1$).
• Treatments or support points: the two treatments.
• Experimental unit: subject (person, animal, plot of land, …).
1.4 Principles of experimentation
Three fundamental principles that need to be considered when designing an
experiment are:
• replication
• randomisation
• stratification (blocking)
1.4.1 Replication
Each treatment is applied to a number of experimental units, with the $i$th treatment replicated $n_i$ times. This enables the estimation of the variances of treatment effect estimators; increasing the number of replications, or replicates, decreases the variance of estimators of treatment effects. (Note: proper replication involves independent application of the treatment to different experimental units, not just taking several measurements from the same unit.)
1.4.2 Randomisation
Randomisation should be applied to the allocation of treatments to units. Randomisation protects against bias: the effects of variables that are unknown and potentially uncontrolled, and of subjectivity in applying treatments. It also provides a formal basis for inference and statistical testing.
For example, in a clinical trial to compare a new drug and a control, random allocation protects against
• “unmeasured and uncontrollable” features (e.g. age, sex, health);
• bias resulting from the clinician giving the new drug to patients who are sicker.
Clinical trials are usually also double-blinded, i.e. neither the healthcare professional nor the patient knows which treatment the patient is receiving.
1.4.3 Stratification (or blocking)
We would like to use a wide variety of experimental units (e.g. people or plots of land) to ensure coverage of our results, i.e. validity of our conclusions across the population of interest. However, if the sample of units from the population is too heterogeneous, then this will induce too much random variability, i.e. increase $\sigma^2$ in $\varepsilon_i \sim N(0, \sigma^2)$, and hence increase the variance of our parameter estimators.
We can reduce this extraneous variation by splitting our units into homogeneous sets, or blocks, and including a blocking term in the model. The simplest blocked experiment is a randomised complete block design, where each block contains enough units for all treatments to be applied. Comparisons can then be made within each block.
Basic principle: block what you can, randomise what you cannot.
Later we will look at blocking in more detail, and the principle of incomplete
blocks.
1.5 Revision on the linear model
Recall: $Y = X\beta + \varepsilon$, with $\varepsilon \sim N(0, I_n\sigma^2)$. Let the $i$th row of $X$ be denoted $x_i^{\mathrm{T}}$, which holds the values of the predictors, or explanatory variables, for the $i$th observation. Then
$$y_i = x_i^{\mathrm{T}}\beta + \varepsilon_i\,, \qquad i = 1, \ldots, n\,.$$
For example, quite commonly for continuous variables,
$$x_i = (1, x_{i1}, x_{i2}, \ldots, x_{i,p-1})^{\mathrm{T}}\,,$$
and so
$$x_i^{\mathrm{T}}\beta = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i,p-1}\,.$$
The least squares estimators are given by
$$\hat{\beta} = (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}Y\,,$$
with
$$\operatorname{Var}(\hat{\beta}) = (X^{\mathrm{T}}X)^{-1}\sigma^2\,.$$
1.5.1 Variance of a Prediction/Fitted Value
A prediction of the mean response at a point $x_0$ (which may or may not be in the design) is
$$\hat{y}_0 = x_0^{\mathrm{T}}\hat{\beta}\,,$$
with
$$\operatorname{Var}(\hat{y}_0) = \operatorname{Var}\left(x_0^{\mathrm{T}}\hat{\beta}\right) = x_0^{\mathrm{T}}\operatorname{Var}(\hat{\beta})x_0 = x_0^{\mathrm{T}}(X^{\mathrm{T}}X)^{-1}x_0\,\sigma^2\,.$$
For a linear model, this variance depends only on the assumed regression model and the design (through $X$), the point at which prediction is to be made ($x_0$) and the value of $\sigma^2$; it does not depend on the data $Y$ or the parameters $\beta$.
Similarly, we can find the variance-covariance matrix of the fitted values:
$$\operatorname{Var}(\hat{Y}) = \operatorname{Var}(X\hat{\beta}) = X(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}\sigma^2\,.$$
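A small numerical sketch of the prediction variance formula may help (illustration only, in Python/numpy rather than the module's R; the design and $\sigma^2$ are arbitrary choices): for a two-point $\pm 1$ design, the variance of the predicted mean at $x = 0$ follows directly from $(X^{\mathrm{T}}X)^{-1}$, with no data required.

```python
import numpy as np

# Var(y0_hat) = x0^T (X^T X)^{-1} x0 * sigma^2 depends only on the design X,
# the prediction point x0 and sigma^2 -- not on any observed responses.
sigma2 = 2.0
X = np.column_stack([np.ones(4), np.array([-1.0, -1.0, 1.0, 1.0])])
XtX_inv = np.linalg.inv(X.T @ X)          # here diag(1/4, 1/4)

x0 = np.array([1.0, 0.0])                 # predict the mean response at x = 0
var_pred = x0 @ XtX_inv @ x0 * sigma2
assert np.isclose(var_pred, sigma2 / 4)   # intercept variance only
```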
1.5.2 Analysis of Variance and R2 as Model Comparison
To assess the goodness-of-fit of a model, we can use the residual sum of squares
$$\text{RSS} = (Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta}) = \sum_{i=1}^{n}\left\{y_i - x_i^{\mathrm{T}}\hat{\beta}\right\}^2 = \sum_{i=1}^{n} r_i^2\,,$$
where
$$r_i = y_i - x_i^{\mathrm{T}}\hat{\beta}\,.$$
Often, a comparison is made to the null model
$$y_i = \beta_0 + \varepsilon_i\,,$$
i.e. $y_i \sim N(\beta_0, \sigma^2)$. The residual sum of squares for the null model is given by
$$\text{RSS(null)} = Y^{\mathrm{T}}Y - n\bar{y}^2\,,$$
as
$$\hat{\beta}_0 = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i\,.$$
How do we compare these models?
1. Ratio of residual sums of squares:
$$R^2 = 1 - \frac{\text{RSS}}{\text{RSS(null)}} = 1 - \frac{(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})}{Y^{\mathrm{T}}Y - n\bar{y}^2}\,.$$
The quantity $0 \le R^2 \le 1$ is sometimes called the coefficient of multiple determination:
• high $R^2$ implies that the model describes much of the variation in the data;
• but note that $R^2$ will always increase as $p$ (the number of explanatory variables) increases, with $R^2 = 1$ when $p = n$;
• some software packages will report the adjusted $R^2$:
$$R^2_a = 1 - \frac{\text{RSS}/(n-p)}{\text{RSS(null)}/(n-1)} = 1 - \frac{(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})/(n-p)}{\left(Y^{\mathrm{T}}Y - n\bar{y}^2\right)/(n-1)}\,;$$
• $R^2_a$ does not necessarily increase with $p$ (as we divide by degrees of freedom to adjust for the complexity of the model).
2. Analysis of variance (ANOVA): an ANOVA table is a compact way of presenting the results of (sequential) comparisons of nested models. You should be familiar with an ANOVA table of the following form.

Table 1.1: A standard ANOVA table.

| Source | Degrees of Freedom | (Sequential) Sum of Squares | Mean Square |
|---|---|---|---|
| Regression | $p-1$ | By subtraction; see (1.12) | Reg SS$/(p-1)$ |
| Residual | $n-p$ | $(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})$ (RSS for the full regression model) | RSS$/(n-p)$ |
| Total | $n-1$ | $Y^{\mathrm{T}}Y - n\bar{y}^2$ (RSS for the null model) | |

In row 1 of Table 1.1 above,
$$\begin{aligned}
\text{Regression SS} &= \text{Total SS} - \text{RSS} = Y^{\mathrm{T}}Y - n\bar{y}^2 - (Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta}) && (1.10)\\
&= -n\bar{y}^2 - \hat{\beta}^{\mathrm{T}}(X^{\mathrm{T}}X)\hat{\beta} + 2\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}Y && (1.11)\\
&= \hat{\beta}^{\mathrm{T}}(X^{\mathrm{T}}X)\hat{\beta} - n\bar{y}^2\,, && (1.12)
\end{aligned}$$
with the last line following from
$$\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}Y = \hat{\beta}^{\mathrm{T}}(X^{\mathrm{T}}X)(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}Y = \hat{\beta}^{\mathrm{T}}(X^{\mathrm{T}}X)\hat{\beta}\,.$$
This idea can be generalised to the comparison of a sequence of nested models; see Problem Sheet 1.
Hypothesis testing is performed using the mean square:
$$\frac{\text{Regression SS}}{p-1} = \frac{\hat{\beta}^{\mathrm{T}}(X^{\mathrm{T}}X)\hat{\beta} - n\bar{y}^2}{p-1}\,.$$
Under $H_0: \beta_1 = \cdots = \beta_{p-1} = 0$,
$$\frac{\text{Regression SS}/(p-1)}{\text{RSS}/(n-p)} = \frac{\left(\hat{\beta}^{\mathrm{T}}(X^{\mathrm{T}}X)\hat{\beta} - n\bar{y}^2\right)/(p-1)}{(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})/(n-p)} \sim F_{p-1,\,n-p}\,,$$
an $F$ distribution with $p-1$ and $n-p$ degrees of freedom, defined via the ratio of two independent $\chi^2$ distributions.
Also,
$$\frac{\text{RSS}}{n-p} = \frac{(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})}{n-p} = \hat{\sigma}^2$$
is an unbiased estimator of $\sigma^2$, and
$$\frac{(n-p)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-p}\,,$$
a chi-squared distribution with $n-p$ degrees of freedom (see MATH2010 or MATH6174 notes).
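The identity behind row 1 of Table 1.1, Total SS = Regression SS + RSS with Regression SS computed as in (1.12), holds exactly and is easy to confirm numerically. A hedged illustration (Python/numpy here purely for checking the algebra; the data are simulated with arbitrary coefficients):

```python
import numpy as np

# Check Total SS = Regression SS + RSS, using
# Regression SS = beta_hat^T (X^T X) beta_hat - n * ybar^2  (equation 1.12).
rng = np.random.default_rng(42)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
total_ss = Y @ Y - n * Y.mean() ** 2
rss = (Y - X @ beta_hat) @ (Y - X @ beta_hat)
reg_ss = beta_hat @ (X.T @ X) @ beta_hat - n * Y.mean() ** 2

assert np.isclose(total_ss, reg_ss + rss)
```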
1.6 Exercises
1. (Adapted from Morris, 2011) A classic and famous example of a simple
hypothetical experiment was described by Fisher (1935):
A lady declares that by tasting a cup of tea made with milk she can discriminate whether the milk or the tea infusion was added first to the cup. We will consider the problem of designing an experiment by means of which this assertion can be tested. For this purpose let us first lay down a simple form of experiment with a view to studying its limitations and its characteristics, both those that appear to be essential to the experimental method, when well developed, and those that are not essential but auxiliary.

Our experiment consists in mixing eight cups of tea, four in one way and four in the other, and presenting them to the subject for judgement in a random order. The subject has been told in advance of what the test will consist, namely that she will be asked to taste eight cups, that these shall be four of each kind, and that they shall be presented to her in a random order, that is an order not determined arbitrarily by human choice, but by the actual manipulation of the physical apparatus used in games of chance, cards, dice, roulettes, etc., or, more expeditiously, from a published collection of random sampling numbers purporting to give the actual results of such manipulation. Her task is to divide the 8 cups into two sets of 4, agreeing, if possible, with the treatments received.
a. Define the treatments in this experiment.
b. Identify the units in this experiment.
c. How might a “physical apparatus” from a “game of chance” be used to perform the randomisation? Explain one example.
d. Suppose eight tea cups are available for this experiment but they are not identical. Instead they come from two sets: four are made from heavy, thick porcelain; four from much lighter china. If each cup can only be used once, how might this fact be incorporated into the design of the experiment?
Solution
a. There are two treatments in the experiment: the two orders of adding the ingredients, “milk first” and “tea first”.
b. The experimental units are the “cups of tea”, made up from the tea and milk used and also the cup itself.
c. The simplest method here might be to select four black playing cards
and four red playing cards, assign one treatment to each colour, shuffle
the cards, and then draw them in order. The colour drawn indicates the
treatment that should be used to make the next cup of tea. This operation
would give one possible randomisation.
We could of course also use R.
sample(rep(c("Milk first", "Tea first"), c(4, 4)), size = 8, replace = FALSE)
## [1] "Milk first" "Tea first" "Tea first" "Milk first" "Milk first"
## [6] "Milk first" "Tea first" "Tea first"
(Now, we would use routines such as sample in R to perform such a randomisation.)
d. Type of cup could be considered as a blocking factor. One way of incorporating it would be to split the experiment into two blocks, each with four cups (two milk first, two tea first). We would still wish to randomise the allocation of treatments to units within blocks.
# block 1
sample(rep(c("Milk first", "Tea first"), c(2, 2)), size = 4, replace = FALSE)
## [1] "Milk first" "Milk first" "Tea first" "Tea first"
# block 2
sample(rep(c("Milk first", "Tea first"), c(2, 2)), size = 4, replace = FALSE)
## [1] "Tea first" "Milk first" "Milk first" "Tea first"
2. Consider the linear model
$$Y = X\beta + \varepsilon\,,$$
with $Y$ an $n \times 1$ vector of responses, $X$ an $n \times p$ model matrix and $\varepsilon$ an $n \times 1$ vector of independent and identically distributed random variables with constant variance $\sigma^2$.
a. Derive the least squares estimator $\hat{\beta}$ for this multiple linear regression model, and show that this estimator is unbiased. Using the definition of (co)variance, show that
$$\operatorname{Var}(\hat{\beta}) = (X^{\mathrm{T}}X)^{-1}\sigma^2\,.$$
b. If $\varepsilon \sim N(0, I_n\sigma^2)$, with $I_n$ being the $n \times n$ identity matrix, show that the maximum likelihood estimators for $\beta$ coincide with the least squares estimators.
Solution
a. The method of least squares minimises the sum of squared differences between the responses and the expected values, that is, minimises the expression
$$(Y - X\beta)^{\mathrm{T}}(Y - X\beta) = Y^{\mathrm{T}}Y - 2\beta^{\mathrm{T}}X^{\mathrm{T}}Y + \beta^{\mathrm{T}}X^{\mathrm{T}}X\beta\,.$$
Differentiating with respect to the vector $\beta$, we obtain
$$\frac{\partial}{\partial \beta} = -2X^{\mathrm{T}}Y + 2X^{\mathrm{T}}X\beta\,.$$
Set equal to 0 and solve:
$$X^{\mathrm{T}}X\hat{\beta} = X^{\mathrm{T}}Y \;\Rightarrow\; \hat{\beta} = (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}Y\,.$$
The estimator $\hat{\beta}$ is unbiased:
$$E(\hat{\beta}) = (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}E(Y) = (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}X\beta = \beta\,,$$
and has variance:
$$\begin{aligned}
\operatorname{Var}(\hat{\beta}) &= E\left\{\left[\hat{\beta} - E(\hat{\beta})\right]\left[\hat{\beta} - E(\hat{\beta})\right]^{\mathrm{T}}\right\}\\
&= E\left\{\left[\hat{\beta} - \beta\right]\left[\hat{\beta} - \beta\right]^{\mathrm{T}}\right\}\\
&= E\left\{\hat{\beta}\hat{\beta}^{\mathrm{T}} - 2\hat{\beta}\beta^{\mathrm{T}} + \beta\beta^{\mathrm{T}}\right\}\\
&= (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}E\left(YY^{\mathrm{T}}\right)X(X^{\mathrm{T}}X)^{-1} - 2(X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}E(Y)\beta^{\mathrm{T}} + \beta\beta^{\mathrm{T}}\\
&= (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}\left[\operatorname{Var}(Y) + E(Y)E(Y)^{\mathrm{T}}\right]X(X^{\mathrm{T}}X)^{-1} - 2\beta\beta^{\mathrm{T}} + \beta\beta^{\mathrm{T}}\\
&= (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}\left[I_n\sigma^2 + X\beta\beta^{\mathrm{T}}X^{\mathrm{T}}\right]X(X^{\mathrm{T}}X)^{-1} - \beta\beta^{\mathrm{T}}\\
&= (X^{\mathrm{T}}X)^{-1}\sigma^2 + \beta\beta^{\mathrm{T}} - \beta\beta^{\mathrm{T}}\\
&= (X^{\mathrm{T}}X)^{-1}\sigma^2\,.
\end{aligned}$$
b. As $Y \sim N(X\beta, I_n\sigma^2)$, the likelihood is given by
$$L(\beta; Y) = \left(2\pi\sigma^2\right)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}(Y - X\beta)^{\mathrm{T}}(Y - X\beta)\right\}\,.$$
The log-likelihood is given by
$$\ell(\beta; Y) = -\frac{1}{2\sigma^2}(Y - X\beta)^{\mathrm{T}}(Y - X\beta) + \text{constant}\,.$$
Up to a constant, this expression is $-1\times$ the least squares criterion; hence maximising the log-likelihood is equivalent to minimising the least squares criterion.
3. Consider the two nested linear models
(i) $y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + \ldots + \beta_{p_1}x_{p_1i} + \varepsilon_i$, or $Y = X_1\beta_1 + \varepsilon$;
(ii) $y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + \ldots + \beta_{p_1}x_{p_1i} + \beta_{p_1+1}x_{(p_1+1)i} + \ldots + \beta_{p_1+p_2}x_{(p_1+p_2)i} + \varepsilon_i$, or $Y = X_1\beta_1 + X_2\beta_2 + \varepsilon$;
with $\varepsilon_i \sim N(0, \sigma^2)$, and the $\varepsilon_i$ independent.
a. Construct an ANOVA table to compare model (ii) with the null model $y_i = \beta_0 + \varepsilon_i$.
b. Extend this ANOVA table to compare models (i) and (ii) by further decomposing the regression sum of squares for model (ii).
Hint: which residual sums of squares are you interested in to compare models (i) and (ii)?
You should end up with an ANOVA table of the form

| Source | Degrees of freedom | Sums of squares | Mean square |
|---|---|---|---|
| Model (i) | $p_1$ | ? | ? |
| Model (ii) | $p_2$ | ? | ? |
| Residual | $n - p_1 - p_2 - 1$ | ? | ? |
| Total | $n-1$ | $Y^{\mathrm{T}}Y - n\bar{y}^2$ | |

The second row of the table gives the extra sums of squares for the additional terms in fitting model (ii), over and above those in model (i).
c. Calculate the extra sum of squares for fitting the terms in model (i), over and above those terms only in model (ii), i.e. those held in $X_2\beta_2$. Construct an ANOVA table containing both the extra sum of squares for the terms only in model (i) and the extra sum of squares for the terms only in model (ii). Comment on the table.
Solution
a. From lectures:

| Source | Degrees of freedom | Sums of squares | Mean square |
|---|---|---|---|
| Regression | $p_1 + p_2$ | $\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2$ | $(\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2)/(p_1 + p_2)$ |
| Residual | $n - p_1 - p_2 - 1$ | $(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})$ | RSS$/(n - p_1 - p_2 - 1)$ |
| Total | $n-1$ | $Y^{\mathrm{T}}Y - n\bar{y}^2$ | |

where
$$\begin{aligned}
\text{RSS(null)} - \text{RSS(ii)} &= Y^{\mathrm{T}}Y - n\bar{y}^2 - (Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})\\
&= Y^{\mathrm{T}}Y - n\bar{y}^2 - Y^{\mathrm{T}}Y + 2Y^{\mathrm{T}}X\hat{\beta} - \hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta}\\
&= 2\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2\\
&= \hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2\,.
\end{aligned}$$
b. To compare model (i) with the null model, we compute
$$\begin{aligned}
\text{RSS(null)} - \text{RSS(i)} &= Y^{\mathrm{T}}Y - n\bar{y}^2 - (Y - X_1\hat{\beta}_1)^{\mathrm{T}}(Y - X_1\hat{\beta}_1)\\
&= \hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1 - n\bar{y}^2\,.
\end{aligned}$$
To compare models (i) and (ii), we compare them both to the null model, and look at the difference between these comparisons:
$$\begin{aligned}
\left[\text{RSS(null)} - \text{RSS(ii)}\right] - \left[\text{RSS(null)} - \text{RSS(i)}\right] &= \text{RSS(i)} - \text{RSS(ii)}\\
&= (Y - X_1\hat{\beta}_1)^{\mathrm{T}}(Y - X_1\hat{\beta}_1) - (Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})\\
&= \hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1\,.
\end{aligned}$$
| Source | Degrees of freedom | Sums of squares | Mean square |
|---|---|---|---|
| Regression | $p_1 + p_2$ | $\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2$ | $(\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2)/(p_1 + p_2)$ |
| Model (i) | $p_1$ | $\hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1 - n\bar{y}^2$ | $(\hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1 - n\bar{y}^2)/p_1$ |
| Extra due to Model (ii) | $p_2$ | $\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1$ | $(\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1)/p_2$ |
| Residual | $n - p_1 - p_2 - 1$ | $(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})$ | RSS$/(n - p_1 - p_2 - 1)$ |
| Total | $n-1$ | $Y^{\mathrm{T}}Y - n\bar{y}^2$ | |

By definition, the Model (i) SS and the Extra SS for Model (ii) sum to the Regression SS.
c. The extra sum of squares for the terms in model (i), over and above those in model (ii), is obtained through comparison of the models
(ia) $Y = X_2\beta_2 + \varepsilon$;
(iia) $Y = X_1\beta_1 + X_2\beta_2 + \varepsilon = X\beta + \varepsilon$.
Extra sum of squares for model (iia):
$$\begin{aligned}
\left[\text{RSS(null)} - \text{RSS(iia)}\right] - \left[\text{RSS(null)} - \text{RSS(ia)}\right] &= \text{RSS(ia)} - \text{RSS(iia)}\\
&= (Y - X_2\hat{\beta}_2)^{\mathrm{T}}(Y - X_2\hat{\beta}_2) - (Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})\\
&= \hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_2^{\mathrm{T}}X_2^{\mathrm{T}}X_2\hat{\beta}_2\,.
\end{aligned}$$
Hence, an ANOVA table for the extra sums of squares is given by

| Source | Degrees of freedom | Sums of squares | Mean square |
|---|---|---|---|
| Regression | $p_1 + p_2$ | $\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2$ | $(\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - n\bar{y}^2)/(p_1 + p_2)$ |
| Extra Model (i) | $p_1$ | $\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_2^{\mathrm{T}}X_2^{\mathrm{T}}X_2\hat{\beta}_2$ | $(\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_2^{\mathrm{T}}X_2^{\mathrm{T}}X_2\hat{\beta}_2)/p_1$ |
| Extra Model (ii) | $p_2$ | $\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1$ | $(\hat{\beta}^{\mathrm{T}}X^{\mathrm{T}}X\hat{\beta} - \hat{\beta}_1^{\mathrm{T}}X_1^{\mathrm{T}}X_1\hat{\beta}_1)/p_2$ |
| Residual | $n - p_1 - p_2 - 1$ | $(Y - X\hat{\beta})^{\mathrm{T}}(Y - X\hat{\beta})$ | RSS$/(n - p_1 - p_2 - 1)$ |
| Total | $n-1$ | $Y^{\mathrm{T}}Y - n\bar{y}^2$ | |
Note that for these adjusted sums of squares, in general the extra sums of squares for models (i) and (ii) do not sum to the regression sum of squares. This will only be the case if the columns of $X_1$ and $X_2$ are mutually orthogonal, i.e. $X_1^{\mathrm{T}}X_2 = 0$.
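The extra sum of squares identity derived in part (b) can be confirmed numerically. A hedged sketch (Python/numpy used here only as a calculator; the simulated design and responses are arbitrary):

```python
import numpy as np

# Check RSS(i) - RSS(ii) = beta_hat^T X^T X beta_hat
#                          - beta1_hat^T X1^T X1 beta1_hat.
rng = np.random.default_rng(7)
n = 30
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # model (i) terms
X2 = rng.normal(size=(n, 2))                             # extra terms in (ii)
X = np.hstack([X1, X2])
Y = rng.normal(size=n)

b1 = np.linalg.solve(X1.T @ X1, X1.T @ Y)   # fit model (i)
b = np.linalg.solve(X.T @ X, X.T @ Y)       # fit model (ii)
rss_i = (Y - X1 @ b1) @ (Y - X1 @ b1)
rss_ii = (Y - X @ b) @ (Y - X @ b)

lhs = rss_i - rss_ii
rhs = b @ (X.T @ X) @ b - b1 @ (X1.T @ X1) @ b1
assert np.isclose(lhs, rhs)
```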
Chapter 2
Completely randomised
designs
The simplest form of experiment we will consider compares $t$ different unstructured treatments. By unstructured, we mean the treatments form a discrete collection, not related through the settings of other experimental features (compare with factorial experiments in Chapter 4). We also make the assumption that there are no restrictions in the randomisation of treatments to experimental units (compare with Chapter 3 on blocking). A design for such an experiment is therefore called a completely randomised design (CRD).
Example 2.1. Pulp experiment (Wu and Hamada, 2009, ch. 2)
In a paper pulping mill, an experiment was run to examine differences between the reflectance (brightness; the ratio of the amount of light leaving a target to the amount of light striking the target) of sheets of pulp made by $t = 4$ operators. The data are given in Table 2.1 below.
pulp <- data.frame(operator = rep(factor(1:4), 5),
repetition = rep(1:5, rep(4, 5)),
reflectance = c(59.8, 59.8, 60.7, 61.0, 60.0, 60.2, 60.7, 60.8,
60.8, 60.4, 60.5, 60.6, 60.8, 59.9, 60.9, 60.5, 59.8, 60.0, 60.3, 60.5)
)
knitr::kable(
tidyr::pivot_wider(pulp, names_from = operator, values_from = reflectance)[, -1],
col.names = paste("Operator", 1:4),
caption = "Pulp experiment: reflectance values (unitless) from four different operators."
)
The experiment has one factor (operator) with four levels (sometimes called a
one-way layout). The CRD employed has equal replication of each treatment
(operator).
Table 2.1: Pulp experiment: reflectance values (unitless) from four different
operators.
Operator 1 Operator 2 Operator 3 Operator 4
59.8 59.8 60.7 61.0
60.0 60.2 60.7 60.8
60.8 60.4 60.5 60.6
60.8 59.9 60.9 60.5
59.8 60.0 60.3 60.5
We can informally compare the responses from these four treatments graphically.
boxplot(reflectance ~ operator, data = pulp)
Figure 2.1: Pulp experiment: distributions of reflectance from the four operators.
Figure 2.1 shows that, relative to the variation, there may be a difference in
the mean response between treatments 1 and 2, and 3 and 4. In this chapter,
we will see how to make this comparison formally using linear models, and to
assess how the choice of design impacts our results.
Throughout this chapter we will assume the $i$th treatment is applied to $n_i$ experimental units, with the total number of runs in the experiment $n = \sum_{i=1}^{t} n_i$.
2.1 A unit-treatment linear model
An appropriate, and common, model to describe data from such experiments when the response is continuous is given by
$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}\,, \qquad i = 1, \ldots, t;\ j = 1, \ldots, n_i\,, \tag{2.1}$$
where $y_{ij}$ is the response from the $j$th application of treatment $i$, $\mu$ is a constant parameter, $\tau_i$ is the effect of the $i$th treatment, and $\varepsilon_{ij}$ is the random individual effect from each experimental unit with $E(\varepsilon_{ij}) = 0$ and $\operatorname{Var}(\varepsilon_{ij}) = \sigma^2$. All random errors are assumed independent, and here we also assume $\varepsilon_{ij} \sim N(0, \sigma^2)$.
Model (2.1) assumes that each treatment can be randomly allocated to one of the $n$ experimental units, and that the response observed is independent of the allocation of all the other treatments (the stable unit treatment value assumption, or SUTVA).
Why is this model appropriate and commonly used? The expected response from the application of the $i$th treatment is
$$E(y_{ij}) = \mu + \tau_i\,.$$
The parameter $\mu$ can be thought of as representing the impact of many different features particular to this experiment but common to all units, and $\tau_i$ is the deviation due to applying treatment $i$. For two different hypothetical experiments, A and B, the expected response from treatment $i$ may be different due to a different overall mean. From experiment A:
$$E(y_{ij}) = \mu_{\mathrm{A}} + \tau_i\,.$$
From experiment B:
$$E(y_{ij}) = \mu_{\mathrm{B}} + \tau_i\,.$$
But the difference between treatments $i$ and $k$ ($i, k = 1, \ldots, t$),
$$E(y_{ij}) - E(y_{kj}) = \mu + \tau_i - \mu - \tau_k = \tau_i - \tau_k\,,$$
is constant across different experiments. This concept of comparison underpins most design of experiments, and will be applied throughout this module.
2.2 The partitioned linear model
In matrix form, we can write model (2.1) as
$$Y = X_1\mu + X_2\tau + \varepsilon\,,$$
where $X_1 = 1_n$, the $n$-vector with every entry equal to one,
$$X_2 = \bigoplus_{i=1}^{t} 1_{n_i} = \begin{bmatrix} 1_{n_1} & 0_{n_1} & \cdots & 0_{n_1}\\ 0_{n_2} & 1_{n_2} & \cdots & 0_{n_2}\\ \vdots & & \ddots & \vdots\\ 0_{n_t} & 0_{n_t} & \cdots & 1_{n_t} \end{bmatrix}\,,$$
with $\bigoplus$ denoting “direct sum”, $0_{n_i}$ the $n_i$-vector with every entry equal to zero, $\tau = [\tau_1, \ldots, \tau_t]^{\mathrm{T}}$ and $Y = [y_{11}, \ldots, y_{tn_t}]^{\mathrm{T}}$.
Why are we partitioning the model? Going back to our discussion of the roles of $\mu$ and $\tau_i$, it is clear that we are not interested in estimating $\mu$, which represents an experiment-specific contribution to the expected mean. Our only interest is in estimating the (differences between the) $\tau_i$. Hence, we can treat $\mu$ as a nuisance parameter.
If we define $X = [X_1 \,|\, X_2]$ and $\theta^{\mathrm{T}} = [\mu \,|\, \tau^{\mathrm{T}}]$, we can write the usual least squares equations
$$X^{\mathrm{T}}X\hat{\theta} = X^{\mathrm{T}}Y \tag{2.2}$$
as a system of two matrix equations:
$$X_1^{\mathrm{T}}X_1\hat{\mu} + X_1^{\mathrm{T}}X_2\hat{\tau} = X_1^{\mathrm{T}}Y\,,$$
$$X_2^{\mathrm{T}}X_1\hat{\mu} + X_2^{\mathrm{T}}X_2\hat{\tau} = X_2^{\mathrm{T}}Y\,.$$
Assuming $(X_1^{\mathrm{T}}X_1)^{-1}$ exists, which it does in this case, we can pre-multiply the first of these equations by $X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}$ and subtract it from the second equation to obtain
$$X_2^{\mathrm{T}}\left[I_n - X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}\right]X_1\hat{\mu} + X_2^{\mathrm{T}}\left[I_n - X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}\right]X_2\hat{\tau} = X_2^{\mathrm{T}}\left[I_n - X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}\right]Y\,.$$
Writing $H_1 = X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}$, we obtain
$$X_2^{\mathrm{T}}\left[I_n - H_1\right]X_1\hat{\mu} + X_2^{\mathrm{T}}\left[I_n - H_1\right]X_2\hat{\tau} = X_2^{\mathrm{T}}\left[I_n - H_1\right]Y\,. \tag{2.3}$$
The matrix $H_1$ is a “hat” matrix for a linear model containing only the term $\mu$, and hence $H_1X_1 = X_1$ (see MATH2010 or STAT6123). Hence the first term in (2.3) is zero, and we obtain the reduced normal equations for $\tau$:
$$X_2^{\mathrm{T}}\left[I_n - H_1\right]X_2\hat{\tau} = X_2^{\mathrm{T}}\left[I_n - H_1\right]Y\,. \tag{2.4}$$
Note that the solutions from (2.4) are no different from the solutions for $\hat{\tau}$ that would be obtained from solving (2.2); equation (2.4) is simply a re-expression, where we have eliminated the nuisance parameter $\mu$. This fact means that we rarely need to solve (2.4) explicitly.
Recalling that for a hat matrix, $I_n - H_1$ is idempotent and symmetric (see MATH2010 or MATH6174), if we define
$$X_{2|1} = (I_n - H_1)X_2\,,$$
then we can rewrite equation (2.4) as
$$X_{2|1}^{\mathrm{T}}X_{2|1}\hat{\tau} = X_{2|1}^{\mathrm{T}}Y\,, \tag{2.5}$$
which are the normal equations for a linear model with expectation $E(Y) = X_{2|1}\tau$.
2.3 Reduced normal equations for the CRD
For the CRD discussed in this chapter, $X_1^{\mathrm{T}}X_1 = n$, the total number of runs in the experiment. (In later chapters we will see examples where $X_1$ has $q > 1$ columns, and hence $X_1^{\mathrm{T}}X_1$ is a $q \times q$ matrix.) Hence $(X_1^{\mathrm{T}}X_1)^{-1} = 1/n$ and $H_1 = \frac{1}{n}J_n$, with $J_n$ the $n \times n$ matrix with all entries equal to 1.
The adjusted model matrix then has the form
$$\begin{aligned}
X_{2|1} &= \left(I_n - \tfrac{1}{n}J_n\right)X_2\\
&= X_2 - \tfrac{1}{n}J_nX_2\\
&= X_2 - \tfrac{1}{n}\left[n_11_n \,|\, \cdots \,|\, n_t1_n\right]\,.
\end{aligned} \tag{2.6}$$
30 CHAPTER 2. COMPLETELY RANDOMISED DESIGNS
That is, every column of $X_2$ has been adjusted by the subtraction of the column
mean from each entry2. Also notice that each row of $X_{2|1}$ has a row-sum equal
to zero $(= 1 - \sum_{i=1}^t n_i/n)$. Hence, $X_{2|1}$ is not of full column rank, and so the
reduced normal equations do not have a unique solution3.

Although (2.5), and hence (2.2), have no unique solution4, it can be shown that
both $X_{2|1}\hat{\tau}$ and $X\hat{\beta}$ are uniquely defined. Hence fitted values $\hat{y} = X\hat{\beta}$ and the
residual sum of squares

$$RSS = (y - X\hat{\beta})^{\mathrm{T}}(y - X\hat{\beta})$$

are both uniquely defined for any solution to (2.2). That is, every solution to the
normal equations leads to the same fitted values and residual sum of squares.
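These facts can be checked numerically. The sketch below is my own illustration (not from the notes): it builds $X_{2|1}$ for a toy CRD with two treatments and two runs each, and confirms that its rows sum to zero and that it is rank deficient.

```r
# Toy CRD: t = 2 treatments, n_1 = n_2 = 2 runs each (illustrative values)
X2 <- cbind(c(1, 1, 0, 0), c(0, 0, 1, 1))   # treatment indicator columns
n <- nrow(X2)
H1 <- matrix(1, n, n) / n                    # hat matrix for the mean-only model
X21 <- (diag(n) - H1) %*% X2                 # column-centred matrix, as in (2.6)

rowSums(X21)     # every row sums to zero
qr(X21)$rank     # rank 1 < 2 columns, so tau-hat is not unique
```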
In MATH2010 and STAT6123 we fitted models with categorical variables by
defining a set of dummy variables and estimating a reduced model. Here, we will
take a slightly different approach and study which combinations of parameters
from (2.1) are estimable, and in particular which linear combinations of the
treatment parameters we can estimate.
Let's study equation (2.5) in more detail. We have

$$X_{2|1}^{\mathrm{T}}X_{2|1} = X_2^{\mathrm{T}}\left(I_n - \frac{1}{n}J_n\right)X_2 = X_2^{\mathrm{T}}X_2 - \frac{1}{n}X_2^{\mathrm{T}}J_nX_2 = \mathrm{diag}(n_i) - \frac{1}{n}nn^{\mathrm{T}}\,,$$

where $n^{\mathrm{T}} = (n_1,\ldots,n_t)$. Hence, the reduced normal equations become

$$\left[\mathrm{diag}(n_i) - \frac{1}{n}nn^{\mathrm{T}}\right]\hat{\tau} = X_2^{\mathrm{T}}y - \frac{1}{n}n1_n^{\mathrm{T}}y \qquad (2.7)$$
$$= X_2^{\mathrm{T}}y - n\bar{y}_{..}\,, \qquad (2.8)$$

where $\bar{y}_{..} = \frac{1}{n}\sum_{i=1}^t\sum_{j=1}^{n_i}y_{ij}$, i.e. the overall average of the observations from
the experiment.
From (2.8) we obtain a system of equations, each having the form
2Often called “column centred”
3If we recalled the material on “dummy” variables from MATH2010 or STAT6123, we
would already have realised this.
4That is, for any two solutions $\tilde{\beta}_1$ and $\tilde{\beta}_2$, $X\tilde{\beta}_1 = X\tilde{\beta}_2$.
$$\hat{\tau}_i - \hat{\bar{\tau}} = \bar{y}_{i.} - \bar{y}_{..}\,, \qquad (2.9)$$

where $\hat{\bar{\tau}} = \frac{1}{n}\sum_{i=1}^t n_i\hat{\tau}_i$ and $\bar{y}_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i}y_{ij}$ $(i = 1,\ldots,t)$.

These equations are not independent; when multiplied by the $n_i$, they sum to
zero due to the linear dependency between the columns of $X_{2|1}$. Hence, there is
no unique solution for $\hat{\tau}$ from equation (2.8). However, we can estimate certain
linear combinations of the $\tau_i$, called contrasts.
2.4 Contrasts
Definition 2.1. A treatment contrast is a linear combination $c^{\mathrm{T}}\tau$ with coefficient vector $c^{\mathrm{T}} = (c_1,\ldots,c_t)$ such that $c^{\mathrm{T}}1_t = 0$; that is, $\sum_{i=1}^t c_i = 0$.

For example, assume we have $t = 3$ treatments; then the following vectors all
define contrasts:

1. $c^{\mathrm{T}} = (1, -1, 0)$,
2. $c^{\mathrm{T}} = (1, 0, -1)$,
3. $c^{\mathrm{T}} = (0, 1, -1)$.

In fact, they define all $\binom{3}{2} = 3$ pairwise comparisons between treatments. The
following are also contrasts:

4. $c^{\mathrm{T}} = (2, -1, -1)$,
5. $c^{\mathrm{T}} = (0.5, -1, 0.5)$,

each comparing the sum, or average, of expected responses from two treatments
to the expected response from the remaining treatment.

The following are not contrasts, as $c^{\mathrm{T}}1_t \ne 0$:

6. $c^{\mathrm{T}} = (2, -1, 0)$,
7. $c^{\mathrm{T}} = (1, 0, 0)$,

with the final example once again demonstrating that we cannot estimate the
individual $\tau_i$.
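The defining property is easy to check in R; `is_contrast` below is a small hypothetical helper, not part of the notes.

```r
# A coefficient vector defines a contrast iff its entries sum to zero
is_contrast <- function(cvec) isTRUE(all.equal(sum(cvec), 0))

is_contrast(c(1, -1, 0))    # pairwise comparison: TRUE
is_contrast(c(2, -1, -1))   # treatment 1 vs the other two: TRUE
is_contrast(c(1, 0, 0))     # an individual effect: FALSE, not estimable
```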
2.5 Treatment contrast estimators in the CRD
We estimate a treatment contrast $c^{\mathrm{T}}\tau$ in the CRD via linear combinations of
equations (2.9):

$$\sum_{i=1}^t c_i\hat{\tau}_i - \hat{\bar{\tau}}\sum_{i=1}^t c_i = \sum_{i=1}^t c_i\bar{y}_{i.} - \bar{y}_{..}\sum_{i=1}^t c_i$$
$$\Rightarrow\quad \sum_{i=1}^t c_i\hat{\tau}_i = \sum_{i=1}^t c_i\bar{y}_{i.}\,,$$

as $\sum_{i=1}^t c_i\hat{\bar{\tau}} = \sum_{i=1}^t c_i\bar{y}_{..} = 0$, because $\sum_{i=1}^t c_i = 0$. Hence, the unique estimator of
the contrast $c^{\mathrm{T}}\tau$ has the form

$$\widehat{c^{\mathrm{T}}\tau} = \sum_{i=1}^t c_i\bar{y}_{i.}\,.$$

That is, we estimate the contrast in the treatment effects by the corresponding
contrast in the treatment means.
The variance of this estimator is straightforward to obtain:

$$\mathrm{var}\left(\widehat{c^{\mathrm{T}}\tau}\right) = \sum_{i=1}^t c_i^2\,\mathrm{var}(\bar{y}_{i.}) = \sigma^2\sum_{i=1}^t c_i^2/n_i\,,$$

as, under our model assumptions, each $\bar{y}_{i.}$ is an average of $n_i$ independent observations with variance $\sigma^2$. Similarly, from model (2.1) we can derive the
distribution of $\widehat{c^{\mathrm{T}}\tau}$ as

$$\widehat{c^{\mathrm{T}}\tau} \sim N\left(c^{\mathrm{T}}\tau,\ \sigma^2\sum_{i=1}^t c_i^2/n_i\right)\,.$$

Confidence intervals and hypothesis tests for $c^{\mathrm{T}}\tau$ can be constructed/conducted
using this distribution, e.g.
• a $100(1-\alpha)\%$ confidence interval:

$$c^{\mathrm{T}}\tau \in \sum_{i=1}^t c_i\bar{y}_{i.} \pm t_{n-t,1-\frac{\alpha}{2}}\sqrt{s^2\sum_{i=1}^t c_i^2/n_i}\,,$$

where $t_{n-t,1-\frac{\alpha}{2}}$ is the $1-\frac{\alpha}{2}$ quantile of a $t$-distribution with $n-t$ degrees of
freedom and
Table 2.2: Pulp experiment: reflectance values (unitless) from four different
operators.
Operator 1 Operator 2 Operator 3 Operator 4
59.8 59.8 60.7 61.0
60.0 60.2 60.7 60.8
60.8 60.4 60.5 60.6
60.8 59.9 60.9 60.5
59.8 60.0 60.3 60.5
$$s^2 = \frac{1}{n-t}\sum_{i=1}^t\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_{i.})^2 \qquad (2.10)$$

is the estimate of $\sigma^2$.

• the hypothesis $H_0: c^{\mathrm{T}}\tau = 0$ against the two-sided alternative $H_1: c^{\mathrm{T}}\tau \ne 0$
is rejected at significance level $\alpha$ if

$$\frac{\left|\sum_{i=1}^t c_i\bar{y}_{i.}\right|}{\sqrt{s^2\sum_{i=1}^t c_i^2/n_i}} > t_{n-t,1-\frac{\alpha}{2}}\,.$$
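These formulas can be verified by hand in R using the pulp reflectance values from Table 2.2. The sketch below (assuming the tabulated values) reproduces the estimate, standard error and p-value for the operator 1 vs operator 2 comparison reported later in the emmeans output.

```r
# Reflectance values from Table 2.2, one vector per operator
y <- list(c(59.8, 60.0, 60.8, 60.8, 59.8),
          c(59.8, 60.2, 60.4, 59.9, 60.0),
          c(60.7, 60.7, 60.5, 60.9, 60.3),
          c(61.0, 60.8, 60.6, 60.5, 60.5))
ni <- sapply(y, length)                   # replications n_i
ybar <- sapply(y, mean)                   # treatment means
n <- sum(ni); t_trt <- length(y)
s2 <- sum(sapply(y, function(x) sum((x - mean(x))^2))) / (n - t_trt)  # (2.10)

cvec <- c(1, -1, 0, 0)                    # contrast: operator 1 vs operator 2
est <- sum(cvec * ybar)                   # contrast estimator: 0.18
se <- sqrt(s2 * sum(cvec^2 / ni))         # standard error: 0.206
ci <- est + c(-1, 1) * qt(0.975, n - t_trt) * se   # 95% confidence interval
pval <- 2 * (1 - pt(abs(est / se), n - t_trt))     # two-sided p-value
```

The interval contains zero and the p-value is about 0.40, matching the pairwise.t.test and emmeans results in the next section.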
2.6 Analysing CRDs in R
Let’s return to Example 2.1.
knitr::kable(
tidyr::pivot_wider(pulp, names_from = operator, values_from = reflectance)[, -1],
col.names = paste("Operator", 1:4),
caption = "Pulp experiment: reflectance values (unitless) from four different operators."
)
Clearly, we could directly calculate, and then compare, mean responses for each
operator. However, there are (at least) two other ways we can proceed which
use the fact we are fitting a linear model. These will be useful when we consider
more complex models.
1. Using pairwise.t.test.
with(pulp,
pairwise.t.test(reflectance, operator, p.adjust.method = 'none'))
##
## Pairwise comparisons using t tests with pooled SD
##
## data: reflectance and operator
##
## 1 2 3
## 2 0.396 - -
## 3 0.084 0.015 -
## 4 0.049 0.008 0.775
##
## P value adjustment method: none
This function performs hypothesis tests for all pairwise treatment comparisons. Here we can see that operators 1 and 4, 2 and 3, and 2 and 4
have statistically significant differences at the 5% level.
2. Using lm and the emmeans package.
pulp.lm <- lm(reflectance ~ operator, data = pulp)
anova(pulp.lm)
## Analysis of Variance Table
##
## Response: reflectance
## Df Sum Sq Mean Sq F value Pr(>F)
## operator 3 1.34 0.447 4.2 0.023 *
## Residuals 16 1.70 0.106
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pulp.emm <- emmeans::emmeans(pulp.lm, ~ operator)
pairs(pulp.emm, adjust = 'none')
## contrast estimate SE df t.ratio p.value
## 1 - 2 0.18 0.206 16 0.873 0.3960
## 1 - 3 -0.38 0.206 16 -1.843 0.0840
## 1 - 4 -0.44 0.206 16 -2.134 0.0490
## 2 - 3 -0.56 0.206 16 -2.716 0.0150
## 2 - 4 -0.62 0.206 16 -3.007 0.0080
## 3 - 4 -0.06 0.206 16 -0.291 0.7750
Here, we have first fitted the linear model object. The lm function, by de-
fault, will have set up dummy variables with the first treatment (operator)
as a baseline (see MATH2010 or STAT6123). We then take the intermediate step of calculating the ANOVA table for this experiment, and use
an F-test to compare the model accounting for operator differences to the
null model; there are differences between operators at the 5% significance
level.
The choice of dummy variables in the linear model is unimportant; any set
could be used5, as in the next line we use the emmeans function (from the
package of the same name) to specify that we are interested in estimating
contrasts in the factor operator (which specifies our treatments in this
experiment). Finally, the pairs command performs hypothesis tests for
all pairwise comparisons between the four treatments. The results are the
same as those obtained from using pairwise.t.test.
Our preferred approach is using method 2 (lm and emmeans), for four main
reasons:
a. The function contrast in the emmeans package can be used to estimate
arbitrary treatment contrasts (see help("contrast-methods")).
# same as `pairs` above
emmeans::contrast(pulp.emm, "pairwise", adjust = "none")
## contrast estimate SE df t.ratio p.value
## 1 - 2 0.18 0.206 16 0.873 0.3960
## 1 - 3 -0.38 0.206 16 -1.843 0.0840
## 1 - 4 -0.44 0.206 16 -2.134 0.0490
## 2 - 3 -0.56 0.206 16 -2.716 0.0150
## 2 - 4 -0.62 0.206 16 -3.007 0.0080
## 3 - 4 -0.06 0.206 16 -0.291 0.7750
# estimating the single contrast c = (1, -0.5, -0.5, 0),
# comparing operator 1 with the average of operators 2 and 3
contrast1v23.emmc <- function(levs)
data.frame('t1 v avg t2 t3' = c(1, -.5, -.5, 0))
emmeans::contrast(pulp.emm, 'contrast1v23')
## contrast estimate SE df t.ratio p.value
## t1.v.avg.t2.t3 -0.1 0.178 16 -0.560 0.5830
b. It more easily generalises to the more complicated models we will see in
Chapter 3.
c. It explicitly acknowledges that we have fitted a linear model, and so en-
courages us to check the model assumptions (see Exercise 3).
d. It is straightforward to apply adjustments for multiple comparisons.
2.7 Multiple comparisons
When we perform hypothesis testing, we choose the critical region (i.e. the rule
that decides if we reject 0) to control the probability of a type I error; that is,
we control the probability of incorrectly rejecting 0. If we need to test multiple
5Recall that although $\mu$ and $\tau_i$ are not uniquely estimable, fitted values $\hat{y}_{ij} = \hat{\mu} + \hat{\tau}_i$ are,
and hence it does not matter which dummy variables we use in lm.
hypotheses, e.g. to test all pairwise differences, we need to consider the overall
probability of incorrectly rejecting one or more null hypotheses. This is called
the experiment-wise or family-wise error rate.
For Example 2.1, there are $\binom{4}{2} = 6$ pairwise comparisons. Under the assumption
that all tests are independent6, and assuming each individual test has type I error
rate 0.05, the experiment-wise type I error rate is given by:
alpha <- 0.05
1 - (1 - alpha)^6
## [1] 0.265
An experiment-wise error rate of 0.265 is substantially greater than 0.05. Hence,
we would expect to make many more type I errors than may be desirable. xkcd
has a fun example:
alpha <- 0.05
1 - (1 - alpha)^20
## [1] 0.642
Therefore it is usually desirable to maintain some control of the experiment-wise
type I error rate. We will consider two methods.
1. The Bonferroni method. An upper bound on the experiment-wise type
I error rate for testing $m$ hypotheses can be shown to be

$$P(\text{wrongly reject at least one of } H_0^1,\ldots,H_0^m) = P\left(\bigcup_{j=1}^m\{\text{wrongly reject } H_0^j\}\right) \le \sum_{j=1}^m \underbrace{P(\text{wrongly reject } H_0^j)}_{\alpha} = m\alpha\,.$$

Hence a conservative7 adjustment for multiple comparisons is to test each
hypothesis at size $\alpha/m$, i.e. for the CRD compare to the quantile $t_{n-t,1-\frac{\alpha}{2m}}$
(or multiply each individual p-value by $m$).

For Example 2.1, we can test all pairwise comparisons, each at size $\alpha/m$,
using the adjust argument in pairs.
pairs(pulp.emm, adjust = 'bonferroni')
## contrast estimate SE df t.ratio p.value
## 1 - 2 0.18 0.206 16 0.873 1.0000
6They aren’t, but it simplifies the maths!
7So the experiment-wise type I error will actually be less than the prescribed $\alpha$.
## 1 - 3 -0.38 0.206 16 -1.843 0.5030
## 1 - 4 -0.44 0.206 16 -2.134 0.2920
## 2 - 3 -0.56 0.206 16 -2.716 0.0920
## 2 - 4 -0.62 0.206 16 -3.007 0.0500
## 3 - 4 -0.06 0.206 16 -0.291 1.0000
##
## P value adjustment: bonferroni method for 6 tests
Now, only one comparison is significant at an experiment-wise type I error
rate of = 0.05 (operators 2 and 4).
2. Tukey's method. An alternative approach that gives an exact
experiment-wise error rate of $100\alpha\%$ compares the $t$ statistic to a critical
value from the studentised range distribution8, given by $\frac{1}{\sqrt{2}}q_{t,n-t,1-\alpha}$,
with $q_{t,n-t,1-\alpha}$ the $1-\alpha$ quantile from the studentised range distribution
(available in R as qtukey).
For Example 2.1:
pairs(pulp.emm)
## contrast estimate SE df t.ratio p.value
## 1 - 2 0.18 0.206 16 0.873 0.8190
## 1 - 3 -0.38 0.206 16 -1.843 0.2900
## 1 - 4 -0.44 0.206 16 -2.134 0.1840
## 2 - 3 -0.56 0.206 16 -2.716 0.0660
## 2 - 4 -0.62 0.206 16 -3.007 0.0380
## 3 - 4 -0.06 0.206 16 -0.291 0.9910
##
## P value adjustment: tukey method for comparing a family of 4 estimates
The default adjustment in the pairs function is the Tukey method. Comparing
the p-values for each comparison using unadjusted t-tests, the Bonferroni and
Tukey methods:
pairs.u <- pairs(pulp.emm, adjust = 'none')
pairs.b <- pairs(pulp.emm, adjust = 'bonferroni')
pairs.t <- pairs(pulp.emm)
data.frame(transform(pairs.b)[, 1:5], Bonf.p.value = transform(pairs.b)[, 6], Tukey.p.value = transform(pairs.t)[, 6], unadjust.p.value = transform(pairs.u)[, 6])
## contrast estimate SE df t.ratio Bonf.p.value Tukey.p.value
## 1 1 - 2 0.18 0.206 16 0.873 1.0000 0.8185
## 2 1 - 3 -0.38 0.206 16 -1.843 0.5034 0.2903
## 3 1 - 4 -0.44 0.206 16 -2.134 0.2918 0.1845
8Given two independent samples $Z_1,\ldots,Z_t$ and $W_1,\ldots,W_d$ from the same distribution, the
studentised range distribution is the distribution of $R/S$, where $R = \max_i Z_i - \min_i Z_i$ is the range
of the first sample, and $S = \sqrt{\frac{1}{d-1}\sum_{i=1}^d(W_i - \bar{W})^2}$ is the sample standard deviation of the
second sample.
## 4 2 - 3 -0.56 0.206 16 -2.716 0.0915 0.0658
## 5 2 - 4 -0.62 0.206 16 -3.007 0.0501 0.0377
## 6 3 - 4 -0.06 0.206 16 -0.291 1.0000 0.9911
## unadjust.p.value
## 1 0.39551
## 2 0.08389
## 3 0.04864
## 4 0.01525
## 5 0.00835
## 6 0.77476
Although the decision on which hypotheses to reject (comparison 2 - 4) is the
same here for both methods, the p-values from the Bonferroni method are all
larger, reflecting its more conservative nature.
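The adjusted p-values in these tables can be reproduced by hand. The sketch below is my own check, using the t.ratio for the 2 - 4 comparison from the output above, and applies the Bonferroni and Tukey adjustments directly.

```r
# Hand adjustment of the p-value for the 2 - 4 comparison (t.ratio = -3.007)
t.ratio <- -3.007
df <- 16      # residual degrees of freedom, n - t
m <- 6        # number of pairwise comparisons
t_trt <- 4    # number of treatments

p.unadj <- 2 * (1 - pt(abs(t.ratio), df))                 # approx 0.008
p.bonf <- min(1, m * p.unadj)                             # approx 0.050
p.tukey <- 1 - ptukey(abs(t.ratio) * sqrt(2), t_trt, df)  # approx 0.038
```

The three values match the unadjusted, Bonferroni and Tukey columns of the comparison table, and illustrate that the Bonferroni adjustment is the most conservative.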
2.8 Impact of design choices on estimation
Recall from Section 2.5 that the width of a confidence interval for a contrast
$c^{\mathrm{T}}\tau$ is given by $2t_{n-t,1-\frac{\alpha}{2}}\sqrt{s^2\sum_{i=1}^t c_i^2/n_i}$. The expectation of the square of this
quantity is given by

$$4t^2_{n-t,1-\frac{\alpha}{2}}\,\sigma^2\sum_{i=1}^t c_i^2/n_i\,,$$

as $E(s^2) = \sigma^2$. It is intuitive that a good design should have small values of the
square root of this quantity (divided by $2\sigma$),

$$t_{n-t,1-\frac{\alpha}{2}}\sqrt{\sum_{i=1}^t c_i^2/n_i}\,,$$

which can be achieved either by increasing $n$, and hence reducing the size of the
$t$-quantile, or through choice of the $n_i$ for a fixed $n$, i.e. through choice of the replication
of each treatment.
2.8.1 Optimal treatment allocation
It is quite common that although the total number, $n$, of runs in the experiment
may be fixed, the numbers $n_1, n_2, \ldots, n_t$ applied to the different treatments are
under the experimenter's control. Choosing $n_1, n_2$ subject to $n_1 + n_2 = n$ was
the first optimal design problem we encountered in Chapter 1.

Assume interest lies in estimating the set of contrasts $c_1^{\mathrm{T}}\tau, \ldots, c_m^{\mathrm{T}}\tau$, with $c_j^{\mathrm{T}} =
(c_{j1}, \ldots, c_{jt})$. One useful measure of the overall quality of the estimators of these
contrasts is the average variance, given by
$$\frac{\sigma^2}{m}\sum_{j=1}^m\sum_{i=1}^t c_{ji}^2/n_i\,.$$

Intuitively, we can reduce this average variance by allocating larger values of $n_i$ to the treatments with correspondingly larger values of the contrast coefficients $c_{ji}$. Therefore an approach to optimal allocation is to choose $n = (n_1,\ldots,n_t)^{\mathrm{T}}$ so as to

$$\text{minimise } \phi(n) = \sum_{j=1}^m\sum_{i=1}^t c_{ji}^2/n_i \quad\text{subject to}\quad \sum_{i=1}^t n_i = n\,. \qquad (2.11)$$
This is a discrete optimisation problem (the $n_i$ are integers). It is usually easier
to solve the relaxed problem, where we allow continuous $0 \le n_i \le n$, and round
the resulting solution to obtain integers. There is no guarantee that such a
rounded allocation will actually be the optimal integer-valued solution, but it
is usually fairly close.
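As a tiny illustration of the relaxation-and-rounding idea (my own example, not from the notes), for a single pairwise contrast with $t = 2$ and $n = 10$ the best integer allocation can be found by brute force, and it agrees with the continuous solution $n_1 = n_2 = 5$.

```r
# phi(n) = 1/n_1 + 1/n_2 for the contrast c = (1, -1), with n_2 = 10 - n_1
phi <- function(n1, n) 1 / n1 + 1 / (n - n1)
n1 <- 1:9                          # all feasible integer allocations
best <- n1[which.min(phi(n1, 10))]
best                               # equal replication, n_1 = 5
```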
To solve the continuous version of (2.11) we will use the method of Lagrange
multipliers, where we define the function

$$h(n, \lambda) = \phi(n) + \lambda\left(\sum_{i=1}^t n_i - n\right)\,,$$

introducing the new scalar variable $\lambda$, and solve the set of $t+1$ equations:

$$\frac{\partial h}{\partial n_1} = 0\,,\ \ldots\,,\ \frac{\partial h}{\partial n_t} = 0\,,\qquad \frac{\partial h}{\partial \lambda} = 0\,.$$

In this case, we have

$$\frac{\partial h}{\partial n_i} = -\sum_{j=1}^m c_{ji}^2/n_i^2 + \lambda = 0\,,\quad i = 1,\ldots,t \qquad (2.12)$$

and

$$\frac{\partial h}{\partial \lambda} = \sum_{i=1}^t n_i - n = 0\,.$$
This last equation ensures $\sum_{i=1}^t n_i = n$. From the equations described by
(2.12), we get

$$n_i \propto \sqrt{\sum_{j=1}^m c_{ji}^2}\,.$$

We don't need to explicitly solve for $\lambda$ to find the normalising constant for each
$n_i$. As we know $\sum_{i=1}^t n_i = n$, we obtain

$$n_i = n\,\frac{\sqrt{\sum_{j=1}^m c_{ji}^2}}{\sum_{k=1}^t\sqrt{\sum_{j=1}^m c_{jk}^2}}\,. \qquad (2.13)$$
Let’s return to Example 2.1 and calculate the optimal allocations under two
different sets of contrasts. First, we define an R function for calculating (2.13).
opt_ni <- function(C, n) {
CtC <- t(C) %*% C
n * sqrt(diag(CtC)) / sum(sqrt(diag(CtC)))
}
Checking that the function opt_ni matches (2.13) is left as an exercise.
Consider two sets of contrasts:
1. All pairwise comparisons between the four treatments
$$c_1^{\mathrm{T}} = (-1, 1, 0, 0)\,,\quad c_2^{\mathrm{T}} = (-1, 0, 1, 0)\,,\quad c_3^{\mathrm{T}} = (-1, 0, 0, 1)\,,$$
$$c_4^{\mathrm{T}} = (0, -1, 1, 0)\,,\quad c_5^{\mathrm{T}} = (0, -1, 0, 1)\,,\quad c_6^{\mathrm{T}} = (0, 0, -1, 1)\,.$$
Calculating (2.13), we obtain
C <- matrix(
c(
-1, 1, 0, 0,
-1, 0, 1, 0,
-1, 0, 0, 1,
0, -1, 1, 0,
0, -1, 0, 1,
0, 0, -1, 1),
nrow = 6, byrow = T
)
opt_ni(C, 20)
## [1] 5 5 5 5
Hence confirming that equal replication of the treatments is optimal for
minimising the average variance of estimators of the pairwise treatment
differences.
2. If operator 4 is new to the mill, it may be desired to compare their output
to the average output from the other three operators, using a contrast
with coefficients $c^{\mathrm{T}} = (1/3, 1/3, 1/3, -1)$. The allocation to minimise the
variance of the corresponding estimator is given by:
C <- matrix(
c(1/3, 1/3, 1/3, -1),
nrow = 1
)
opt_ni(C, 20)
## [1] 3.33 3.33 3.33 10.00
So the optimal allocation splits 10 units between operators 1-3, and allocates 10 units to operator 4. There is no exact integer rounding possible,
so we will use $n_1 = 4$, $n_2 = n_3 = 3$, $n_4 = 10$ and calculate the efficiency by
comparing the variance of this allocation to that from the equally allocated
design.
crd_var <- function(C, n) {
CtC <- t(C) %*% C
sum(diag(CtC) / n)
}
n_equal <- rep(5, 4)
n_opt <- c(4, 3, 3, 10)
crd_var(C, n_opt) / crd_var(C, n_equal)
## [1] 0.757
So the efficiency of the equally allocated design for estimating this contrast
is 75.69 %.
2.8.2 Overall size of the experiment
We can also consider the complementary question: suppose the proportion of
runs that is to be allocated to each treatment has been fixed in advance, what
size of experiment should be performed to meet the objectives? That is, given
a fixed proportion, $w_i$, of resource to be allocated to the $i$th treatment, so that
$n_i = nw_i$ units will be allocated to that treatment, what value of $n$ should be
chosen?
One way of thinking about this question is to consider the ratio
$$\frac{|c^{\mathrm{T}}\tau|}{\sqrt{\mathrm{var}\left(\widehat{c^{\mathrm{T}}\tau}\right)}} = \frac{|c^{\mathrm{T}}\tau|}{\sqrt{\sigma^2\sum_{i=1}^t c_i^2/n_i}} = \sqrt{n}\,\frac{|c^{\mathrm{T}}\tau|/\sigma}{\sqrt{\sum_{i=1}^t c_i^2/w_i}}\,,$$

which is analogous to the test statistic for $H_0: c^{\mathrm{T}}\tau = 0$. For a given value of
the signal-to-noise ratio $d = |c^{\mathrm{T}}\tau|/\sigma$, we can choose $n$ to result in a specified
value of $T = |c^{\mathrm{T}}\tau|/\sqrt{\mathrm{var}\left(\widehat{c^{\mathrm{T}}\tau}\right)}$:

$$n = \frac{T^2\sum_{i=1}^t c_i^2/w_i}{d^2}\,.$$
Returning to Example 2.1, assume we are testing a single pairwise comparison and
that we require $T = 3$, so that the null hypothesis would be comfortably rejected
at the 5% level (cf. 1.96 for a standard z-test). For equal allocation of the units
to each treatment ($w_1 = \cdots = w_4 = 1/4$) and a variety of different values of the
signal-to-noise ratio $d$, we obtain the following optimal experiment sizes:
opt_n <- function(cv, prop, snr, target) target ^ 2 * c(t(cv) %*% diag( 1 / prop) %*% cv) / snr ^ 2
cv <- c(-1, 1, 0, 0)
w <- rep(1/4, 4)
snr <- c(0.5, 1, 1.5, 2, 2.5, 3)
cbind('Signal-to-noise' = snr, 'n' = opt_n(cv, w, snr, 3))
## Signal-to-noise n
## [1,] 0.5 288.0
## [2,] 1.0 72.0
## [3,] 1.5 32.0
## [4,] 2.0 18.0
## [5,] 2.5 11.5
## [6,] 3.0 8.0
So, for example, to achieve $T = 3$ with a signal-to-noise ratio of $d = 0.5$ requires
$n = 288$ runs. As would be expected, the number of runs required to achieve
this value of $T$ decreases as the signal-to-noise ratio increases. For $d = 3$, only
a very small experiment with $n = 8$ runs is needed.
2.9 Exercises
1. a. For Example 2.1, calculate the mean response for each operator and
show that the treatment differences and results from hypothesis tests
using the results in Section 2.5 are the same as those found in Section
2.6 using pairwise.t.test, and emmeans.
b. Also check the results in Section 2.7 by (i) adjusting individual p-
values (for Bonferroni) and (ii) using the qtukey command.
Solution
As a reminder, the data from the experiment is as follows.
Operator 1 Operator 2 Operator 3 Operator 4
59.8 59.8 60.7 61.0
60.0 60.2 60.7 60.8
60.8 60.4 60.5 60.6
60.8 59.9 60.9 60.5
59.8 60.0 60.3 60.5
The mean response, and variance, from each treatment is given by
operator n_i mean variance
1 5 60.2 0.268
2 5 60.1 0.058
3 5 60.6 0.052
4 5 60.7 0.047
The sample variance is $s^2 = 0.106$, from (2.10). As $\sum_{i=1}^t c_i^2/n_i = 2/5$ for contrast vectors corresponding to pairwise differences, the standard error of each
pairwise difference is given by $\sqrt{2s^2/5} = 0.206$. Hence, we can create a table of
pairwise differences, standard errors and test statistics.
contrast estimate SE df t.ratio unadjust.p.value Bonferroni Tukey
1 - 2 0.18 0.206 16 0.873 0.396 1.000 0.819
1 - 3 -0.38 0.206 16 -1.843 0.084 0.503 0.290
1 - 4 -0.44 0.206 16 -2.134 0.049 0.292 0.184
2 - 3 -0.56 0.206 16 -2.716 0.015 0.092 0.066
2 - 4 -0.62 0.206 16 -3.007 0.008 0.050 0.038
3 - 4 -0.06 0.206 16 -0.291 0.775 1.000 0.991
Unadjusted p-values are obtained from the t-distribution, as twice the tail probabilities (2 * (1 - pt(abs(t.ratio), 16))). For Bonferroni, we simply multiply these p-values by $\binom{4}{2} = 6$, and then take the minimum of this value and
1. For the Tukey method, we use 1 - ptukey(abs(t.ratio) * sqrt(2), 4,
16) (see ?ptukey).
Alternatively, to test each hypothesis at the 5% level, we can compare each
t.ratio to (i) qt(0.975, 16) = 2.12 (unadjusted); (ii) qt(1 - 0.025/6, 16)
= 3.008 (Bonferroni); or (iii) qtukey(0.95, 4, 16) / sqrt(2) = 2.861.
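The three critical values quoted above can be computed directly; a short sketch:

```r
# Critical values for testing each pairwise comparison at the 5% level
qt(0.975, 16)                  # unadjusted: 2.120
qt(1 - 0.025 / 6, 16)          # Bonferroni, 6 comparisons: 3.008
qtukey(0.95, 4, 16) / sqrt(2)  # Tukey: 2.861
```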
2. (Adapted from Wu and Hamada, 2009) The bioactivity of four different
drugs A, B, C and D for treating a particular illness was compared in a
study and the following ANOVA table was given for the data:
Source Degrees of freedom Sums of squares Mean square
Treatment 3 64.42 21.47
Residual 26 62.12 2.39
Total 29 126.54
i. What considerations should be made when assigning drugs to patients, and why?
ii. Use an F-test to test, at the 0.01 level, the null hypothesis that the
four drugs have the same bioactivity.
iii. The average response from each treatment is as follows: $\bar{y}_{A.} = 66.10$
($n_A = 7$ patients), $\bar{y}_{B.} = 65.75$ ($n_B = 8$), $\bar{y}_{C.} = 62.63$ ($n_C = 9$),
and $\bar{y}_{D.} = 63.85$ ($n_D = 6$). Conduct hypothesis tests for all pairwise comparisons using the Bonferroni and Tukey methods for an
experiment-wise error rate of 0.05.
iv. In fact, A and B are brand-name drugs and C and D are generic
drugs. Test the null hypothesis at the 5% level that brand-name and
generic drugs have the same bioactivity.
Solution
i. Each patient should be randomly allocated to one of the drugs. This is
to protect against possible bias from lurking variables, e.g. demographic
variables or subjective bias from the study administrator (blinding the
study can also help to protect against this).
ii. Test statistic = (Treatment mean square)/(Residual mean square) =
21.47/2.39 = 8.98. Under $H_0$: no difference in bioactivity between the
drugs, the test statistic follows an $F_{3,26}$ distribution, which has a 1%
critical value of qf(0.99, 3, 26) = 4.637. Hence, we can reject $H_0$.
iii. For each difference, the test statistic has the form

$$\frac{|\bar{y}_{i.} - \bar{y}_{j.}|}{s\sqrt{\frac{1}{n_i} + \frac{1}{n_j}}}\,,$$

for $i, j = A, B, C, D$; $i \ne j$. The treatment means and repetitions are
given in the question (note that not all the $n_i$ are equal). From the ANOVA
table, we get $s^2 = 62.12/26 = 2.389$. The following table summarises the
differences between drugs:
                  A - B   A - C   A - D   B - C   B - D   C - D
Abs. difference    0.35    3.47    2.25    3.12    1.90    1.22
Test statistic     0.44    4.45    2.62    4.15    2.28    1.50
The Bonferroni critical value is $t_{26,1-0.05/12} = 3.507$. The Tukey critical
value is $q_{4,26,0.95}/\sqrt{2} = 2.743$ (available in R as qtukey(0.95, 4, 26) /
sqrt(2)). Hence under both methods, the bioactivities of drugs A and C, and
B and C, are significantly different.
iv. A suitable contrast has $c^{\mathrm{T}} = (0.5, 0.5, -0.5, -0.5)$, with $c^{\mathrm{T}}\tau = (\tau_A + \tau_B)/2 -
(\tau_C + \tau_D)/2$ (the difference in average treatment effects).

An estimate for this contrast is given by $(\bar{y}_{A.} + \bar{y}_{B.})/2 - (\bar{y}_{C.} + \bar{y}_{D.})/2$,
with variance

$$\mathrm{Var}\left(\frac{1}{2}(\bar{y}_{A.} + \bar{y}_{B.}) - \frac{1}{2}(\bar{y}_{C.} + \bar{y}_{D.})\right) = \frac{\sigma^2}{4}\left(\frac{1}{n_A} + \frac{1}{n_B} + \frac{1}{n_C} + \frac{1}{n_D}\right)\,.$$
Hence, a test statistic for $H_0: \frac{1}{2}(\tau_A + \tau_B) - \frac{1}{2}(\tau_C + \tau_D) = 0$ is given by

$$\frac{\frac{1}{2}(\bar{y}_{A.} + \bar{y}_{B.}) - \frac{1}{2}(\bar{y}_{C.} + \bar{y}_{D.})}{\sqrt{\frac{s^2}{4}\left(\frac{1}{n_A} + \frac{1}{n_B} + \frac{1}{n_C} + \frac{1}{n_D}\right)}} = \frac{2.685}{\frac{\sqrt{2.389}}{2}\sqrt{\frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \frac{1}{6}}} = 4.70\,.$$

The critical value is $t_{26,1-0.05/2} = 2.056$. Hence, we can reject $H_0$ and
conclude there is a difference between brand-name and generic drugs.
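The calculations in parts (iii) and (iv) can be checked numerically; the sketch below uses only the summary statistics given in the question.

```r
# Summary statistics from the question
ybar <- c(A = 66.10, B = 65.75, C = 62.63, D = 63.85)  # treatment means
ni <- c(A = 7, B = 8, C = 9, D = 6)                    # replications
s2 <- 62.12 / 26                                       # residual mean square

# all pairwise test statistics (same order as the table above)
pr <- combn(4, 2)
tstat <- apply(pr, 2, function(p)
  abs(ybar[p[1]] - ybar[p[2]]) / sqrt(s2 * (1 / ni[p[1]] + 1 / ni[p[2]])))

# brand-name vs generic contrast (A + B)/2 - (C + D)/2
est <- sum(c(0.5, 0.5, -0.5, -0.5) * ybar)   # 2.685
tval <- est / sqrt(s2 / 4 * sum(1 / ni))     # approx 4.70
```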
3. The table below gives data from a completely randomised design to compare six different batches of hydrochloric acid on the yield of a dye (naphthalene black 12B).
napblack <- data.frame(batch = rep(factor(1:6), rep(5, 6)),
repetition = rep(1:5, 6),
yield = c(145, 40, 40, 120, 180, 140, 155, 90, 160, 95,
195, 150, 205, 110, 160, 45, 40, 195, 65, 145,
195, 230, 115, 235, 225, 120, 55, 50, 80, 45)
)
knitr::kable(
tidyr::pivot_wider(napblack, names_from = batch, values_from = yield)[, -1],
col.names = paste("Batch", 1:6),
caption = "Naphthalene black experiment: yields (grams of standard colour) from six different batches of hydrochloric acid."
)
Conduct a full analysis of this experiment, including
a. exploratory data analysis;
Table 2.5: Naphthalene black experiment: yields (grams of standard colour)
from six different batches of hydrochloric acid.
Batch 1 Batch 2 Batch 3 Batch 4 Batch 5 Batch 6
145 140 195 45 195 120
40 155 150 40 230 55
40 90 205 195 115 50
120 160 110 65 235 80
180 95 160 145 225 45
b. fitting a linear model, and conducting an F-test to compare to a
model that explains variation using the six batches to the null model;
c. usual linear model diagnostics;
d. multiple comparisons of all pairwise differences between treatments.
Solution
a. Two of the simplest ways of examining the data are to calculate basic
descriptive statistics, e.g. the mean and standard deviation of the yield in
each batch, and to plot the data in the different batches using a simple
graphical display, e.g. box plots of the yields in each batch. Notice that
in both aggregate and boxplot we use the formula yield ~ batch.
This formula splits the data into groups defined by batch.
aggregate(yield ~ batch, data = napblack, FUN = function(x) c(mean = mean(x),
st.dev = sd(x)))
## batch yield.mean yield.st.dev
## 1 1 105.0 63.0
## 2 2 128.0 33.3
## 3 3 164.0 38.0
## 4 4 98.0 68.7
## 5 5 200.0 50.0
## 6 6 70.0 31.0
boxplot(yield ~ batch, data = napblack)
Notice that even within any particular batch, the number of grams of standard dyestuff colour determined by the dye trial varies from observation
to observation. This within-group variation is considered to be random or
residual variation. This cannot be explained by any differences between
batches. However, a second source of variation in the overall data set can
be explained by variation between the batches, i.e. between the different
batch means themselves. We can see from the box plots (Figure 2.2) and
the mean yields in each batch that observations from batch number five
appear to have given higher yields (in grams of standard colour) than
those from the other batches.
Figure 2.2: Naphthalene black experiment: distributions of dye yields from the
six batches.
b. When we fit linear models and compare them using analysis of variance
(ANOVA), it enables us to decide whether the differences that seem to
be evident in these simple plots and descriptive statistics are statistically
significant or whether this kind of variation could have arisen by chance,
even though there are no real differences between the batches.
An ANOVA table may be used to compare a linear model including differences between the batches to the null model. The linear model we will
fit is a simple unit-treatment model:

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}\,,\quad i = 1,\ldots,6;\ j = 1,\ldots,5\,, \qquad (2.14)$$

where $y_{ij}$ is the response obtained from the $j$th repetition of the $i$th batch,
$\mu$ is a constant term, $\tau_i$ is the expected effect due to the observation being
in the $i$th batch ($i = 1,\ldots,6$) and $\varepsilon_{ij}$ are the random errors.
A test of the hypothesis that the group means are all equal is equivalent
to a test that the $\tau_i$ are all equal to 0 ($H_0: \tau_1 = \tau_2 = \cdots = \tau_6 = 0$). We
can use lm to fit model (2.14), and anova to test the hypothesis. Before
we fit the linear model, we need to make sure batch has type factor9.
9Factors are variables in R which take on a limited number of different values (e.g. categorical
variables). We need to define a categorical variable, such as batch, as a factor to ensure it is
treated correctly by functions such as lm.
napblack$batch <- as.factor(napblack$batch)
napblack.lm <- lm(yield ~ batch, data = napblack)
anova(napblack.lm)
## Analysis of Variance Table
##
## Response: yield
## Df Sum Sq Mean Sq F value Pr(>F)
## batch 5 56358 11272 4.6 0.0044 **
## Residuals 24 58830 2451
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value of 0.004 indicates significant differences between at least two
of the batch means. Therefore 0 is rejected and a suitable multiple
comparison test should be carried out.
c. To perform our analysis, we have fitted a linear model. Therefore, we
should use some plots of the residuals $y_{ij} - \hat{y}_{ij}$ to check the model assumptions, particularly that the errors are independently and identically
normally distributed. The function rstandard produces residuals
that have been standardised to have variance equal to 1.
standres <- rstandard(napblack.lm)
fitted <- fitted(napblack.lm)
par(mfrow = c(1, 2), pty = "s")
with(napblack, {
plot(batch, standres, xlab = "Batch", ylab = "Standarised residuals")
plot(fitted, standres, xlab = "Fitted value", ylab = "Standarised residuals")
})
The plots (Figure 2.3) show no large standardised residuals (> 2 in absolute value10). While there is some evidence of unequal variation across
batches, there is no obvious pattern with respect to fitted values (e.g. no
“funnelling”).
We can also plot the standardised residuals against the quantiles of a
standard normal distribution to assess the assumption of normality.
par(pty = "s")
qqnorm(standres, main = "")
The points lie quite well on a straight line (see Figure 2.4), suggesting
the assumption of normality is valid. Overall, the residual plots look rea-
sonable; some investigation of transformations to correct for non-constant
variance could be investigated (see MATH2010/STAT6123).
10We would anticipate 95% of the standardised residuals to lie in [-1.96, 1.96], as they will
follow a standard normal distribution if the model assumptions are correct.
Figure 2.3: Residuals against batch (left) and fitted values (right) for the linear
model fit to the naphthalene black data.
Figure 2.4: Normal probability plot for the standardised residuals for the linear
model fit to the naphthalene black data.
d. When a significant difference between the treatments has been indicated,
the next stage is to try to determine which treatments differ. In some
cases a specific difference is of interest, a control versus a new treatment
for instance, in which case that difference could now be inspected. How-
ever, usually no specific differences are to be considered a priori, and any
difference is of practical importance. A multiple comparison procedure
is required to investigate all possible differences, which takes account of
the number of possible differences available amongst the treatments (15
differences between the six batches here).
We will use Tukey’s method for controlling the experiment-wise type I
error rate, fixed here at 5%, as implemented by emmeans.
napblack.emm <- emmeans::emmeans(napblack.lm, 'batch')
pairs(napblack.emm)
## contrast estimate SE df t.ratio p.value
## 1 - 2 -23 31.3 24 -0.730 0.9760
## 1 - 3 -59 31.3 24 -1.880 0.4350
## 1 - 4 7 31.3 24 0.220 1.0000
## 1 - 5 -95 31.3 24 -3.030 0.0570
## 1 - 6 35 31.3 24 1.120 0.8690
## 2 - 3 -36 31.3 24 -1.150 0.8560
## 2 - 4 30 31.3 24 0.960 0.9270
## 2 - 5 -72 31.3 24 -2.300 0.2330
## 2 - 6 58 31.3 24 1.850 0.4540
## 3 - 4 66 31.3 24 2.110 0.3170
## 3 - 5 -36 31.3 24 -1.150 0.8560
## 3 - 6 94 31.3 24 3.000 0.0610
## 4 - 5 -102 31.3 24 -3.260 0.0350
## 4 - 6 28 31.3 24 0.890 0.9440
## 5 - 6 130 31.3 24 4.150 0.0040
##
## P value adjustment: tukey method for comparing a family of 6 estimates
We have two significant differences, between batches 4-5 and 5-6.
subset(transform(pairs(napblack.emm)), p.value < 0.05)
## contrast estimate SE df t.ratio p.value
## 13 4 - 5 -102 31.3 24 -3.26 0.03482
## 15 5 - 6 130 31.3 24 4.15 0.00429
4. (Adapted from Morris, 2011) Consider a completely randomised design
with $t = 5$ treatments and $n = 50$ units. The contrasts

$$\tau_2 - \tau_1\,,\quad \tau_3 - \tau_2\,,\quad \tau_4 - \tau_3\,,\quad \tau_5 - \tau_4$$
are of primary interest to the experimenter.
a. Find an allocation of the 50 units to the 5 treatments, i.e. find $n_1, \ldots, n_5$,
that minimises the average variance of the corresponding contrast estimators.

b. Fixing the proportions of experimental effort applied to each treatment to
those found in part (a), i.e. to $w_i = n_i/50$, find the value of $n$ required
to make the ratio $T = |c^{\mathrm{T}}\tau|/\sqrt{\mathrm{var}\left(\widehat{c^{\mathrm{T}}\tau}\right)} = 2$, assuming a signal-to-noise
ratio of 1.
Solution
a. We can use the function opt_ni given in Section 2.8.1:
n <- 50
C <- matrix(
c(
-1, 1, 0, 0, 0,
0, -1, 1, 0, 0,
0, 0, -1, 1, 0,
0, 0, 0, -1, 1
), nrow = 4, byrow = T
)
opt_ni(C, n)
## [1] 8.01 11.33 11.33 11.33 8.01
Rounding, we obtain a solution of the form $n_1 = n_5 = 8$, $n_2 = n_4 = 11$ and $n_3 = 12$. Any of $n_2$, $n_3$, $n_4$ may be rounded up to 12 to form a design with the same variance.
nv <- c(8, 11, 11, 11, 8)
crd_var(C, nv + c(0, 1, 0, 0, 0))
crd_var(C, nv + c(0, 0, 1, 0, 0))
crd_var(C, nv + c(0, 0, 0, 1, 0))
## [1] 0.78
## [1] 0.78
## [1] 0.78
b. The optimal proportions for each treatment from part (a) are $w_1 = w_5 = 0.16$ and $w_2 = w_3 = w_4 = 0.227$. Fixing these, we can use code from Section 2.8.2 to find the required value of $n$ for each contrast.
nv <- NULL
for(i in 1:4) nv[i] <- opt_n(C[i, ], opt_ni(C, n) / n, 1, 2) # snr = 1, target = 2
nv
## [1] 42.6 35.3 35.3 42.6
Hence, we need $n = 43$ to achieve a ratio of 2 for the first and last contrasts, and $n = 36$ for the second and third. The differences are due to the different proportions assumed for each treatment. To achieve a ratio of at least 2 for all contrasts, we pick the larger number, $n = 43$.
Chapter 3
Blocking
The completely randomised design (CRD) works well when there are sufficient homogeneous experimental units to perform the whole experiment under the same, or very similar, conditions and there are no restrictions on the randomisation of treatments to units. The only systematic (non-random) differences in the observed responses then result from differences between the treatments. While such designs are commonly and successfully used, especially in smaller experiments, they can be unrealistic or impractical in many settings.
A common way in which the CRD fails is a lack of sufficiently similar experimental units. If there are systematic differences between different batches, or blocks, of units, these differences should be taken into account in both the allocation of treatments to units and the modelling of the resultant data. Otherwise, block-to-block differences may bias treatment comparisons and/or inflate our estimate of the background variability, and hence reduce our ability to detect important treatment effects.
Example 3.1. Steel bar experiment (Morris, 2011, ch. 4)

Kocaoz et al. (2005) described an experiment to assess the strength of steel reinforcement bars with $t = 4$ coatings1 (treatments). In total $n = 32$ different bars (units) were available, but the testing process meant sets of four bars were tested together. To account for potential test-specific features (e.g. environmental or operational), these different test sets were assumed to form $b = 8$ blocks of size $k = 4$. The data are shown in Table 3.1 below.

bar <- data.frame(coating = rep(factor(1:4), 8),
                  block = rep(factor(1:8), rep(4, 8)),
                  strength = c(136, 147, 138, 149, 136, 143, 122, 153, 150, 142, 131, 136,
                               155, 148, 130, 129, 145, 149, 136, 139, 150, 149, 147, 144,
                               147, 150, 125, 140, 148, 149, 118, 145)
)
knitr::kable(
  tidyr::pivot_wider(bar, names_from = coating, values_from = strength),
  col.names = c("Block", paste("Coating", 1:4)),
  caption = "Steel bar experiment: tensile strength values (kilopounds per square inch, ksi) from steel bars with four different coatings."
)

Table 3.1: Steel bar experiment: tensile strength values (kilopounds per square inch, ksi) from steel bars with four different coatings.

Block  Coating 1  Coating 2  Coating 3  Coating 4
1      136        147        138        149
2      136        143        122        153
3      150        142        131        136
4      155        148        130        129
5      145        149        136        139
6      150        149        147        144
7      147        150        125        140
8      148        149        118        145

1The four coatings were all made from Engineering Thermoplastic Polyurethane (ETPU); coating 1 was made solely from ETPU, while coatings 2-4 had additional glass fibre, carbon fibre or aramid fibre added, respectively.
Here, each block has size 4, which is equal to the number of treatments in the
experiment, and each treatment is applied in each block. This is an example of
a randomised complete block design.
We can study the data graphically, plotting by treatment and by block.
boxplot(strength ~ block, data = bar)
boxplot(strength ~ coating, data = bar)
The box plots within each plot in Figure 3.1 are comparable, as every treatment has occurred in every block the same number of times (once). For example, when we compare the box plots for treatments 1 and 3, we know each of them displays one observation from each block. Therefore, differences between treatments will not be influenced by large differences between blocks. This balance makes our analysis more straightforward. By eye, it appears there may be differences between coating 3 and the other three coatings.
Example 3.2. Tyre experiment (Wu and Hamada, 2009, ch. 3)

Davies (1954), p. 200, examined the effect of $t = 4$ different rubber compounds (treatments) on the lifetime of a tyre. Each tyre is only large enough to split into $k = 3$ segments whilst still containing a representative amount of each compound. When tested, each tyre is subjected to the same road conditions, and hence is treated as a block. A design with $b = 4$ blocks was used, as displayed in Table 3.2.
Figure 3.1: Steel bar experiment: distributions of tensile strength (ksi) from the
eight blocks (top) and the four coatings (bottom).
Table 3.2: Tyre experiment: relative wear measurements (unitless) from tyres made with four different rubber compounds.

Block  Compound 1  Compound 2  Compound 3  Compound 4
1      238         238         279
2      196         213                     308
3      254                     334         367
4                  312         421         412
tyre <- data.frame(compound = as.factor(c(1, 2, 3, 1, 2, 4, 1, 3, 4, 2, 3, 4)),
block = rep(factor(1:4), rep(3, 4)),
wear = c(238, 238, 279, 196, 213, 308, 254, 334, 367, 312, 421, 412)
)
options(knitr.kable.NA = '')
knitr::kable(
tidyr::pivot_wider(tyre, names_from = compound, values_from = wear),
col.names = c("Block", paste("Compound", 1:4)),
caption = "Tyre experiment: relative wear measurements (unitless) from tyres made with four different rubber compounds."
)
Here, each block has size $k = 3$, which is smaller than the number of treatments ($t = 4$). Hence, each block cannot contain an application of every treatment. This is an example of an incomplete block design.
Graphical exploration of the data is a little more problematic in this example.
As each treatment does not occur in each block, box plots such as Figure 3.2
are not as informative. Do compounds three and four have higher average wear
because they were the only compounds to both occur in blocks 3 and 4? Or
do blocks 3 and 4 have a higher mean because they contain both compounds
3 and 4? The design cannot help us entirely disentangle the impact of blocks
and treatments2. In our modelling, we will assume variation should first be
described by blocks (which are generally fixed aspects of the experiment) and
then treatments (which are more directly under the experimenter’s control).
boxplot(wear ~ block, data = tyre)
boxplot(wear ~ compound, data = tyre)
3.1 Unit-block-treatment model
If $n_{ij}$ is the number of times treatment $i$ occurs in block $j$, a common statistical model to describe data from a blocked experiment has the form

2This is our first example of (partial) confounding, which we will see again in Chapters 5 and 6.
Figure 3.2: Tyre experiment: distributions of wear from the four blocks (top)
and the four compounds (bottom).
$$y_{ijl} = \mu + \beta_j + \tau_i + \varepsilon_{ijl}, \quad i = 1, \ldots, t;\; j = 1, \ldots, b;\; l = 1, \ldots, n_{ij}, \quad (3.1)$$

where $y_{ijl}$ is the response from the $l$th application of the $i$th treatment in the $j$th block, $\mu$ is a constant parameter, $\beta_j$ is the effect of the $j$th block, $\tau_i$ is the effect of treatment $i$, and $\varepsilon_{ijl} \sim N(0, \sigma^2)$ are once again random individual effects from each experimental unit, assumed independent. The total number of runs in the experiment is given by $n = \sum_{i=1}^{t}\sum_{j=1}^{b} n_{ij}$.
For Example 3.1, there are $t = 4$ treatments and $b = 8$ blocks, and each treatment occurs once in each block, so $n_{ij} = 1$ for all $i, j$. In Example 3.2, there are again $t = 4$ treatments but now only $b = 4$ blocks, and not every treatment occurs in every block. In fact, we have $n_{11} = n_{12} = n_{13} = 1$, $n_{14} = 0$; $n_{21} = n_{22} = n_{24} = 1$, $n_{23} = 0$; $n_{31} = n_{33} = n_{34} = 1$, $n_{32} = 0$; and $n_{41} = 0$, $n_{42} = n_{43} = n_{44} = 1$.
Writing model (3.1) in matrix form as a partitioned linear model, we obtain

$$y = \mu 1_n + X_1\beta + X_2\tau + \varepsilon, \quad (3.2)$$

with $y$ the $n$-vector of responses, $X_1$ and $X_2$ the $n \times b$ and $n \times t$ model matrices for blocks and treatments, respectively, $\beta = (\beta_1, \ldots, \beta_b)^{\mathrm{T}}$, $\tau = (\tau_1, \ldots, \tau_t)^{\mathrm{T}}$ and $\varepsilon$ the $n$-vector of errors.
In equation (3.2), assuming without loss of generality that runs of the experiment are ordered by block, the matrix $X_1$ has the block-diagonal form

$$X_1 = \bigoplus_{j=1}^{b} 1_{k_j} = \begin{bmatrix} 1_{k_1} & 0_{k_1} & \cdots & 0_{k_1} \\ 0_{k_2} & 1_{k_2} & \cdots & 0_{k_2} \\ \vdots & & \ddots & \vdots \\ 0_{k_b} & 0_{k_b} & \cdots & 1_{k_b} \end{bmatrix},$$

where $k_j = \sum_{i=1}^{t} n_{ij}$ is the number of units in the $j$th block, and $1_{k_j}$ and $0_{k_j}$ denote $k_j$-vectors with every entry equal to one and zero, respectively. The structure of matrix $X_2$ is harder to describe so distinctly, but each row includes a single non-zero entry, equal to one, indicating which treatment was applied in that run of the experiment. The first $k_1$ rows correspond to block 1, the next $k_2$ to block 2, and so on. We will see special cases later.
3.2 Normal equations
Writing as a partitioned model $y = W\theta + \varepsilon$, with $W = [1_n \,|\, X_1 \,|\, X_2]$ and $\theta^{\mathrm{T}} = [\mu \,|\, \beta^{\mathrm{T}} \,|\, \tau^{\mathrm{T}}]$, the least squares normal equations

$$W^{\mathrm{T}}W\hat{\theta} = W^{\mathrm{T}}y \quad (3.3)$$

can be written as a set of three matrix equations:

$$n\hat{\mu} + 1_n^{\mathrm{T}}X_1\hat{\beta} + 1_n^{\mathrm{T}}X_2\hat{\tau} = 1_n^{\mathrm{T}}y, \quad (3.4)$$
$$X_1^{\mathrm{T}}1_n\hat{\mu} + X_1^{\mathrm{T}}X_1\hat{\beta} + X_1^{\mathrm{T}}X_2\hat{\tau} = X_1^{\mathrm{T}}y, \quad (3.5)$$
$$X_2^{\mathrm{T}}1_n\hat{\mu} + X_2^{\mathrm{T}}X_1\hat{\beta} + X_2^{\mathrm{T}}X_2\hat{\tau} = X_2^{\mathrm{T}}y. \quad (3.6)$$

Above, the matrices $X_1^{\mathrm{T}}X_1 = \mathrm{diag}(k_1, \ldots, k_b)$ and $X_2^{\mathrm{T}}X_2 = \mathrm{diag}(n_1, \ldots, n_t)$, with $n_i = \sum_{j=1}^{b} n_{ij}$, have simple forms as diagonal matrices with entries equal to the size of each block and the number of replications of each treatment, respectively.
The $t \times b$ matrix $N = X_2^{\mathrm{T}}X_1$ is particularly important in block designs, and is called the incidence matrix. The $i$th row of $N$ indicates in which blocks the $i$th treatment occurs.
We can eliminate the explicit dependence on $\mu$ and $\beta$ to find reduced normal equations for $\hat{\tau}$ by multiplying the middle equation (3.5) by $X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}$:

$$X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}1_n\hat{\mu} + X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}X_1\hat{\beta} + X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}X_2\hat{\tau}$$
$$= X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}1_n\hat{\mu} + X_2^{\mathrm{T}}X_1\hat{\beta} + X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}X_2\hat{\tau}$$
$$= X_2^{\mathrm{T}}X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}y, \quad (3.8)$$

and subtracting from the final equation (3.6):

$$X_2^{\mathrm{T}}\left(1_n - X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}1_n\right)\hat{\mu} + \left(X_2^{\mathrm{T}}X_1 - X_2^{\mathrm{T}}X_1\right)\hat{\beta} + X_2^{\mathrm{T}}\left(I_n - X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}\right)X_2\hat{\tau}$$
$$= X_2^{\mathrm{T}}\left(I_n - X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}\right)y. \quad (3.9)$$
Clearly, a zero matrix is multiplying the block effects $\hat{\beta}$. Also,

$$X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}1_n = 1_n,$$

as

$$X_1(X_1^{\mathrm{T}}X_1)^{-1} = \bigoplus_{j=1}^{b}\frac{1}{k_j}1_{k_j} = \begin{bmatrix} \frac{1}{k_1}1_{k_1} & 0_{k_1} & \cdots & 0_{k_1} \\ 0_{k_2} & \frac{1}{k_2}1_{k_2} & \cdots & 0_{k_2} \\ \vdots & & \ddots & \vdots \\ 0_{k_b} & 0_{k_b} & \cdots & \frac{1}{k_b}1_{k_b} \end{bmatrix},$$

and hence

$$X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}} = \bigoplus_{j=1}^{b}\frac{1}{k_j}J_{k_j} = \begin{bmatrix} \frac{1}{k_1}J_{k_1} & 0_{k_1\times k_2} & \cdots & 0_{k_1\times k_b} \\ 0_{k_2\times k_1} & \frac{1}{k_2}J_{k_2} & \cdots & 0_{k_2\times k_b} \\ \vdots & & \ddots & \vdots \\ 0_{k_b\times k_1} & 0_{k_b\times k_2} & \cdots & \frac{1}{k_b}J_{k_b} \end{bmatrix},$$

where $J_{k_j} = 1_{k_j}1_{k_j}^{\mathrm{T}}$ is the $k_j \times k_j$ matrix of ones. Writing $H_1 = X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}$, we then get the reduced normal equations for $\hat{\tau}$:

$$X_2^{\mathrm{T}}(I_n - H_1)X_2\hat{\tau} = X_2^{\mathrm{T}}(I_n - H_1)y. \quad (3.10)$$
We can demonstrate the form of these matrices through our two examples.
For Example 3.1:
one <- rep(1, 4)
X1 <- kronecker(diag(1, nrow = 8), one)
X2 <- diag(1, nrow = 4)
X2 <- do.call("rbind", replicate(8, X2, simplify = FALSE))
#incidence matrix
N <- t(X2) %*% X1
X1tX1 <- t(X1) %*% X1 # diagonal
X2tX2 <- t(X2) %*% X2 # diagonal
H1 <- X1 %*% solve(t(X1) %*% X1) %*% t(X1)
ones <- H1 %*% rep(1, 32) # H1 times vector of 1s is also a vector of 1s
A <- t(X2) %*% X2 - t(X2) %*% H1 %*% X2 # X2t(I - H1)X2
qr(A)$rank # rank 3
X2tH1 <- t(X2) %*% H1 # adjustment to y
W <- cbind(ones, X1, X2) # overall model matrix
qr(W)$rank # rank 11 (t+b - 1)
For Example 3.2:
one <- rep(1, 3)
X1 <- kronecker(diag(1, nrow = 4), one)
X2 <- matrix(
c(1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, 0,
1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 0, 1,
1, 0, 0, 0,
0, 0, 1, 0,
0, 0, 0, 1,
0, 1, 0, 0,
0, 0, 1, 0,
0, 0, 0, 1), nrow = 12, byrow = T
)
#incidence matrix
N <- t(X2) %*% X1
X1tX1 <- t(X1) %*% X1 # diagonal
X2tX2 <- t(X2) %*% X2 # diagonal
H1 <- X1 %*% solve(t(X1) %*% X1) %*% t(X1)
ones <- H1 %*% rep(1, 12) # H1 times vector of 1s is also a vector of 1s
A <- t(X2) %*% X2 - t(X2) %*% H1 %*% X2 # X2t(I - H1)X2
qr(A)$rank # rank 3
X2tH1 <- t(X2) %*% H1 # adjustment to y
W <- cbind(ones, X1, X2) # overall model matrix
qr(W)$rank # rank 7 (t+b - 1)
Notice that if we write $X_{2|1} = (I_n - H_1)X_2$, then the reduced normal equations become

$$X_{2|1}^{\mathrm{T}}X_{2|1}\hat{\tau} = X_{2|1}^{\mathrm{T}}y,$$

which have the same form as for the CRD in Chapter 2, albeit with a different $X_{2|1}$ matrix, as we are adjusting for more complex nuisance parameters.
In general, the solution of these equations will depend on the exact form of the design. For the randomised complete block design, the solution turns out to be straightforward (see Section 3.4 below). By default, to fit model (3.2), the lm function in R applies the constraint $\hat{\beta}_1 = \hat{\tau}_1 = 0$, and removes the corresponding columns from $X_1$ and $X_2$, to leave a matrix with full column rank. Clearly, this solution is not unique but, as with CRDs, we will identify uniquely estimable combinations of the model parameters (and use emmeans to extract these estimates from an lm object).
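As a small illustration of this constraint, a sketch (with hypothetical factor labels matching the sizes of Example 3.1, rather than the bar data itself) showing that R's default treatment coding leaves a full-rank model matrix with $1 + (b - 1) + (t - 1)$ columns:

```r
# Hypothetical factors with b = 8 block levels and t = 4 treatment levels,
# one run per block-treatment combination, ordered by block
blocks <- rep(factor(1:8), each = 4)
treatments <- rep(factor(1:4), times = 8)

# The overparameterised matrix [1_n | X1 | X2] would have 1 + 8 + 4 = 13
# columns but only rank b + t - 1 = 11. R's default contr.treatment coding
# drops the first level of each factor (the beta_1 = tau_1 = 0 constraint),
# leaving 1 + 7 + 3 = 11 columns with full column rank.
W <- model.matrix(~ blocks + treatments)
ncol(W)       # 11
qr(W)$rank    # 11, so W has full column rank
```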
3.3 Analysis of variance
As was the case with the CRD, it can be shown that any solution to the normal equations (3.3) produces unique fitted values $W\hat{\theta}$, and hence a unique analysis of variance decomposition can be obtained.
For a block experiment, the ANOVA table compares the full model (3.2), the model containing only the block effects,

$$y = \mu 1_n + X_1\beta + \varepsilon, \quad (3.11)$$

and the null model

$$y = \mu 1_n + \varepsilon, \quad (3.12)$$

and has the form:
Source      Degrees of freedom   Sums of squares           Mean square
Blocks      b - 1                RSS (3.12) - RSS (3.11)
Treatments  t - 1                RSS (3.11) - RSS (3.2)    [RSS (3.11) - RSS (3.2)] / (t - 1)
Residual    n - b - t + 1        RSS (3.2)                 RSS (3.2) / (n - b - t + 1)
Total       n - 1                RSS (3.12)
We test the hypothesis $H_0: \tau_1 = \cdots = \tau_t = 0$ at the $100\alpha\%$ significance level by comparing the ratio of the treatment and residual mean squares to the $1 - \alpha$ quantile of an $F$ distribution with $t - 1$ and $n - b - t + 1$ degrees of freedom.
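The entries of this table can be reproduced directly from the residual sums of squares of the three nested model fits. A sketch using the steel bar data of Example 3.1 (the data frame is rebuilt here so the snippet stands alone):

```r
bar <- data.frame(
  coating = rep(factor(1:4), 8),
  block = rep(factor(1:8), rep(4, 8)),
  strength = c(136, 147, 138, 149, 136, 143, 122, 153, 150, 142, 131, 136,
               155, 148, 130, 129, 145, 149, 136, 139, 150, 149, 147, 144,
               147, 150, 125, 140, 148, 149, 118, 145)
)

rss <- function(fit) sum(residuals(fit)^2)
rss_null  <- rss(lm(strength ~ 1, data = bar))               # RSS (3.12)
rss_block <- rss(lm(strength ~ block, data = bar))           # RSS (3.11)
rss_full  <- rss(lm(strength ~ block + coating, data = bar)) # RSS (3.2)

ss_treat <- rss_block - rss_full             # treatment sum of squares
f_treat <- (ss_treat / 3) / (rss_full / 21)  # t - 1 = 3, n - b - t + 1 = 21 df

# agrees with the F value reported by anova()
tab <- anova(lm(strength ~ block + coating, data = bar))
all.equal(f_treat, tab["coating", "F value"]) # TRUE
```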
For Example 3.1, we obtain the following ANOVA.
bar.lm <- lm(strength ~ block + coating, data = bar)
anova(bar.lm)
## Analysis of Variance Table
##
## Response: strength
## Df Sum Sq Mean Sq F value Pr(>F)
## block 7 215 31 0.55 0.7903
## coating 3 1310 437 7.75 0.0011 **
## Residuals 21 1184 56
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Clearly, the null hypothesis of no treatment effect is rejected. The anova function also compares the block mean square to the residual mean square to perform a test of the hypothesis $H_0: \beta_1 = \cdots = \beta_b = 0$. This is not a hypothesis that should usually be tested: the blocks are a nuisance factor and are generally a feature of the experimental process that has not been subject to randomisation; we are not interested in testing for block-to-block differences.3
For Example 3.2, we get the ANOVA table:
tyre.lm <- lm(wear ~ block + compound, data = tyre)
anova(tyre.lm)
3R and anova don’t, of course, know that this is a block design or that a blocking factor is
being tested.
## Analysis of Variance Table
##
## Response: wear
## Df Sum Sq Mean Sq F value Pr(>F)
## block 3 39123 13041 37.2 0.00076 ***
## compound 3 20729 6910 19.7 0.00335 **
## Residuals 5 1751 350
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Again, the null hypothesis is rejected, and hence we should investigate which
tyre compounds differ in their mean response.
The residual mean square for model (3.2) also provides an unbiased estimate, $s^2$, of $\sigma^2$, the variability of the $\varepsilon_{ijl}$, assuming the unit-block-treatment model is correct. For Example 3.1, $s^2 = 56.387$ and for Example 3.2, $s^2 = 350.183$.
3.4 Randomised complete block designs
A randomised complete block design (RCBD) has each treatment replicated exactly once in each block, that is, $n_{ij} = 1$ for $i = 1, \ldots, t$; $j = 1, \ldots, b$. Therefore each block has common size $k_1 = \cdots = k_b = t$. The treatments are randomised to the units in each block. We can drop the index $l$ from our unit-block-treatment model, as every treatment is replicated just once in each block:

$$y_{ij} = \mu + \beta_j + \tau_i + \varepsilon_{ij}, \quad i = 1, \ldots, t;\; j = 1, \ldots, b.$$

For an RCBD, the matrix $X_{2|1}$ has the form

$$X_{2|1} = (I_n - H_1)X_2 = X_2 - H_1X_2 = X_2 - \frac{1}{t}1_{n \times t}, \quad (3.13)$$

following from the fact that
$$H_1X_2 = X_1(X_1^{\mathrm{T}}X_1)^{-1}X_1^{\mathrm{T}}X_2 \quad (3.14)$$
$$= \frac{1}{t}X_1X_1^{\mathrm{T}}X_2 \quad (3.15)$$
$$= \frac{1}{t}X_1N^{\mathrm{T}} \quad (3.16)$$
$$= \frac{1}{t}X_11_{b \times t} \quad (3.17)$$
$$= \frac{1}{t}1_{n \times t}, \quad (3.18)$$

as for an RCBD $X_1^{\mathrm{T}}X_1 = \mathrm{diag}(k_1, \ldots, k_b) = tI_b$ and $X_2^{\mathrm{T}}X_1 = N = 1_{t \times b}$.
Comparing (3.13) to the form of $X_{2|1}$ for a CRD, equation (2.6), we see that for the RCBD, $X_{2|1}$ has the same form as for a CRD with $b$ replicates of each treatment (that is, $n_i = b$ for $i = 1, \ldots, t$). This is a powerful result, as it tells us:

• The reduced normal equations for the RCBD take the same form as for the CRD,

$$\hat{\tau}_i - \bar{\hat{\tau}} = \bar{y}_{i.} - \bar{y}_{..},$$

with $\bar{\hat{\tau}} = \frac{1}{t}\sum_{i=1}^{t}\hat{\tau}_i$, $\bar{y}_{i.} = \frac{1}{b}\sum_{j=1}^{b}y_{ij}$ and $\bar{y}_{..} = \frac{1}{n}\sum_{i=1}^{t}\sum_{j=1}^{b}y_{ij}$. Hence, as with a CRD, we can estimate any contrast $c^{\mathrm{T}}\tau$, having $\sum_{i=1}^{t}c_i = 0$, with estimator

$$c^{\mathrm{T}}\hat{\tau} = \sum_{i=1}^{t}c_i\bar{y}_{i.}.$$

Hence, the point estimate for a contrast $c^{\mathrm{T}}\tau$ is exactly the same as would be obtained by ignoring blocks and treating the experiment as a CRD with $n = bt$ and $n_i = b$, for $i = 1, \ldots, t$.
• Inference for a contrast takes exactly the same form as for a CRD (Section 2.5), with in particular:

$$\operatorname{var}(c^{\mathrm{T}}\hat{\tau}) = \frac{\sigma^2}{b}\sum_{i=1}^{t}c_i^2,$$

and

$$c^{\mathrm{T}}\hat{\tau} \sim N\left(c^{\mathrm{T}}\tau,\ \frac{\sigma^2}{b}\sum_{i=1}^{t}c_i^2\right).$$
Although these equations have the same form as for a CRD, note that $\sigma^2$ represents different quantities in each case:

• In a CRD, $\sigma^2$ is the uncontrolled variation in the response among all experimental units.
• In an RCBD, $\sigma^2$ is the uncontrolled variation in the response among units within a common block.

Block-to-block differences are modelled via inclusion of the block effects $\beta_j$ in the model, and hence, if blocking is effective, we would expect $\sigma^2$ from an RCBD to be substantially smaller than from a corresponding CRD with $n_i = b$.
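A deliberately noise-free sketch, with hypothetical block and treatment effects, of how unmodelled block-to-block differences inflate the residual variance estimate:

```r
# Hypothetical RCBD: t = 3 treatments in each of b = 4 blocks; the response
# is built exactly as mean + block effect + treatment effect, with no noise
dat <- expand.grid(treatment = factor(1:3), block = factor(1:4))
dat$y <- 20 + c(0, 5, 10, 15)[dat$block] + c(0, 2, 4)[dat$treatment]

rcbd.lm <- lm(y ~ block + treatment, data = dat) # blocks modelled
crd.lm  <- lm(y ~ treatment, data = dat)         # blocks ignored

sigma(rcbd.lm) # essentially zero: block effects absorb all nuisance variation
sigma(crd.lm)  # inflated by the block differences left in the residuals
```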
Example 3.1 is an RCBD. We can estimate the contrasts

$$\tau_1 - \tau_2, \quad \tau_1 - \tau_3, \quad \tau_1 - \tau_4$$

between coatings4 using emmeans.
bar.emm <- emmeans::emmeans(bar.lm, ~ coating)
contrastv1.emmc <- function(levs)
data.frame('t1 v t2' = c(1, -1, 0, 0), 't1 v t3' = c(1, 0, -1, 0),
't1 v t4' = c(1, 0, 0, -1))
emmeans::contrast(bar.emm, 'contrastv1')
## contrast estimate SE df t.ratio p.value
## t1.v.t2 -1.25 3.75 21 -0.330 0.7420
## t1.v.t3 15.00 3.75 21 4.000 0.0010
## t1.v.t4 4.00 3.75 21 1.070 0.2990
##
## Results are averaged over the levels of: block
It is important to once again adjust for multiple comparisons. Here we can use a Bonferroni adjustment, and multiply each p-value by the number of tests (3). We obtain p-values of 1 (coating 1 versus 2), 0.002 (1 versus 3) and 0.896 (1 versus 4). Hence, there is a significant difference between coatings 1 and 3, with $H_0: \tau_1 = \tau_3$ rejected at the 1% significance level.
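The adjustment can be reproduced with base R's p.adjust, applied here to the rounded unadjusted p-values displayed in the emmeans output (giving 0.897 rather than the 0.896 obtained from the unrounded values):

```r
# Unadjusted p-values from the emmeans output: t1 v t2, t1 v t3, t1 v t4
p_unadj <- c(0.742, 0.001, 0.299)
p.adjust(p_unadj, method = "bonferroni") # multiplies by 3 and caps at 1
# 1.000 0.003 0.897
```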
We can demonstrate the equivalence of the contrast point estimates between a
RCBD and a CRD by fitting a unit-treatment model that ignores blocks:
bar_crd.lm <- lm(strength ~ coating, data = bar)
bar_crd.emm <- emmeans::emmeans(bar_crd.lm, ~ coating)
emmeans::contrast(bar_crd.emm, 'contrastv1')
## contrast estimate SE df t.ratio p.value
## t1.v.t2 -1.25 3.53 28 -0.350 0.7260
4These contrasts measure the difference between the coating only made from ETPU and
the three coatings with added fibres.
## t1.v.t3 15.00 3.53 28 4.240 <.0001
## t1.v.t4 4.00 3.53 28 1.130 0.2670
As expected, the point estimates of the three contrasts are identical. In this case, the standard error of each contrast is actually smaller assuming a CRD without blocks, suggesting block-to-block differences were small here (further evidence is provided by the small block sums of squares in the ANOVA table). Here the estimate of $\sigma$ from the RCBD is $s = 7.509$ and from the CRD is $s = 7.07$, so for this example the unit-to-unit variation within and between blocks is not so different, and is actually estimated to be slightly smaller in the CRD5.
3.5 Orthogonal blocking
The equality of the point estimates from the RCBD and the CRD is a consequence of the block and treatment parameters in model (3.1) being orthogonal. That is, the least squares estimators for $\beta$ and $\tau$ are independent, in the sense that the estimators obtained from model (3.2) are the same as those obtained from the sub-models

$$y = \mu 1_n + X_1\beta + \varepsilon,$$

and

$$y = \mu 1_n + X_2\tau + \varepsilon.$$

That is, the presence or absence of the block parameters does not affect the estimator of the treatment parameters (and vice versa).
A condition for $\beta$ and $\tau$ to be estimated orthogonally can be derived from the normal equations (3.4)-(3.6). Firstly, we premultiply (3.4) by $\frac{1}{n}X_1^{\mathrm{T}}1_n$ and subtract it from (3.5):

$$\left(X_1^{\mathrm{T}}1_n - X_1^{\mathrm{T}}1_n\right)\hat{\mu} + \left(X_1^{\mathrm{T}}X_1 - \frac{1}{n}X_1^{\mathrm{T}}1_n1_n^{\mathrm{T}}X_1\right)\hat{\beta} + \left(X_1^{\mathrm{T}}X_2 - \frac{1}{n}X_1^{\mathrm{T}}1_n1_n^{\mathrm{T}}X_2\right)\hat{\tau}$$
$$= X_1^{\mathrm{T}}\left(I_n - \frac{1}{n}1_n1_n^{\mathrm{T}}\right)X_1\hat{\beta} + X_1^{\mathrm{T}}\left(I_n - \frac{1}{n}1_n1_n^{\mathrm{T}}\right)X_2\hat{\tau}$$
$$= X_1^{\mathrm{T}}\left(I_n - \frac{1}{n}1_n1_n^{\mathrm{T}}\right)y. \quad (3.19)$$
Secondly, we premultiply (3.4) by $\frac{1}{n}X_2^{\mathrm{T}}1_n$ and subtract it from (3.6):

$$X_2^{\mathrm{T}}\left(I_n - \frac{1}{n}1_n1_n^{\mathrm{T}}\right)X_1\hat{\beta} + X_2^{\mathrm{T}}\left(I_n - \frac{1}{n}1_n1_n^{\mathrm{T}}\right)X_2\hat{\tau} = X_2^{\mathrm{T}}\left(I_n - \frac{1}{n}1_n1_n^{\mathrm{T}}\right)y. \quad (3.20)$$

5Of course, the CRD has seven more degrees of freedom for estimating $\sigma^2$, as block effects do not require estimation.
For equations (3.19) and (3.20) to be independent, we require

$$X_2^{\mathrm{T}}\left(I_n - \frac{1}{n}1_n1_n^{\mathrm{T}}\right)X_1 = 0_{t\times b}.$$

Hence, we obtain the following condition on the incidence matrix $N = X_2^{\mathrm{T}}X_1$ for a block design to be orthogonal:

$$N = \frac{1}{n}X_2^{\mathrm{T}}1_n1_n^{\mathrm{T}}X_1 \quad (3.21)$$
$$= \frac{1}{n}qk^{\mathrm{T}}, \quad (3.22)$$

where $q^{\mathrm{T}} = (n_1, \ldots, n_t)$ is the vector of treatment replications and $k^{\mathrm{T}} = (k_1, \ldots, k_b)$ is the vector of block sizes.

The most common orthogonal block design for unstructured treatments is the RCBD, which has $k_j = t$, $n_{ij} = 1$ and $n_i = b$, so that

$$N = 1_{t \times b} = \frac{1}{n}qk^{\mathrm{T}}. \quad (3.23)$$
Hence, the condition for orthogonality is met. In an orthogonal design, such
as a RCBD, all information about the treatment comparisons is contained in
comparisons made within blocks. For more complex blocking structures, such as
incomplete block designs, this is not the case. We shall see orthogonal blocking
again in Chapter 5.
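The orthogonality condition is easy to check numerically from an incidence matrix; a small sketch comparing the two examples (the helper function name is ours):

```r
# Does N equal (treatment replications) x (block sizes)^T / n?
is_orthogonal <- function(N) {
  reps <- rowSums(N)   # treatment replications n_1, ..., n_t
  sizes <- colSums(N)  # block sizes k_1, ..., k_b
  isTRUE(all.equal(N, outer(reps, sizes) / sum(N)))
}

# Example 3.1 (RCBD): every treatment occurs once in every block
N1 <- matrix(1, nrow = 4, ncol = 8)
is_orthogonal(N1) # TRUE

# Example 3.2 (incomplete design): the condition fails
N2 <- matrix(c(1, 1, 1, 0,
               1, 1, 0, 1,
               1, 0, 1, 1,
               0, 1, 1, 1), nrow = 4, byrow = TRUE)
is_orthogonal(N2) # FALSE
```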
3.6 Balanced incomplete block designs
When the block sizes are less than the number of treatments, i.e. $k_j < t$ for all $j = 1, \ldots, b$, by necessity the design is incomplete, in that not all treatments can be allocated to every block. We will now restrict ourselves to considering binary designs with common block size $k < t$. In a binary design, each treatment occurs within each block either zero times or once ($n_{ij} = 0$ or $n_{ij} = 1$).
Example 3.2 is an example of an incomplete design with $k = 3 < t = 4$. For incomplete designs, it is often useful to study the treatment concurrence matrix, given by $NN^{\mathrm{T}}$.
N <- matrix(
c(1, 1, 1, 0,
1, 1, 0, 1,
1, 0, 1, 1,
0, 1, 1, 1),
nrow = 4, byrow = T
)
N %*% t(N)
## [,1] [,2] [,3] [,4]
## [1,] 3 2 2 2
## [2,] 2 3 2 2
## [3,] 2 2 3 2
## [4,] 2 2 2 3
This matrix has the number of treatment replications, $n_i$, on the diagonal, and the off-diagonal elements are equal to the number of blocks within which each pair of treatments occurs together. We will denote by $\lambda_{ii'}$ the number of blocks that contain both treatment $i$ and treatment $i'$. For Example 3.2, $\lambda_{ii'} = 2$ for all $i \neq i'$; that is, each pair of treatments occurs together in two blocks.
Definition 3.1. A balanced incomplete block design (BIBD) is an incomplete block design with $k < t$ that meets three requirements:

1. The design is binary.
2. Each treatment is applied to a unit in the same number of blocks. It follows that the common number of units applied to each treatment must be $n_i = r = bk/t$ ($i = 1, \ldots, t$), where $n = bk$. (Sometimes referred to as first-order balance.)
3. Each pair of treatments is applied to two units in the same number of blocks, that is, $\lambda_{ii'} = \lambda$. (Sometimes referred to as second-order balance.)
In fact, we can deduce that $\lambda(t - 1) = r(k - 1)$. To see this, focus on treatment 1. This treatment occurs in $r$ blocks and, in each of these blocks, it occurs together with $k - 1$ other treatments. But also, treatment 1 occurs $\lambda$ times with each of the other $t - 1$ treatments. Hence $r(k - 1) = \lambda(t - 1)$, or $\lambda = r(k - 1)/(t - 1)$.

The design in Example 3.2 is a BIBD with $t = 4$, $k = 3$, $b = 4$, $r = 4 \times 3/4 = 3$ and $\lambda = 3 \times (3 - 1)/(4 - 1) = 2$.
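These relationships can be checked directly from the incidence matrix of Example 3.2:

```r
# Incidence matrix of the tyre design (rows = treatments, columns = blocks)
N <- matrix(c(1, 1, 1, 0,
              1, 1, 0, 1,
              1, 0, 1, 1,
              0, 1, 1, 1), nrow = 4, byrow = TRUE)
conc <- N %*% t(N)                      # treatment concurrence matrix
n_trt <- nrow(N)                        # t = 4 treatments
k <- unique(colSums(N))                 # common block size k = 3
r <- unique(rowSums(N))                 # common replication r = 3
lambda <- unique(conc[upper.tri(conc)]) # common concurrence lambda = 2

r * (k - 1) == lambda * (n_trt - 1)     # TRUE: r(k - 1) = lambda(t - 1)
```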
3.6.1 Construction of BIBDs
BIBDs do not exist for all combinations of values of $t$, $b$ and $k$. In particular, we must ensure that

• $r = bk/t$ is an integer, and
• $\lambda = r(k - 1)/(t - 1)$ is an integer.

In general, we can always construct a BIBD for $t$ treatments in $b = \binom{t}{k}$ blocks of size $k$, although it may not be the smallest possible BIBD. Such a design will have $r = \binom{t-1}{k-1}$ and $\lambda = \binom{t-2}{k-2}$. The design in Example 3.2 was constructed this way, with $t = 4$, $k = 3$ and $\lambda = 2$.
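The all-subsets construction can be sketched with combn, which lists every size-$k$ subset of the treatment labels; each subset forms one block. For $t = 4$ and $k = 3$ this reproduces the four blocks of the tyre design:

```r
# Columns of combn(t, k) are the choose(t, k) blocks of the design
blocks <- combn(4, 3)
blocks
##      [,1] [,2] [,3] [,4]
## [1,]    1    1    1    2
## [2,]    2    2    3    3
## [3,]    3    4    4    4

choose(4 - 1, 3 - 1) # r = choose(t - 1, k - 1) = 3
choose(4 - 2, 3 - 2) # lambda = choose(t - 2, k - 2) = 2
```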
3.6.2 Reduced normal equations

It can be shown that the reduced normal equations (3.10) for a BIBD can be written as

$$\frac{\lambda t}{k}\left(I_t - \frac{1}{t}1_t1_t^{\mathrm{T}}\right)\hat{\tau} = \left(X_2^{\mathrm{T}} - \frac{1}{k}NX_1^{\mathrm{T}}\right)y. \quad (3.24)$$

Equation (3.24) defines a series of equations of the form

$$\hat{\tau}_i - \bar{\hat{\tau}} = \frac{k}{\lambda t}\left(\sum_{j=1}^{b}n_{ij}y_{ij} - \frac{1}{k}\sum_{j=1}^{b}n_{ij}\sum_{i'=1}^{t}n_{i'j}y_{i'j}\right), \quad i = 1, \ldots, t,$$

where $y_{ij}$ denotes the response to treatment $i$ in block $j$ (when it occurs). Hence, as with the CRD and RCBD, we can estimate contrasts in the $\tau_i$, with estimator

$$c^{\mathrm{T}}\hat{\tau} = \frac{k}{\lambda t}\sum_{i=1}^{t}c_iQ_i,$$

with $Q_i = \sum_{j=1}^{b}n_{ij}y_{ij} - \frac{1}{k}\sum_{j=1}^{b}n_{ij}\sum_{i'=1}^{t}n_{i'j}y_{i'j}$ the adjusted total for treatment $i$.
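We can sanity check this estimator numerically against a direct least squares fit, using the tyre data of Table 3.2 (rebuilt here so the snippet stands alone). The adjusted treatment totals, computed directly, reproduce the contrast estimates from lm:

```r
tyre <- data.frame(compound = factor(c(1, 2, 3, 1, 2, 4, 1, 3, 4, 2, 3, 4)),
                   block = rep(factor(1:4), rep(3, 4)),
                   wear = c(238, 238, 279, 196, 213, 308,
                            254, 334, 367, 312, 421, 412))
k <- 3; lambda <- 2; n_trt <- 4

# Adjusted treatment totals Q_i: treatment total minus (1/k) times the sum
# of the totals of the blocks containing that treatment
trt_total <- tapply(tyre$wear, tyre$compound, sum)
blk_total <- tapply(tyre$wear, tyre$block, sum)
N <- table(tyre$compound, tyre$block)  # incidence matrix
Q <- trt_total - as.vector(N %*% blk_total) / k

# tau_1 - tau_2 via the reduced normal equations ...
est_rne <- (k / (lambda * n_trt)) * (Q[[1]] - Q[[2]])

# ... and via a direct fit (lm sets tau_1 = 0, so tau_1 - tau_2 = -coef)
tyre.lm <- lm(wear ~ block + compound, data = tyre)
est_lm <- -coef(tyre.lm)[["compound2"]]
all.equal(est_rne, est_lm) # TRUE: both give -4.375
```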
3.7 Exercises
1. Consider the randomised complete block design below for comparing two catalysts, A and B, for a chemical reaction using six batches of material. The response is the yield (%) from the reaction.
Catalyst Batch 1 Batch 2 Batch 3 Batch 4 Batch 5 Batch 6
A 9 19 28 22 18 8
B 10 22 30 21 23 12
i. Write down a unit-block-treatment model for this example.
ii. Test if there is a significant difference between catalysts at the 5%
level.
iii. Fit a unit-treatment model ignoring blocks and test again for a difference between catalysts. Comment on the difference between this analysis and the one including blocks.

Table 3.5: Blackcurrant experiment: yield (lbs) from six different fertilizers.

Block  Fertilizer 1  Fertilizer 2  Fertilizer 3  Fertilizer 4  Fertilizer 5  Fertilizer 6
1      14.5          13.5          11.5          13.0          15.0          12.5
2      12.0          10.0          11.0          13.0          12.0          13.5
3      9.0           9.0           14.0          13.5          8.0           14.0
4      6.5           8.5           10.0          7.5           7.0           8.0
2. Consider the data below obtained from an agricultural experiment in
which six different fertilizers were given to a crop of blackcurrants in a field.
The field was divided into four equal areas of land so that the land in each
area was fairly homogeneous. Each area of land was further subdivided
into six plots and one of the fertilizers, chosen by a random procedure,
was applied to each plot. The yields of blackcurrants obtained from each
plot were recorded (in lbs) and are given in Table 3.5. In this randomised
block design the treatments are the six fertilizers and the blocks are the
four areas of land.
blackcurrant <- data.frame(fertilizer = rep(factor(1:6), 4),
                           block = rep(factor(1:4), rep(6, 4)),
                           yield = c(14.5, 13.5, 11.5, 13.0, 15.0, 12.5,
                                     12.0, 10.0, 11.0, 13.0, 12.0, 13.5,
                                     9.0, 9.0, 14.0, 13.5, 8.0, 14.0,
                                     6.5, 8.5, 10.0, 7.5, 7.0, 8.0)
)
knitr::kable(
  tidyr::pivot_wider(blackcurrant, names_from = fertilizer, values_from = yield),
  col.names = c("Block", paste("Fertilizer", 1:6)),
  caption = "Blackcurrant experiment: yield (lbs) from six different fertilizers."
)
Conduct a full analysis of this experiment, including
a. exploratory data analysis;
b. fitting an appropriate linear model, and conducting an F-test to com-
pare a model that explains variation between the six fertilizers to the
model only containing blocks;
c. linear model diagnostics;
d. if appropriate, multiple comparisons of all pairwise differences be-
tween treatments.
Chapter 4
Factorial experiments
Chapter 5
Blocking in factorial designs
Chapter 6
Fractional factorial designs
Chapter 7
Response surface
methodology
Chapter 8
Optimal design of
experiments
Bibliography
Davies, O. L., editor (1954). The Design and Analysis of Industrial Experiments.
Oliver and Boyd, London.
Fisher, R. A. (1935). The Design of Experiments. Oliver and Boyd, Edinburgh.
Kocaoz, S., Samaranayake, V. A., and Nani, A. (2005). Tensile characterization
of glass FRP bars. Composites: Part B, 36:127–134.
Luca, M. and Bazerman, M. H. (2020). The Power of Experiments: Decision
Making in a Data-Driven World. MIT Press, Cambridge.
Morris, M. D. (2011). Design of Experiments: An Introduction based on Linear
Models. Chapman and Hall/CRC Press, Boca Raton.
Wu, C. F. J. and Hamada, M. (2009). Experiments: Planning, Analysis, and
Parameter Design Optimization. Wiley, New York, 2nd edition.