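STA303H1 is a statistics course aimed at international students, designed to help students understand the basic principles and methods of statistics. In this course, students learn how to use statistical data to solve practical problems and how to evaluate hypotheses through statistical analysis.
Course content includes statistical inference, probability distributions, linear regression models, and categorical data analysis. After completing the course, students will be able to design and carry out statistical analyses independently and apply what they have learned to practical problems.
Our training institution has extensive teaching experience and an excellent team of instructors, and can provide students with comprehensive, systematic course design and individualized teaching plans. Through our course training, students gain solid academic knowledge and practical skills, laying a firm foundation for their future academic research and careers.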
Methods of Data Analysis II
STA303H1-S
Randomized Complete Block Design and Two-Way
Analysis of Variance
Module 4
Dr. Esam Mahdi
February 1, 2023
Dr.
Esam Mahdi Methods of Data Analysis II STA303H1-S Randomized Complete
Block Design and Two-Way Analysis of Variance Module 4February 1, 2023 1
/ 29
Learning Objectives
By the end of this lecture, you should be able to do the following:
1 Understand the randomized block design principle.
2 Understand how to perform two-way analysis of variance (ANOVA).
3 Understand when and how to use Bonferroni and Tukey post hoc tests in
randomized block design ANOVA.
4 Understand when and how to use Bonferroni and Tukey post hoc tests in
two-way ANOVA.
5 Perform randomized complete block design and two-way ANOVA procedures
using R with some real-life applications.
Randomized Block Design (RBD)
Blocking is a technique for dealing with nuisance factors.
A nuisance factor is a variable that might have some effect on the response
variable but is of no interest to the experimenter. However, the experimenter
still needs to minimize (control) the effect of this nuisance factor on the
response.
By using a blocking design, the experimenter tries to keep this nuisance factor
from interfering with the effect of the factor of research interest.
For example, if an experimenter wishes to investigate the impact of different
production methods on the number of defective light bulbs produced in a day,
then the machine operators might be seen as a nuisance factor.
If the nuisance variable is known and controllable, we use blocking.
If the nuisance factor is unknown and uncontrollable, we hope that
randomization balances out its impact across the treatments.
What is Randomized Block Design (RBD)?
Experimental units are grouped so that measurements within a group are expected to be more homogeneous than measurements from different groups. Such
groups are called blocks, and the treatments (factor levels) are assigned randomly to the units within each block.
1 Randomized complete block design (RCBD): each treatment level occurs exactly once within each block. Thus, every cell of the incidence matrix, which defines the design of the experiment, is 1, indicating that the treatment occurs in that block (level of the nuisance factor). For example, the incidence matrix of a complete block design is:

              Block 1   Block 2   Block 3   Block 4
Treatment 1      1         1         1         1
Treatment 2      1         1         1         1
Treatment 3      1         1         1         1

2 Randomized incomplete block design: some blocks lack some treatment levels. That is, if we have a treatments and b blocks of size k (the number of units that each block can hold), then k < a. In this case, the incidence matrix consists of 0's and 1's indicating whether or not each treatment occurs in each block. For example, the incidence matrix of an incomplete block design with a = 3, b = 4 and k = 2 is:

              Block 1   Block 2   Block 3   Block 4
Treatment 1      1         1         0         1
Treatment 2      0         1         1         1
Treatment 3      1         0         1         0

Note: When there are two or more observations per cell, the design is called a two-way ANOVA.
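As a rough illustration, the two incidence matrices above can be entered in R and checked programmatically. The object names (`complete`, `incomplete`) and the helper `is_complete()` are hypothetical, introduced only for this sketch.

```r
# Incidence matrices: rows are treatments, columns are blocks.
# A 1 means the treatment occurs in that block.
complete <- matrix(1, nrow = 3, ncol = 4,
                   dimnames = list(paste("Treatment", 1:3),
                                   paste("Block", 1:4)))

incomplete <- matrix(c(1, 1, 0, 1,
                       0, 1, 1, 1,
                       1, 0, 1, 0),
                     nrow = 3, byrow = TRUE,
                     dimnames = dimnames(complete))

# Hypothetical helper: a design is complete when every cell equals 1.
is_complete <- function(inc) all(inc == 1)

is_complete(complete)     # TRUE
is_complete(incomplete)   # FALSE

# Block size k = number of treatments appearing in each block.
colSums(incomplete)       # 2 in every block, so k = 2 < a = 3
```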
Extension of the ANOVA to the RCBD
The model
The ANOVA model can be extended to the additive RCBD statistical model as follows:
$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \quad i = 1, 2, \dots, a, \quad j = 1, 2, \dots, b \tag{1}$$
where $y_{ij}$ is the response observation of the ith treatment in the jth block.
$\mu$ is the overall mean, and $\mu_i = \mu + \tau_i$ is the mean of the ith treatment.
$\tau_i$ is the additive effect of the ith treatment.
$\beta_j$ is the additive effect of the jth block.
$\varepsilon_{ij}$ is the random error term, assumed to be normally distributed with mean zero
and constant variance $\sigma^2$.
The assumptions
The response within each treatment/block is normally distributed.
The variance of the response variable is constant across all treatments and blocks.
No interaction between the blocks and treatments (their effects are additive).
How to Implement Randomized Complete Block Design (RCBD)?
Step 1: Divide data into b blocks.
Step 2: Divide each block into a number of units equal to the number of treatments (say a).
Step 3: Within each block, the treatments are assigned at random so that a different treatment is applied to each unit. That
is, all treatments are observed within each block (i.e., each cell has only one observation).
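
              Block 1             Block 2             ...   Block b             Treatment means
Treatment 1   y_{11}              y_{12}              ...   y_{1b}              \bar{y}_{1\cdot}
Treatment 2   y_{21}              y_{22}              ...   y_{2b}              \bar{y}_{2\cdot}
...           ...                 ...                 ...   ...                 ...
Treatment a   y_{a1}              y_{a2}              ...   y_{ab}              \bar{y}_{a\cdot}
Block means   \bar{y}_{\cdot 1}   \bar{y}_{\cdot 2}   ...   \bar{y}_{\cdot b}

The means of the ith treatment and jth block are given by
$$\bar{y}_{i\cdot} = \frac{1}{b}\sum_{j=1}^{b} y_{ij} \quad \text{and} \quad \bar{y}_{\cdot j} = \frac{1}{a}\sum_{i=1}^{a} y_{ij}, \quad \text{respectively.}$$
The grand mean is given by
$$\bar{y}_{\cdot\cdot} = \bar{y} = \frac{1}{a}\sum_{i=1}^{a} \bar{y}_{i\cdot} = \frac{1}{b}\sum_{j=1}^{b} \bar{y}_{\cdot j} = \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}$$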
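Steps 1 to 3 and the mean formulas can be sketched in R as follows. This is only an illustration under stated assumptions: the sizes `a` and `b`, the treatment labels, the seed, and the simulated responses are all made up, and the object names are hypothetical.

```r
set.seed(303)  # arbitrary seed, only so the sketch is reproducible

a <- 4  # number of treatments (hypothetical)
b <- 3  # number of blocks (hypothetical)
treatments <- paste0("T", seq_len(a))

# Steps 1-3: each block has a units; permute the treatments within each block.
design <- do.call(rbind, lapply(seq_len(b), function(j) {
  data.frame(block = j, unit = seq_len(a), treatment = sample(treatments))
}))

# Every treatment should occur exactly once in every block.
table(design$treatment, design$block)

# Simulated responses (made-up numbers) to illustrate the means.
design$y <- round(rnorm(a * b, mean = 10, sd = 2), 1)
trt_means   <- tapply(design$y, design$treatment, mean)
block_means <- tapply(design$y, design$block, mean)
grand_mean  <- mean(design$y)

# In a complete design the grand mean equals the mean of either set of means.
all.equal(mean(trt_means), grand_mean)
all.equal(mean(block_means), grand_mean)
```

Randomizing separately inside each block, rather than over all units at once, is what guarantees that every block contains every treatment.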
Sum of Squares for RCBD Model
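$SS_{tr}$ measures the amount of between-treatment variability:
$$SS_{tr} = b\sum_{i=1}^{a} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$$
$SS_{blk}$ measures the amount of variability due to the blocks:
$$SS_{blk} = a\sum_{j=1}^{b} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2$$
$SS_{cor}$ measures the total amount of variability:
$$SS_{cor} = \sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \bar{y}_{\cdot\cdot})^2$$
$SS_{res}$ measures the amount of variability due to error:
$$SS_{res} = SS_{cor} - SS_{tr} - SS_{blk}$$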
ANOVA Table Display for the RCBD
Table: Analysis of variance for RCBD

Source       SS      df               MS                               F
Treatments   SStr    a − 1            MST = SStr / (a − 1)             Ftr = MST / MSE
Blocks       SSblk   b − 1            MSB = SSblk / (b − 1)            Fblk = MSB / MSE
Error        SSres   (a − 1)(b − 1)   MSE = SSres / ((a − 1)(b − 1))
Total        SScor   ab − 1 = n − 1
Test for treatment effects
$H_0$: No difference between treatment effects (equivalently $\mu_1 = \dots = \mu_a = \mu$, or $\tau_1 = \dots = \tau_a = 0$)
$H_a$: At least two treatment effects differ (equivalently $\mu_i \neq \mu_h$ for some $i \neq h$, or $\tau_i \neq 0$ for at least one $i$)
The F-test rejects the null hypothesis at the α level of significance if
$$F_{tr} = \frac{SS_{tr}/(a-1)}{SS_{res}/((a-1)(b-1))} > F_{a-1,\,(a-1)(b-1)}(\alpha)$$
where $F_{a-1,\,(a-1)(b-1)}(\alpha)$ is the upper 100α-th percentile of the F-distribution with $a-1$ and
$(a-1)(b-1)$ degrees of freedom.
Test for block effects
$H_0$: No difference between block effects
$H_a$: At least two block effects differ
The F-test rejects the null hypothesis at the α level of significance if
$$F_{blk} = \frac{SS_{blk}/(b-1)}{SS_{res}/((a-1)(b-1))} > F_{b-1,\,(a-1)(b-1)}(\alpha)$$
where $F_{b-1,\,(a-1)(b-1)}(\alpha)$ is the upper 100α-th percentile of the F-distribution with $b-1$ and
$(a-1)(b-1)$ degrees of freedom.
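The critical values in these rejection rules can be obtained in R with `qf()` instead of a printed table. The sketch below assumes a design with a = 4 treatments and b = 3 blocks; the helper `f_test()` and the observed statistic 6.2 are hypothetical and serve only as an illustration.

```r
a <- 4       # number of treatments (assumed for illustration)
b <- 3       # number of blocks (assumed for illustration)
alpha <- 0.05

df_tr  <- a - 1
df_blk <- b - 1
df_res <- (a - 1) * (b - 1)

# Upper-alpha critical values of the F distribution.
crit_tr  <- qf(1 - alpha, df_tr,  df_res)
crit_blk <- qf(1 - alpha, df_blk, df_res)
round(c(treatments = crit_tr, blocks = crit_blk), 3)

# Hypothetical helper: decision and p-value for an observed F statistic.
f_test <- function(f_obs, df1, df2, alpha = 0.05) {
  list(reject  = f_obs > qf(1 - alpha, df1, df2),
       p_value = pf(f_obs, df1, df2, lower.tail = FALSE))
}

f_test(f_obs = 6.2, df1 = df_tr, df2 = df_res)  # 6.2 is a made-up statistic
```

Comparing the statistic with the critical value and comparing the p-value with α are equivalent ways of reaching the same decision.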
Example 1
Suppose an experimenter wishes to study the impact of four different production methods on
the daily production of defective light bulbs. Assume that there are three machine operators and
the experimenter decides to block on this factor. Thus, we have 12 observations of the daily
production of defective light bulbs, given in the following table.

                                 Machine Operator (Block)
Production Method (Treatment)    1       2       3       Treatment mean
1                                8       12      11      10.3333
2                                9       12      10      10.3333
3                                3       7       5       5.0000
4                                4       5       5       4.6667
Block mean                       6.00    9.00    7.75    Grand mean = 7.5833

Perform RCBD to test for
H0 : no differences between the treatment effects vs
Ha : at least two treatment effects differ
and
H0 : no differences between the block effects vs
Ha : at least two block effects differ
Example 1 (cont.) Calculate sum of squares for RCBD manually
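Step 1: Calculate $SS_{tr}$, which measures the amount of between-treatment variability:
$$\begin{aligned}
SS_{tr} &= b\sum_{i=1}^{a} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 \\
&= 3\left[(\bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot})^2 + (\bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot})^2 + (\bar{y}_{3\cdot} - \bar{y}_{\cdot\cdot})^2 + (\bar{y}_{4\cdot} - \bar{y}_{\cdot\cdot})^2\right] \\
&= 3\left[(10.3333 - 7.5833)^2 + (10.3333 - 7.5833)^2 + (5.0 - 7.5833)^2 + (4.6667 - 7.5833)^2\right] \\
&= 90.9167
\end{aligned}$$
Step 2: Calculate $SS_{blk}$, which measures the amount of variability due to the blocks:
$$\begin{aligned}
SS_{blk} &= a\sum_{j=1}^{b} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2 \\
&= 4\left[(\bar{y}_{\cdot 1} - \bar{y}_{\cdot\cdot})^2 + (\bar{y}_{\cdot 2} - \bar{y}_{\cdot\cdot})^2 + (\bar{y}_{\cdot 3} - \bar{y}_{\cdot\cdot})^2\right] \\
&= 4\left[(6.0 - 7.5833)^2 + (9.0 - 7.5833)^2 + (7.75 - 7.5833)^2\right] \\
&= 18.1667
\end{aligned}$$
Step 3: Calculate $SS_{cor}$, which measures the total amount of variability:
$$\begin{aligned}
SS_{cor} &= \sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \bar{y}_{\cdot\cdot})^2 \\
&= (8 - 7.5833)^2 + (12 - 7.5833)^2 + (11 - 7.5833)^2 + (9 - 7.5833)^2 + (12 - 7.5833)^2 + (10 - 7.5833)^2 \\
&\quad + (3 - 7.5833)^2 + (7 - 7.5833)^2 + (5 - 7.5833)^2 + (4 - 7.5833)^2 + (5 - 7.5833)^2 + (5 - 7.5833)^2 \\
&= 112.9167
\end{aligned}$$
Step 4: Calculate $SS_{res}$, which measures the amount of variability due to the error:
$$\begin{aligned}
SS_{res} &= SS_{cor} - SS_{tr} - SS_{blk} \\
&= 112.9167 - 90.9167 - 18.1667 \\
&= 3.8333
\end{aligned}$$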
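The four steps can be checked with a few lines of base R. This is a minimal sketch: the matrix `y` holds the Example 1 data with treatments in rows and blocks in columns, and the object names are chosen here for illustration only.

```r
# Rows are production methods (treatments), columns are operators (blocks).
y <- matrix(c(8, 12, 11,
              9, 12, 10,
              3,  7,  5,
              4,  5,  5), nrow = 4, byrow = TRUE)
a <- nrow(y)  # number of treatments
b <- ncol(y)  # number of blocks

grand  <- mean(y)
ss_tr  <- b * sum((rowMeans(y) - grand)^2)   # Step 1
ss_blk <- a * sum((colMeans(y) - grand)^2)   # Step 2
ss_cor <- sum((y - grand)^2)                 # Step 3
ss_res <- ss_cor - ss_tr - ss_blk            # Step 4

round(c(SStr = ss_tr, SSblk = ss_blk, SScor = ss_cor, SSres = ss_res), 4)
```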
Example 1 (cont.) Randomized block ANOVA in R
def_light <- c(8,9,3,4,12,12,7,5,11,10,5,5)
block <- factor(rep(1:3, c(4,4,4)))
levels(block) <- c("1", "2", "3")
prod_method <- factor(rep(1:4, 3))
levels(prod_method) <- letters[1:4]
dat <- data.frame(block, prod_method,def_light)
head(dat) # show the first 6 observations in our data frame
## block prod_method def_light
## 1 1 a 8
## 2 1 b 9
## 3 1 c 3
## 4 1 d 4
## 5 2 a 12
## 6 2 b 12
par(mfrow =c(1 , 2)) # See the plot in the next slide
plot(def_light ~ block + prod_method, data = dat,
ylab = "daily production of defective light bulbs")
par(mfrow =c(1 , 1))
Example 1 (cont.) Box-plot with treatment and block factors
Figure: Side-by-side box plots of the response (y-axis labelled "Hourly production of defective light bulbs", range about 4 to 12) against block (levels 1, 2, 3) and against prod_method (levels a, b, c, d).
The box-plots suggest that the levels of both block and treatment factors have different effects
on the mean daily production of defective light bulbs.
Example 1 (cont.) Randomized block ANOVA
anov1 <- aov(def_light ~ block + prod_method, data = dat)
summary(anov1)
## Df Sum Sq Mean Sq F value Pr(>F)
## block 2 18.17 9.083 14.22 0.005290 **
## prod_method 3 90.92 30.306 47.44 0.000143 ***
## Residuals 6 3.83 0.639
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the p-values and a significance level of 5%, we may conclude the following from the ANOVA
results:
The production methods (treatments) p-value is less than 5% (significant), indicating that we have
strong evidence that at least two production methods have different effects on the mean daily
production of defective light bulbs.
The machine operators (blocks) p-value is less than 5% (significant), indicating that we have strong
evidence that at least two machine operators have different effects on the mean daily production of
defective light bulbs.
Example 1 (cont.) Multiple Tukey pairwise comparisons
We can compute post hoc tests for pairwise comparisons between the treatment (or block) means,
as we did in one-way ANOVA. This can be done with either the Tukey HSD test, using the
function TukeyHSD(), or Bonferroni-corrected tests, using the function pairwise.t.test().
The Tukey simultaneous 100(1 − α)% confidence interval for $\mu_i - \mu_h$ is given by the formula
$$\bar{y}_{i\cdot} - \bar{y}_{h\cdot} \pm q_\alpha \sqrt{MSE/b}, \quad i, h = 1, 2, \dots, a, \quad i \neq h$$
where $q_\alpha$ is the upper α percentage point of the studentized range distribution for a means and (a − 1)(b − 1) degrees of freedom, from the
Studentized Range Table given in Module 3. MSE denotes the mean squared error.
TukeyHSD(anov1, which = "prod_method")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = def_light ~ block + prod_method, data = dat)
##
## $prod_method
## diff lwr upr p adj
## b-a -1.776357e-15 -2.259217 2.259217 1.0000000
## c-a -5.333333e+00 -7.592550 -3.074117 0.0007493
## d-a -5.666667e+00 -7.925883 -3.407450 0.0005356
## c-b -5.333333e+00 -7.592550 -3.074117 0.0007493
## d-b -5.666667e+00 -7.925883 -3.407450 0.0005356
## d-c -3.333333e-01 -2.592550 1.925883 0.9535148
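The half-width of these intervals can be reproduced from the formula by taking the studentized range quantile from `qtukey()` rather than from the table. The sketch below is an illustration only; it reuses SSres = 3.8333 and the treatment means from Example 1, and the object names are hypothetical.

```r
a   <- 4                               # number of production methods
b   <- 3                               # number of operators (blocks)
mse <- 3.8333 / ((a - 1) * (b - 1))    # SSres divided by its degrees of freedom

# Upper 5% point of the studentized range for a means and (a-1)(b-1) df.
q_alpha <- qtukey(0.95, nmeans = a, df = (a - 1) * (b - 1))

# Half-width shared by all pairwise intervals in this balanced design.
half_width <- q_alpha * sqrt(mse / b)

# Interval for mu_c - mu_a using the treatment means 5.0000 and 10.3333.
diff_ca <- 5.0000 - 10.3333
c(lower = diff_ca - half_width, upper = diff_ca + half_width)
```

Because every treatment appears once in each of the b blocks, all six intervals share the same half-width, which is why the `lwr` and `upr` columns above are each a constant distance from `diff`.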
Example 1 (cont.) Plot the simultaneous confidence intervals using R
plot(TukeyHSD(anov1))
Figure: Tukey 95% family-wise confidence intervals for the differences in mean levels of block (3-2, 3-1, 2-1) and for the differences in mean levels of prod_method (d-c, d-b, c-b, d-a, c-a, b-a).
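For the Bonferroni alternative mentioned for Example 1, a minimal sketch with `pairwise.t.test()` might look as follows. One caveat is stated as an assumption of this sketch: the function pools the standard deviation over the treatment groups only, so the block effect is not removed from the error term and the results need not agree with the Tukey intervals computed from the RCBD fit.

```r
def_light   <- c(8, 9, 3, 4, 12, 12, 7, 5, 11, 10, 5, 5)
prod_method <- factor(rep(letters[1:4], times = 3))

# Bonferroni-adjusted pairwise t tests between production methods.
# Caveat: the pooled SD ignores the blocking factor (see the lead-in).
bonf <- pairwise.t.test(def_light, prod_method,
                        p.adjust.method = "bonferroni")

bonf$p.value   # lower-triangular matrix of adjusted p-values
```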
Two-way versus one-way analysis of variance
In one-way ANOVA, we model the response variable by one independent categorical variable (one factor).
In two-way ANOVA, we model the response variable by two independent categorical variables (two factors), where
the interaction between these two factors is taken into account when we model the response variable.
Types of interactions and interpreting interaction plots
An interaction occurs when the effect of one factor (say A) depends on the level of another factor (say B). In this
case, the lines in the interaction plot are not parallel.
The further the lines are from parallel, the stronger the interaction.
If the lines in the interaction plot are perfectly parallel, then the two factors do not interact.
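The parallel-lines idea can be made concrete with a small numeric check. The cell means below are invented for illustration, and `interaction_contrast()` is a hypothetical helper; for a 2 x 2 table it returns the difference of differences, which is zero exactly when the two profile lines are parallel.

```r
# Hypothetical cell means: rows are levels of factor A, columns of factor B.
additive <- matrix(c(10, 14,
                     13, 17), nrow = 2, byrow = TRUE,
                   dimnames = list(A = c("a1", "a2"), B = c("b1", "b2")))
interacting <- matrix(c(10, 14,
                        13, 11), nrow = 2, byrow = TRUE,
                      dimnames = dimnames(additive))

# Difference of differences: change across B at a2 minus change across B at a1.
interaction_contrast <- function(m) (m[2, 2] - m[2, 1]) - (m[1, 2] - m[1, 1])

interaction_contrast(additive)      # 0: parallel lines, no interaction
interaction_contrast(interacting)   # -6: non-parallel lines, interaction
```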
Two-way ANOVA Model
The model
$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \tag{2}$$
where $y_{ijk}$ is the kth observation ($k = 1, \dots, n$) for the (i, j)th treatment, that is, the combination of the ith level of
factor A and the jth level of factor B.
$\mu$ is the overall (grand) mean, and $\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}$ is the mean of the (i, j)th
treatment.
$\alpha_i$ is the main effect for level i of factor A, where $i = 1, 2, \dots, a$.
$\beta_j$ is the main effect for level j of factor B, where $j = 1, 2, \dots, b$.
$(\alpha\beta)_{ij}$ is the interaction effect between A and B, which depends on both i and j.
The random errors $\varepsilon_{ijk}$ are iid $N(0, \sigma^2)$.
The assumptions
The response variable observations are independent.
The response variable is normally distributed with a mean that may depend on the levels of the
factors A and B, and a constant variance among all treatments.
Data for two-way ANOVA
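
Factor A                Factor B                                                                           Averages for factor A
                        1                        2                        ...   b
1                       y_{111}                  y_{121}                  ...   y_{1b1}
                        ...                      ...                      ...   ...                        \bar{y}_{1\cdot\cdot}
                        y_{11n}                  y_{12n}                  ...   y_{1bn}
2                       y_{211}                  y_{221}                  ...   y_{2b1}
                        ...                      ...                      ...   ...                        \bar{y}_{2\cdot\cdot}
                        y_{21n}                  y_{22n}                  ...   y_{2bn}
...                     ...                      ...                      ...   ...                        ...
a                       y_{a11}                  y_{a21}                  ...   y_{ab1}
                        ...                      ...                      ...   ...                        \bar{y}_{a\cdot\cdot}
                        y_{a1n}                  y_{a2n}                  ...   y_{abn}
Averages for factor B   \bar{y}_{\cdot 1\cdot}   \bar{y}_{\cdot 2\cdot}   ...   \bar{y}_{\cdot b\cdot}

A particular combination of levels is called a treatment or a cell. There are a × b treatment combinations and each
cell has n observations. The total number of observations is N = abn.
The overall (grand) mean $\mu$ is estimated by $\bar{y}_{\cdot\cdot\cdot} = \frac{1}{abn}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}$.
The effect of the ith level of factor A, $\alpha_i$, is estimated by $\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot} = \frac{1}{bn}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk} - \bar{y}_{\cdot\cdot\cdot}$.
The effect of the jth level of factor B, $\beta_j$, is estimated by $\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot} = \frac{1}{an}\sum_{i=1}^{a}\sum_{k=1}^{n} y_{ijk} - \bar{y}_{\cdot\cdot\cdot}$.
The effect of the (i, j)th interaction, $(\alpha\beta)_{ij}$, is estimated by $\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot}$.
The mean of the observations in cell (i, j) is $\bar{y}_{ij\cdot} = \frac{1}{n}\sum_{k=1}^{n} y_{ijk}$.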
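These estimators can be sketched in R on a small balanced data set. Everything in the block is an assumption made for illustration: the design sizes, the seed, the simulated responses, and the object names such as `alpha_hat` and `ab_hat`.

```r
set.seed(1)                      # arbitrary seed
a <- 3; b <- 2; n <- 4           # hypothetical design sizes
dat2 <- expand.grid(rep = seq_len(n), A = factor(seq_len(a)),
                    B = factor(seq_len(b)))
dat2$y <- rnorm(nrow(dat2), mean = 50, sd = 5)   # made-up responses

grand   <- mean(dat2$y)                                   # grand mean
mean_A  <- as.vector(tapply(dat2$y, dat2$A, mean))        # factor A level means
mean_B  <- as.vector(tapply(dat2$y, dat2$B, mean))        # factor B level means
mean_AB <- tapply(dat2$y, list(dat2$A, dat2$B), mean)     # cell means (a x b)

alpha_hat <- mean_A - grand
beta_hat  <- mean_B - grand
# Interaction estimate: cell mean minus both marginal means plus grand mean.
ab_hat <- mean_AB - outer(mean_A, mean_B, "+") + grand

# In a balanced design the estimated effects sum to zero (up to rounding).
sum(alpha_hat)
sum(beta_hat)
rowSums(ab_hat)
colSums(ab_hat)
```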
Sum of Squares for Two-way ANOVA - Manual Calculating (ugh!)
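Sum of Squares for factor A: Measures the variation in the response due to the different levels of factor A.
$$SS_A = \sum_{i,j,k} \hat{\alpha}_i^2 = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = bn\sum_{i=1}^{a} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2, \qquad df = a - 1$$
Sum of Squares for factor B: Measures the variation in the response due to the different levels of factor B.
$$SS_B = \sum_{i,j,k} \hat{\beta}_j^2 = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = an\sum_{j=1}^{b} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2, \qquad df = b - 1$$
Interaction Sum of Squares: Measures the variation in the response due to the interaction between factors A and B.
$$SS_{AB} = \sum_{i,j,k} \widehat{(\alpha\beta)}_{ij}^2 = n\sum_{i,j} \widehat{(\alpha\beta)}_{ij}^2 = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot\cdot})^2, \qquad df = (a-1)(b-1)$$
Error or Residual Sum of Squares: Measures the variation in the response within the a × b factor combinations.
$$SS_E = \sum_{i,j,k} \hat{\varepsilon}_{ijk}^2 = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2, \qquad df = ab(n-1)$$
Total Sum of Squares: Measures the overall variability in the data.
$$SS_T = SS_A + SS_B + SS_{AB} + SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2, \qquad df = abn - 1$$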
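To avoid computing these by hand, the decomposition can be checked numerically. The sketch below uses simulated balanced data (design sizes, seed, and responses are made up) and compares the hand-computed sums of squares with the ones reported by `aov()`; the object names are hypothetical.

```r
set.seed(2)                      # arbitrary seed
a <- 3; b <- 2; n <- 4           # hypothetical design sizes
sim <- expand.grid(rep = seq_len(n), A = factor(seq_len(a)),
                   B = factor(seq_len(b)))
sim$y <- rnorm(nrow(sim), mean = 50, sd = 5)     # made-up responses

grand <- mean(sim$y)
m_A  <- ave(sim$y, sim$A)            # factor A level mean, repeated per row
m_B  <- ave(sim$y, sim$B)            # factor B level mean, repeated per row
m_AB <- ave(sim$y, sim$A, sim$B)     # cell mean, repeated per row

ss <- c(A  = sum((m_A - grand)^2),
        B  = sum((m_B - grand)^2),
        AB = sum((m_AB - m_A - m_B + grand)^2),
        E  = sum((sim$y - m_AB)^2))
sst <- sum((sim$y - grand)^2)

all.equal(sum(ss), sst)              # the decomposition holds

# Cross-check against aov(); its rows are A, B, A:B, Residuals.
fit <- aov(y ~ A * B, data = sim)
all.equal(unname(ss), summary(fit)[[1]][["Sum Sq"]])
```

The agreement with `aov()` relies on the design being balanced; with unequal cell sizes the sequential sums of squares reported by `aov()` depend on the order of the terms.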
Two-way ANOVA Table
Table: Two-way analysis of variance

Source        SS      df               MS                               F
Factor A      SSA     a − 1            MSA = SSA / (a − 1)              FA = MSA / MSE
Factor B      SSB     b − 1            MSB = SSB / (b − 1)              FB = MSB / MSE
Interaction   SSAB    (a − 1)(b − 1)   MSAB = SSAB / ((a − 1)(b − 1))   FAB = MSAB / MSE
Error         SSE     ab(n − 1)        MSE = SSE / (ab(n − 1))
Total         SST     abn − 1
Test for factor A effect
$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_a = 0$
$H_a: \alpha_i \neq 0$ for at least one $i$
The F-test rejects the null hypothesis at the α level of significance if $F_A > F_{a-1,\,ab(n-1)}(\alpha)$.
Test for factor B effect
$H_0: \beta_1 = \beta_2 = \dots = \beta_b = 0$
$H_a: \beta_j \neq 0$ for at least one $j$
The F-test rejects the null hypothesis at the α level of significance if $F_B > F_{b-1,\,ab(n-1)}(\alpha)$.
Test for interaction effect
$H_0: (\alpha\beta)_{ij} = 0$ for all $(i, j)$
$H_a: (\alpha\beta)_{ij} \neq 0$ for at least one $(i, j)$
The F-test rejects the null hypothesis at the α level of significance if $F_{AB} > F_{(a-1)(b-1),\,ab(n-1)}(\alpha)$.
Example 2
Consider the data set grip available from the package UsingR. The study
investigated the effect of grip type on upper-body power. Perform a two-way analysis
of variance using UBP (the measurement of upper-body power) as the response
variable, and person (one of four skiers) and grip.type as the categorical
independent variables.
library("UsingR")
data(grip)
# Box plot of UBP for every person-by-grip-type combination
boxplot(UBP ~ person * grip.type, data = grip, frame = FALSE,
        col = c("#00AFBB", "#A7B800", "#33B8FB", "#E7B800"),
        xlab = "Skiers: Grip type", ylab = "Upper-body power")
# Two-way interaction plot: mean UBP per skier, one line per grip type
with(grip, interaction.plot(x.factor = person, trace.factor = grip.type,
                            response = UBP, fun = mean, type = "b",
                            legend = TRUE, xlab = "Skiers",
                            ylab = "Upper-body power", pch = 1:4, col = 1:4))
Example 2 (cont.) Box-plot and interaction plot
Figure: Box plot (top) for the skier and grip type variables, and mean response of upper-body power
(bottom) based on the interaction between the levels of skier and grip type.
Example 2 (cont.) Two-way ANOVA assuming that the two factors do not interact
To model the response variable using two-way ANOVA without an interaction effect, we fit the additive
model by using + in the formula passed to aov() as follows:
anov2 <- aov(UBP ~ person + grip.type, data = grip)
summary(anov2)
## Df Sum Sq Mean Sq F value Pr(>F)
## person 3 27.4 9.12 0.537 0.660334
## grip.type 2 339.2 169.59 9.995 0.000472 ***
## Residuals 30 509.0 16.97
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We may deduce from the ANOVA table that grip type is statistically significant, so the mean
upper-body power differs across its levels, whereas the variable person (the four skiers)
has no significant effect on upper-body power.
Example 2 (cont.) Two-way ANOVA assuming that the two factors interact
To model the response variable using two-way ANOVA with an interaction effect, we fit the model with
both main effects and their interaction by using * in the formula passed to aov() as follows:
anov3 <- aov(UBP ~ person * grip.type, data = grip)
summary(anov3)
## Df Sum Sq Mean Sq F value Pr(>F)
## person 3 27.4 9.12 0.452 0.7181
## grip.type 2 339.2 169.59 8.412 0.0017 **
## person:grip.type 6 25.2 4.19 0.208 0.9709
## Residuals 24 483.9 20.16
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the p-values and a significance level of 5%, we may conclude the following from the ANOVA:
The grip type p-value is less than 5% (significant), indicating that different levels of grip type are
associated with different mean upper-body power.
The person p-value is greater than 5% (not significant), so there is no evidence that the mean
upper-body power differs between the skiers.
The interaction between grip type and person has a p-value greater than 5% (not significant), so there
is no evidence that the effect of grip type on upper-body power depends on the skier (no
interaction).
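The two printed ANOVA tables are linked: the residual line of the additive model pools the interaction and residual lines of the interaction model. The sketch below checks this arithmetic using only the rounded values copied from the two tables, so small rounding differences are expected; the object names are hypothetical.

```r
# Values copied from the two printed ANOVA tables in Example 2.
ss_int <- 25.2;  df_int <- 6     # person:grip.type (interaction model)
ss_err <- 483.9; df_err <- 24    # Residuals (interaction model)
ss_add <- 509.0; df_add <- 30    # Residuals (additive model)

# The additive residual line pools the interaction and error lines.
c(pooled_ss = ss_int + ss_err, additive_ss = ss_add)
c(pooled_df = df_int + df_err, additive_df = df_add)

# F statistic and p-value for the interaction term.
f_int <- (ss_int / df_int) / (ss_err / df_err)
p_int <- pf(f_int, df_int, df_err, lower.tail = FALSE)
round(c(F = f_int, p = p_int), 3)
```

With the grip data loaded, `anova(anov2, anov3)` should carry out the same comparison directly from the two fitted models.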
Example 2 (cont.) Multiple Tukey pairwise comparisons
If any of the F-tests in the ANOVA table is significant, then you can compute post hoc tests for pairwise
comparisons between the group means, as we did in one-way ANOVA. This can be done with
either the Tukey HSD test, using the function TukeyHSD(), or Bonferroni-corrected tests, using the
function pairwise.t.test().
TukeyHSD(anov2, which = "grip.type")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = UBP ~ person + grip.type, data = grip)
##
## $grip.type
## diff lwr upr p adj
## integrated-classic 5.919209 1.773560 10.064859 0.0038891
## modern-classic -1.055374 -5.201024 3.090275 0.8062100
## modern-integrated -6.974584 -11.120233 -2.828934 0.0007237
Example 2 (cont.) Check for homoscedasticity, normality, influential
points, linearity, and independence of errors
par(mfrow=c(2,2))
plot(anov2)
par(mfrow=c(1,1))
[Four diagnostic panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Factor Levels (constant leverage, by person); observations 9, 4 and 5 are flagged in each panel.]
Figure: Diagnostic Plots for the ANOVA model in Example 2
Example 2 (cont.) Check the homogeneity of variances (Formal test)
A formal test that can be used to check the homogeneity of variances is Levene's test.
Levene's test is implemented in the function leveneTest() from the car package. It
is used to test:
H0 : The variances of the dependent variable are equal across the groups (homogeneous)
Ha : The variance of the dependent variable is not constant across the groups (heterogeneous)
library(car)
leveneTest(UBP ~ person * grip.type, data = grip)
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 11 0.2223 0.9937
## 24
We fail to reject the null hypothesis at the α = 0.05 significance level since the p-value
0.9937 > 0.05. We conclude that there is insufficient evidence to claim that the variances are
heterogeneous.
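To see what the test computes, the median-centred version (the `center = median` shown in the output above) can be sketched in base R as a one-way ANOVA on absolute deviations from the group medians. The data, seed, and group labels below are invented for illustration, and the object names are hypothetical.

```r
set.seed(3)                                         # arbitrary seed
g <- factor(rep(c("g1", "g2", "g3"), each = 10))    # hypothetical groups
y <- rnorm(30, mean = 165, sd = 4)                  # made-up responses

# Absolute deviations from each group's median (center = median).
z <- abs(y - ave(y, g, FUN = median))

# Levene's statistic is the one-way ANOVA F test applied to z.
lev <- anova(lm(z ~ g))
lev
```

A large F value here means the typical size of the deviations differs between groups, which is evidence against equal variances.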