STA304H1S/1003HS Winter 2021: Week 8
We should learn...
What is systematic sampling? Its advantages and disadvantages...
How does systematic sampling compare with SRS?
What is repeated systematic sampling?
Shivon Sue-Chee Systematic Sampling 1
Systematic Sampling (Ch.7)
choose a random starting point, then take every k-th observation
(e.g., every 10th, 15th, or 20th)
Eg, every 10th person leaving a shopping centre
Eg, every 15th item in an ordered list of accounts
Eg, every half hour/fifteen minutes/ ... take an item from an
assembly line for inspection
Eg, our class heights (for n=10)
Class example
example: our heights (for n=10)
> N=length(height); N
[1] 217
> n=10
> #Systematic random sampling
> k=floor(N/n);k
[1] 21
> #random starting point
> set.seed(1234)
> start=sample(1:k,1); start
[1] 19
> sys<-seq(start,N,k);sys
[1] 19 40 61 82 103 124 145 166 187 208
> sample_sys<-height[sys];
> sample_sys
[1] 170.0 160.0 170.5 180.0 181.0 170.0 175.0 175.0 172.0 180.0
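One detail of the transcript is worth checking: N = 217 is not a multiple of n = 10 (217 = 10 × 21 + 7), so with k = 21 any start of 7 or less yields 11 units rather than 10. The sketch below illustrates this; the helper name `sys_sample_index` is hypothetical and is not part of the class code.

```r
# Hypothetical helper (not from the class code): indices of a 1-in-k systematic sample.
sys_sample_index <- function(N, n, start = NULL) {
  k <- floor(N / n)                                    # sampling interval
  if (is.null(start)) start <- sample(seq_len(k), 1)   # random start in 1..k
  seq(from = start, to = N, by = k)                    # every k-th unit until the list ends
}

idx_19 <- sys_sample_index(N = 217, n = 10, start = 19)
idx_3  <- sys_sample_index(N = 217, n = 10, start = 3)
length(idx_19)  # 10 units, as in the class output
length(idx_3)   # 11 units, because 3 + 10 * 21 = 213 <= 217
```

If exactly n units are needed, one option is to truncate the index vector to its first n entries, at the cost of never sampling the tail of the list.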
Advantages and disadvantages of systematic sampling
easier, especially with no sampling frame
easier to organize, especially with untrained interviewers
more precise if the $y$'s in the population are ordered
equivalent to an SRS if the $y$'s in the population are in random order
hard to estimate the variance of $\bar{y}$
so we usually use the SRS variance estimate
very biased if the $y$'s in the population are cyclical, and the sampling
interval coincides with the cycle
see Figures 7.1 – 7.3
Population types
What are Random, Ordered and Periodic populations? Examples...
Random population (ρ ≈ 0)
The elements of the population are in random order.
Ordered population (ρ < 0)
The elements of the population have values that trend upward or
downward when listed.
Periodic population (ρ ≈ 1)
The elements of the population tend to cycle upward and downward in a
regular pattern when listed.
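The three population types can be told apart numerically by computing ρ within the possible systematic samples. The following is a minimal sketch under stated assumptions: N = nk exactly, σ² uses divisor N, ρ is taken as the average cross-product of deviations over all pairs of elements in the same systematic sample divided by σ², and the toy populations and the helper name `intra_rho` are invented for illustration.

```r
# Hypothetical helper: within-sample (intracluster) correlation for a 1-in-k design.
# Assumes length(y) = n * k exactly and sigma^2 defined with divisor N.
intra_rho <- function(y, k) {
  n <- length(y) / k
  d <- matrix(y - mean(y), nrow = k)     # row i = systematic sample with start i
  cross <- rowSums(d)^2 - rowSums(d^2)   # sum over ordered pairs j != j' within each sample
  sum(cross) / (k * n * (n - 1) * mean(d^2))
}

set.seed(304)
ordered  <- 1:120                        # values trend upward
shuffled <- sample(ordered)              # same values in random order
periodic <- rep(1:12, times = 10)        # cycle length equals k = 12

sapply(list(random = shuffled, ordered = ordered, periodic = periodic),
       intra_rho, k = 12)
```

For these toy populations the ordered list gives a negative ρ and the periodic list, whose cycle matches the sampling interval, gives ρ = 1; the shuffled list should land close to 0.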
Inference from systematic samples: §7.3, 7.4
comparable to SRS
When systematic sampling is nearly equivalent to SRS, we can
estimate $V(\bar{y}_{sy})$ by $\hat{V}(\bar{y})$.
see formulas (7.1) and (7.2); (7.5) and (7.6); (7.7) and (7.8)
HW: there’s a (small) mistake in formula (7.3) in the 7th edition:
what is it?
usual technique for estimating sample size n (§7.5)
Otherwise, the SRS variance provides a useful lower or upper bound for $V(\bar{y}_{sy})$.
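For the sample-size step, a common SRS-style calculation for estimating a mean is n = Nσ² / ((N − 1)D + σ²) with D = B²/4, where B is the desired bound on the error of estimation. The sketch below assumes that form; the function name `n_required` and the inputs (σ² = 100, B = 2) are invented for illustration and are not taken from the class data.

```r
# Hypothetical helper: sample size for estimating a mean to within bound B,
# assuming the SRS-style formula n = N * sigma2 / ((N - 1) * D + sigma2), D = B^2 / 4.
n_required <- function(N, sigma2, B) {
  D <- B^2 / 4
  ceiling(N * sigma2 / ((N - 1) * D + sigma2))  # round up to a whole unit
}

# Illustrative inputs only: a class of 217, a guessed variance of 100 cm^2, bound of 2 cm.
n_required(N = 217, sigma2 = 100, B = 2)
```

In practice σ² is unknown and has to be guessed from a pilot sample or an earlier study.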
Inference from systematic samples: §7.3, 7.4
When are the SRS formulas okay?
if the sample is ‘similar’ to a simple random sample
i.e. population is unordered, with respect to the variable of interest
SRS variance will be an over-estimate if the population is ordered
SRS estimate will just be wrong if the population is periodic and the
sampling interval coincides with the cycle
e.g. data on store sales taken every 7th day; data on rainfall taken
every 12 months; ...
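The store-sales case can be made concrete with a toy series. This is a sketch only: the weekday sales figures are invented, and the cycle is assumed to repeat exactly every 7 days.

```r
# Hypothetical daily sales with an exact weekly cycle (values invented for illustration).
weekday_sales <- c(100, 100, 100, 100, 120, 180, 160)
sales <- rep(weekday_sales, times = 52)   # N = 364 days
N <- length(sales)
k <- 7                                    # sampling interval equals the cycle length
n <- N / k

# One row per possible starting day: sample mean and SRS-style variance estimate.
by_start <- t(sapply(seq_len(k), function(start) {
  y <- sales[seq(start, N, by = k)]
  c(start = start, mean = mean(y), v_hat = (1 - n / N) * var(y) / n)
}))
by_start
mean(sales)   # population mean, for comparison
```

Every sample contains a single weekday, so each sample variance is zero and the SRS formula reports no uncertainty, even though the seven possible estimates differ widely from the population mean.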
Why is the SRS variance estimate too big if the population is ordered?
With a systematic sample, we estimate the population mean $\mu$ by
$$\hat{\mu} = \bar{y}_{sy} = \frac{\sum_{i=1}^{n} y_i}{n} \tag{7.1}$$
and an estimate of the variance of $\bar{y}_{sy}$ is
$$\hat{V}(\bar{y}_{sy}) = \left(1 - \frac{n}{N}\right)\frac{s^2}{n} = \hat{V}(\bar{y}),$$
where $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \hat{\mu})^2$.
equivalent to SRS
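Applied to the ten heights drawn in the class example (N = 217, n = 10), the two formulas can be evaluated directly. This is a minimal sketch; the variable names are illustrative, and the two-standard-error bound at the end is an added convention rather than something stated on the slide.

```r
# Heights from the class systematic sample shown earlier (n = 10 out of N = 217).
sample_sys <- c(170.0, 160.0, 170.5, 180.0, 181.0, 170.0, 175.0, 175.0, 172.0, 180.0)
N <- 217
n <- length(sample_sys)

mu_hat <- mean(sample_sys)          # formula (7.1)
s2     <- var(sample_sys)           # sample variance with divisor n - 1
v_hat  <- (1 - n / N) * s2 / n      # SRS-style variance estimate
bound  <- 2 * sqrt(v_hat)           # assumed convention: two standard errors

c(mu_hat = mu_hat, v_hat = v_hat, bound = bound)
```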
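Why is the SRS variance estimate too big if the population is ordered?
BUT, the true variances are different!
$$V(\bar{y}_{sy}) = \cdots = \frac{\sigma^2}{n}\,[1 + (n-1)\rho]\left(1 - \frac{n}{N}\right)\frac{N}{N-1}$$
compared to SRS, where
$$V(\bar{y}) = \frac{\sigma^2}{n}\left(1 - \frac{n}{N}\right)\frac{N}{N-1}$$
if $\rho < 0$, $V(\bar{y}_{sy}) < V(\bar{y})$, so $\hat{V}(\bar{y}_{sy})$ is too big!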
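Taking the two variance expressions on this slide as given, their ratio makes the comparison explicit, because the finite-population factors cancel. The numerical values in the second line ($n = 10$, $\rho = -0.05$) are illustrative only.

```latex
\[
\frac{V(\bar{y}_{sy})}{V(\bar{y})} = 1 + (n-1)\rho
\]
\[
n = 10,\ \rho = -0.05 \quad\Longrightarrow\quad
\frac{V(\bar{y}_{sy})}{V(\bar{y})} = 1 + 9(-0.05) = 0.55
\]
```

In that illustrative case the systematic mean has 55% of the SRS variance, so an SRS-based estimate would overstate the uncertainty.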
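... comparing variances
$$V(\bar{y}_{sy}) = \frac{\sigma^2}{n}\,[1 + (n-1)\rho]\left(1 - \frac{n}{N}\right)\frac{N}{N-1}$$
$$-\frac{1}{n-1} < \rho < 1$$
$\rho$ is the average of the correlation coefficients between all possible
pairs of observations in the systematic sample of size $n$
a. Random, $\rho \approx 0$: $V(\bar{y}_{sy}) \approx V(\bar{y})$, so $\hat{V}(\bar{y})$ is a reasonable estimate
b. Ordered, $\rho < 0$: $V(\bar{y}_{sy}) < V(\bar{y})$, so $\hat{V}(\bar{y})$ is too big
c. Periodic, $\rho \approx 1$: $V(\bar{y}_{sy}) > V(\bar{y})$, so $\hat{V}(\bar{y})$ is too small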
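The ordering of the true variances can be checked by brute force on small toy populations, since a 1-in-k design has only k possible samples. The sketch below assumes N = nk exactly; the populations and the helper names `v_sys` and `v_srs` are invented for illustration, and `v_srs` uses the SRS expression as written on the slide.

```r
# Exact variance of the systematic-sample mean: average over all k equally likely starts.
v_sys <- function(y, k) {
  means <- rowMeans(matrix(y, nrow = k))  # row i = sample with start i (needs N = n * k)
  mean((means - mean(y))^2)
}

# Variance of the SRS mean for the same n, using the slide's SRS expression.
v_srs <- function(y, n) {
  N <- length(y)
  sigma2 <- mean((y - mean(y))^2)         # population variance with divisor N
  sigma2 / n * (1 - n / N) * N / (N - 1)
}

ordered  <- 1:120                          # trending population
periodic <- rep(1:12, times = 10)          # cycle length equals k = 12

rbind(ordered  = c(sys = v_sys(ordered, 12),  srs = v_srs(ordered, 10)),
      periodic = c(sys = v_sys(periodic, 12), srs = v_srs(periodic, 10)))
```

For the ordered list the systematic variance is far smaller than the SRS variance, and for the periodic list it is far larger, which is the pattern the slide describes.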
Comparing systematic samples: random vs ordered population
Estimating average height
Refer to class R codes and output
Type                  Estimate    Variance estimate $\hat{V}(\bar{y}_{sy})$
a. Random, ρ ≈ 0
b. Ordered, ρ < 0
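The class R code is not reproduced here, so the following is only a sketch of how such a comparison might be set up. The heights are simulated (not the class data), and the helper name `sys_estimate`, the seed, and the start of 19 are assumptions chosen for illustration.

```r
set.seed(304)
height_sim <- rnorm(217, mean = 172, sd = 8)   # simulated heights, NOT the class data

# Hypothetical helper: systematic-sample estimate and SRS-style variance estimate.
sys_estimate <- function(y, n, start) {
  N <- length(y)
  k <- floor(N / n)
  s <- y[seq(start, by = k, length.out = n)]   # exactly n units from the given start
  c(estimate = mean(s), var_estimate = (1 - n / N) * var(s) / n)
}

comparison <- rbind(
  random  = sys_estimate(height_sim, n = 10, start = 19),        # list in random order
  ordered = sys_estimate(sort(height_sim), n = 10, start = 19)   # list sorted by height
)
comparison
```

Repeating the ordered case over all possible starts would show estimates that vary much less than the SRS-style variance estimate suggests.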
Example (Lohr, 5.12)
sampling of dumps and landfills to see if toxic waste is leaking from containers
choose a random point in the landfill area; construct a grid containing that point
take soil samples from each grid point
gives good coverage if there is little prior knowledge about where the
toxic materials might be
but could fail if the material is regularly placed
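A two-dimensional systematic sample of this kind might be generated as follows. This is a sketch under invented assumptions: a rectangular site of 100 m by 60 m, a grid spacing of 20 m, and a random offset inside the first grid cell standing in for the random point; the helper name `grid_sample` is hypothetical.

```r
# Hypothetical helper: grid of sampling locations with a random offset.
# Site dimensions and spacing are invented for illustration.
grid_sample <- function(width, height, spacing) {
  x0 <- runif(1, 0, spacing)   # random offset plays the role of the random point
  y0 <- runif(1, 0, spacing)
  expand.grid(x = seq(x0, width, by = spacing),
              y = seq(y0, height, by = spacing))
}

set.seed(1)
pts <- grid_sample(width = 100, height = 60, spacing = 20)
nrow(pts)    # number of soil-sample locations
head(pts)
```

If containers were buried on their own regular 20 m layout, every grid point could fall between them, which is the failure mode noted above.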
What is repeated systematic sampling? §7.6
example (Table 7.2): population of N = 960 elements, numbered
consecutively
choose $n_s = 10$ random starting points, take systematic samples of
size 6
Sample   Random starting   Second element   Third element         Sixth element
number   point             in sample        in sample       ...   in sample
1        6                 166              326             ...   806
2        17                177              337             ...   817
3        21                181              341             ...   821
4        42                202              362             ...   842
5        73                233              393             ...   873
6        81                241              401             ...   881
7        86                246              406             ...   886
8        102               262              422             ...   902
9        112               272              432             ...   912
10       145               305              465             ...   945
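The table can be regenerated from its starting points. In this sketch the interval of 160 is inferred from the table itself (166 − 6 = 160, and 960 / 6 = 160); the variable names are illustrative.

```r
# Starting points as listed in Table 7.2; interval inferred from the table (166 - 6).
starts  <- c(6, 17, 21, 42, 73, 81, 86, 102, 112, 145)
k_prime <- 160

# Row i holds the six element numbers of systematic sample i.
elements <- outer(starts, 0:5, function(s, j) s + j * k_prime)
elements[, c(1, 2, 3, 6)]   # the columns shown in the table
```

For a fresh draw, the starting points could be replaced by something like `sort(sample(1:160, 10))`.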
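... repeated systematic sampling
estimate the population mean by averaging the row averages
$$\hat{\mu} = \sum_{i=1}^{n_s} \frac{\bar{y}_i}{n_s} \tag{7.12}$$
where $n_s$ is the number of systematic samples and $\bar{y}_i$ is the mean of the $i$th systematic sample
estimate the variance of $\hat{\mu}$ from the variance across the rows
$$s_{\bar{y}}^2 = \frac{\sum_{i=1}^{n_s}(\bar{y}_i - \hat{\mu})^2}{n_s - 1}, \qquad \hat{V}(\hat{\mu}) = \left(1 - \frac{n}{N}\right)\frac{s_{\bar{y}}^2}{n_s} \tag{7.13}$$
Why repeated SYS?
No need to make any assumption about the order of the population.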
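Formulas (7.12) and (7.13) can be sketched in a few lines. Assumptions to note: n in (7.13) is taken to be the total number of sampled elements across all systematic samples, the population values are simulated, and the helper name `repeated_sys` is hypothetical.

```r
# Hypothetical helper implementing (7.12) and (7.13).
# Assumes n in (7.13) is the total number of sampled elements.
repeated_sys <- function(y, starts, k_prime) {
  N      <- length(y)
  idx    <- lapply(starts, function(s) seq(s, N, by = k_prime))
  ybar_i <- sapply(idx, function(i) mean(y[i]))   # one mean per systematic sample
  ns     <- length(starts)
  n      <- sum(lengths(idx))
  mu_hat <- mean(ybar_i)                          # (7.12)
  v_hat  <- (1 - n / N) * var(ybar_i) / ns        # (7.13)
  c(mu_hat = mu_hat, v_hat = v_hat)
}

set.seed(7)
y_pop  <- rnorm(960, mean = 50, sd = 10)          # simulated population, for illustration
starts <- c(6, 17, 21, 42, 73, 81, 86, 102, 112, 145)
repeated_sys(y_pop, starts, k_prime = 160)
```

The variance estimate uses only the spread of the ten sample means, which is why it stays valid whether the list is random, ordered, or periodic.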
Homework
Readings: §7.1 – §7.6
Skip §7.7
HW: 7.2, 7.4, 7.5, 7.11, 7.15, 7.19, 7.24