Examiners’ commentaries 2018
ST3134 Advanced statistics: statistical inference
Important note
This commentary reflects the examination and assessment arrangements for this course in the
academic year 2017–18. The format and structure of the examination may change in future years,
and any such changes will be publicised on the virtual learning environment (VLE).
Information about the subject guide and the Essential reading references
Unless otherwise stated, all cross-references will be to the latest version of the subject guide (2011).
You should always attempt to use the most recent edition of any Essential reading textbook, even if
the commentary and/or online reading list and/or subject guide refer to an earlier edition. If
different editions of Essential reading are listed, please check the VLE for reading supplements – if
none are available, please use the contents list and index of the new edition to find the relevant
section.
General remarks
Learning outcomes
At the end of this half course and having completed the essential reading and activities you should
be able to:
• explain the principles of data reduction
• judge the quality of estimators
• choose appropriate methods of inference to tackle real problems.
Comments on overall performance
The final distribution of marks was an improvement on 2017, although there remained a clear split
between candidates achieving a first class mark and those scoring below 40%. Candidates should pay
close attention to the essential topics of sufficient statistics, point estimation, interval estimation and
hypothesis testing.
Being able to state and prove key theorems and lemmas simply requires close study of these in the
subject guide. The point estimation techniques of method of moments estimation and maximum
likelihood estimation are standard (and are even introduced in ST104b Statistics 2), with
questions typically varying in terms of the probability distribution from which the random sample is
drawn (with many common distributions covered in ST104b Statistics 2 and ST3133 Advanced
statistics: distribution theory). Pivotal functions feature heavily in Chapter 4 (Interval
estimation), and likelihood ratio tests similarly feature heavily in Chapter 5 (Hypothesis testing).
Key steps to improvement
• An examination paper attempts to cover as much of the syllabus as possible, but inevitably
some parts are left out. As the included parts will vary from year to year it is not enough to
practise only on the past papers. Candidates should practise on all the examples, Learning
activities and Sample examination questions in the subject guide in order to understand the
relevant theory and adequately prepare for the examination. Afterwards you may try one or
two past papers for further practice.
• The examination paper for this course comprises four questions, each containing several parts which contribute towards the final grade, with the number of marks for each part reflecting its difficulty. The difficulty of each part is usually assessed by taking into account the level of statistical inference thinking and the extent of algebraic computation required. Take 5–10 minutes to read the whole paper and try to gauge the difficulty of each question relative to your own strengths. You will find that some questions are much easier for you than others.
• If you are unable to solve an exercise because of the algebraic computations involved, or through lack of knowledge of the material of ST3133 Advanced statistics: distribution theory, do not abandon it. Write clearly the procedure which should be followed for the relevant statistical inference tasks. Remember that the primary aspect you are being examined on is your ability to understand the statistical methodology.
• A good knowledge of basic calculus, such as the ability to compute integrals, sums and
derivatives, will give you invaluable help for completing the examination paper. Also, the
main concepts of ST3133 Advanced statistics: distribution theory (random variables
and their distribution, expectation, independence, conditional expectation etc.) should be
well-understood before attempting this half course. If you are not comfortable with any of
these, go back and revise. Memorising basic properties such as the moments and probability
mass/density functions of some standard distributions (such as the binomial, Poisson,
uniform, normal and gamma) is also a good idea.
• When you are trying to prove an equality or inequality, be careful about what is on the left-hand and the right-hand side. Justify the steps in your calculations as much as possible. If you cannot get a question right, do not just write in the correct answer without any further explanation. Instead, note on the script what you think went wrong; this will earn you more marks. Also try to avoid writing more than you are asked for. For example, when asked to state a theorem, do not give the proof as well, as this simply wastes time.
• Be clear in your answers. When giving the probability mass/density function of a random
variable you should also state its range. When you are deriving an estimator or a sufficient
statistic also state the parameter to which it corresponds. Sometimes it is a good idea to
underline your final result (if applicable).
• Make sure that you are able to provide the definitions of basic concepts such as
estimators/estimates, confidence intervals and hypothesis tests. The above are statements
about population parameters which are derived using functions of the data (that is,
statistics). In the relevant calculations you should always be able to distinguish a statistic
from a parameter and be clear on what they represent.
• Practise on operations involving products and sums. A substantial number of candidates fail
to write them down correctly. Quite often they omit or write the indices incorrectly which
inevitably leads to errors. A typical example of this is the following:
Assume that a random sample is observed which consists of the random variables $Y_1, Y_2, \ldots, Y_n$, which come from a population with density $f_{Y_i}(y_i; \theta)$, often written for simplicity as $f_Y(y; \theta)$. The fact that this sample is a random sample indicates that we can write the joint density of $Y_1, \ldots, Y_n$ as:
$$f_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n;\theta) = \prod_{i=1}^n f_{Y_i}(y_i;\theta).$$
Make sure that you write the above and not the following:
$$f_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n;\theta) = \prod_{i=1}^n f_Y(y;\theta)$$
which sometimes leads to the incorrect expression:
$$f_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n;\theta) = f_Y(y;\theta)^n.$$
Note that if you were asked to provide the log-likelihood function $l(y_1,\ldots,y_n;\theta)$, the calculations would have led you to sums rather than products (refresh your memory on the properties of log functions). The correct expression would be:
$$l(y_1,\ldots,y_n;\theta) = \log\left(f_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n;\theta)\right) = \log\left(\prod_{i=1}^n f_{Y_i}(y_i;\theta)\right) = \sum_{i=1}^n \log f_{Y_i}(y_i;\theta).$$
Examination revision strategy
Many candidates are disappointed to find that their examination performance is poorer than they
expected. This may be due to a number of reasons, but one particular failing is ‘question
spotting’, that is, confining your examination preparation to a few questions and/or topics which
have come up in past papers for the course. This can have serious consequences.
We recognise that candidates might not cover all topics in the syllabus in the same depth, but you
need to be aware that examiners are free to set questions on any aspect of the syllabus. This
means that you need to study enough of the syllabus to enable you to answer the required number of
examination questions.
The syllabus can be found in the Course information sheet available on the VLE. You should read
the syllabus carefully and ensure that you cover sufficient material in preparation for the
examination. Examiners will vary the topics and questions from year to year and may well set
questions that have not appeared in past papers. Examination papers may legitimately include
questions on any topic in the syllabus. So, although past papers can be helpful during your revision,
you cannot assume that topics or specific questions that have come up in past examinations will
occur again.
If you rely on a question-spotting strategy, it is likely you will find yourself in difficulties
when you sit the examination. We strongly advise you not to adopt this strategy.
Comments on specific questions – Zones A and B
Candidates should answer all FOUR questions: Question 1 of Section A (40 marks) and all
THREE questions from Section B (60 marks in total).
Section A
Answer question 1 from this section.
Question 1
(a) Let X = {X1, X2, . . . , Xn} be a random sample from a smooth probability
density function f(x | θ), where θ is an unknown parameter.
i. Let s(θ |X) denote the score function. Show that E[s(θ |X)] = 0. (You may
assume sufficient regularity conditions to allow differentiation under integral
signs.)
(5 marks)
ii. Let I(θ) denote the Fisher information. Show that:
$$I(\theta) = -E\left[\frac{\partial}{\partial\theta} s(\theta \mid X)\right].$$
(5 marks)
Reading for this question
Section 2.4 of the subject guide.
Approaching the question
i. We have:
$$E[s(\theta \mid X)] = \int_{\mathbb{R}^n} s(\theta \mid x)\, f_X(x \mid \theta)\, dx = \int_{\mathbb{R}^n} \frac{\partial f_X(x \mid \theta)/\partial\theta}{f_X(x \mid \theta)}\, f_X(x \mid \theta)\, dx = \int_{\mathbb{R}^n} \frac{\partial}{\partial\theta} f_X(x \mid \theta)\, dx$$
$$= \frac{\partial}{\partial\theta} \int_{\mathbb{R}^n} f_X(x \mid \theta)\, dx = \frac{\partial}{\partial\theta}\, 1 = 0.$$
The interchange of differentiation and integration in the fourth step is justified by the assumed regularity conditions.
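As an informal sanity check (not part of the examination answer), the identity $E[s(\theta \mid X)] = 0$ can be verified by Monte Carlo simulation. The sketch below assumes an exponential distribution with mean λ as an illustrative model choice; the score is written out by hand from the density.

```python
import random

random.seed(0)

# Illustrative model (an assumption, not from the question): X ~ Exponential
# with mean lam, so log f(x | lam) = -log(lam) - x/lam and the score is
# s(lam | x) = d/dlam log f(x | lam) = -1/lam + x/lam**2.
def score(lam, x):
    return -1.0 / lam + x / lam**2

lam = 2.0
n_sims = 200_000

# Average the score over draws from the model at the true parameter value.
avg_score = sum(score(lam, random.expovariate(1.0 / lam))
                for _ in range(n_sims)) / n_sims
print(avg_score)  # very close to 0
```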
ii. We have:
$$0 = \frac{d}{d\theta} E[s(\theta \mid X)] = \frac{d}{d\theta} \int_{\mathbb{R}^n} s(\theta \mid x)\, f(x \mid \theta)\, dx = \int_{\mathbb{R}^n} \frac{d}{d\theta}\left[s(\theta \mid x)\, f(x \mid \theta)\right] dx$$
$$= \int_{\mathbb{R}^n} \left[\left(\frac{d}{d\theta} s(\theta \mid x)\right) f(x \mid \theta) + s(\theta \mid x)\, \frac{d}{d\theta} f(x \mid \theta)\right] dx$$
$$= \int_{\mathbb{R}^n} \left[\frac{d}{d\theta} s(\theta \mid x) + s(\theta \mid x)^2\right] f(x \mid \theta)\, dx \qquad \left(\text{since } \frac{d}{d\theta} f(x \mid \theta) = s(\theta \mid x)\, f(x \mid \theta)\right)$$
$$= E\left[\frac{d}{d\theta} s(\theta \mid X) + s(\theta \mid X)^2\right] = E\left[\frac{d}{d\theta} s(\theta \mid X)\right] + E\left[s(\theta \mid X)^2\right].$$
Since $I(\theta) = E\left[s(\theta \mid X)^2\right]$, rearranging gives $I(\theta) = -E\left[\frac{d}{d\theta} s(\theta \mid X)\right]$, as required.
(b) Let {X1, . . . , Xn} be a random sample from a Geometric(θ) distribution. Find
Fisher’s information for θ.
Hint: You may use the fact that if X ∼ Geometric(θ), then E(X) = 1/θ.
(10 marks)
Reading for this question
Section 2.4 of the subject guide.
Approaching the question
We have:
$$f(x \mid \theta) = \prod_{i=1}^n (1-\theta)^{x_i - 1}\,\theta = (1-\theta)^{-n + \sum_i x_i}\, \theta^n$$
and:
$$l(x \mid \theta) = \log f(x \mid \theta) = n\log\theta + \left(-n + \sum_i x_i\right)\log(1-\theta).$$
Hence:
$$\frac{\partial}{\partial\theta} l(x \mid \theta) = \frac{n}{\theta} - \frac{-n + \sum_i x_i}{1-\theta}$$
and:
$$\frac{\partial^2}{\partial\theta^2} l(x \mid \theta) = -\frac{n}{\theta^2} - \frac{-n + \sum_i x_i}{(1-\theta)^2}.$$
The Fisher information is:
$$I(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2} l(x \mid \theta)\right] = \frac{n}{\theta^2} + \frac{-n + \sum_i E(X_i)}{(1-\theta)^2} = \frac{n}{\theta^2} + \frac{n(1-\theta)/\theta}{(1-\theta)^2} = \frac{n}{\theta^2(1-\theta)}.$$
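Since the Fisher information also equals the variance of the score, the closed form $n/(\theta^2(1-\theta))$ can be checked by simulation. The sketch below is for a single observation ($n = 1$), with an illustrative value of θ and a hand-rolled inverse-CDF sampler for the geometric distribution on {1, 2, …}:

```python
import math
import random

random.seed(1)

theta = 0.3  # illustrative parameter value, not from the question

# Geometric(theta) on {1, 2, ...} by inversion: X = ceil(log U / log(1 - theta)).
def rgeom(theta):
    return math.ceil(math.log(random.random()) / math.log(1.0 - theta))

# Per-observation score: d/dtheta [log(theta) + (x - 1) log(1 - theta)].
def score(theta, x):
    return 1.0 / theta - (x - 1) / (1.0 - theta)

sims = 200_000
draws = [score(theta, rgeom(theta)) for _ in range(sims)]
m = sum(draws) / sims
v = sum((s - m) ** 2 for s in draws) / sims

print(v)                                 # Monte Carlo variance of the score
print(1.0 / (theta**2 * (1.0 - theta)))  # closed form with n = 1, about 15.87
```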
(c) i. Provide the definition of a sufficient statistic.
(5 marks)
ii. State the factorisation theorem.
(5 marks)
Reading for this question
Section 2.3 of the subject guide.
Approaching the question
i. Suppose that $Y = (Y_1, \ldots, Y_n)^T$ is a sample. A statistic $U = h(Y)$ is a sufficient statistic for a parameter $\theta$ if the conditional distribution of $Y$ given $U$ does not depend on $\theta$.
ii. Let $Y = (Y_1, \ldots, Y_n)^T$ be a sample with joint probability density or mass function $f_Y(y; \theta)$. The statistic $U = h(Y)$ is sufficient for the parameter $\theta$ if and only if we can find functions $b$ and $c$ such that:
$$f_Y(y; \theta) = b(h(y), \theta)\, c(y)$$
for all $y \in \mathbb{R}^n$ and $\theta \in \Theta$.
(d) Let {X1, . . . , Xn} be a random sample from the Pareto distribution with
probability density function:
$$f(x; x_0, \alpha) = \begin{cases} \alpha x_0^{\alpha} x^{-\alpha-1} & \text{for } x \geq x_0 \\ 0 & \text{otherwise.} \end{cases}$$
i. If x0 is known and α > 0 is unknown, find a sufficient statistic for α.
(5 marks)
ii. If α is known and x0 is unknown, find a sufficient statistic for x0.
(5 marks)
Reading for this question
Section 2.3 of the subject guide.
Approaching the question
For all $x_i \geq x_0$, the joint pdf is:
$$\prod_{i=1}^n f(x_i; x_0, \alpha) = \prod_{i=1}^n \alpha x_0^{\alpha} x_i^{-\alpha-1} = \alpha^n x_0^{\alpha n} \left(\prod_{i=1}^n x_i\right)^{-\alpha-1}.$$
i. If $x_0$ is known, $\alpha$ is the parameter and the joint pdf has the form $u(x)\, v(r(x), \alpha)$, with $u(x) = 1$ if all $x_i \geq x_0$ and $0$ otherwise, $r(x) = \prod_{i=1}^n x_i$ and $v(t, \alpha) = \alpha^n x_0^{\alpha n}/t^{\alpha+1}$. So $\prod_{i=1}^n X_i$ is a sufficient statistic for $\alpha$.
ii. If $\alpha$ is known, $x_0$ is the parameter and the joint pdf has the form $u(x)\, v(r(x), x_0)$, with $u(x) = \alpha^n/\left[\prod_{i=1}^n x_i\right]^{\alpha+1}$, $r(x) = \min\{x_1, \ldots, x_n\}$ and $v(t, x_0) = 1$ if $t \geq x_0$ and $0$ otherwise. Hence $\min\{X_1, \ldots, X_n\}$ is a sufficient statistic for $x_0$.
Section B
Answer all three questions from this section.
Question 2
Let {X1, X2, . . . , Xn} be a random sample drawn from a distribution with
probability density function for each Xi, for i = 1, . . . , n, given by:
$$f(x \mid \lambda) = \frac{1}{\lambda}\exp\left(-\frac{x}{\lambda}\right), \quad \text{for } x > 0,\ \lambda > 0$$
and 0 otherwise.
It is known that $E(X_i) = \lambda$, $\mathrm{Var}(X_i) = \lambda^2$ and $F(x \mid \lambda) = 1 - \exp(-x/\lambda)$.
(a) Find the distribution of the random variable T = min(X1, . . . , Xn).
(5 marks)
(b) Use the distribution of T to derive an unbiased estimator of λ. Calculate the
variance of the estimator.
(5 marks)
(c) Find the method of moments estimator of λ and check whether it is an unbiased
estimator of λ.
(5 marks)
(d) Consider the unbiased estimator in (b) and the method of moments estimator in
(c) and compare the variances of the two estimators. Which one is better?
Justify your answer.
(5 marks)
Reading for this question
Section 3.3 of the subject guide.
Approaching the question
(a) The probability density function of the $i$th order statistic, $X_{(i)}$, is given by:
$$f_{X_{(i)}}(x) = \frac{n!}{(i-1)!\,(n-i)!}\, F(x)^{i-1} f(x)\, [1 - F(x)]^{n-i}.$$
We are interested in the sample minimum so $T = X_{(1)}$. Substituting $i = 1$ in the above we get $f_T(t) = n[1 - F(t)]^{n-1} f(t)$. We know (or can calculate) that $F(t) = 1 - \exp(-t/\lambda)$, so for $t > 0$ we have:
$$f_T(t) = n\left[1 - \left(1 - \exp\left(-\frac{t}{\lambda}\right)\right)\right]^{n-1} \frac{1}{\lambda}\exp\left(-\frac{t}{\lambda}\right) = \frac{n}{\lambda}\exp\left(-\frac{nt}{\lambda}\right).$$
The distribution of $T$ is exponential with mean $\lambda/n$.
(b) We know that $E(T) = \lambda/n$. Hence $E(nT) = n\lambda/n = \lambda$, which means that $nT$ is an unbiased estimator of $\lambda$. For the variance of $nT$ we can use the fact that $\mathrm{Var}(T) = \lambda^2/n^2$ to get:
$$\mathrm{Var}(nT) = n^2\,\mathrm{Var}(T) = n^2 \cdot \frac{\lambda^2}{n^2} = \lambda^2.$$
(c) The first population moment is $E(X_i) = \lambda$. The first sample moment is $\bar{X}$. The method of moments estimator is obtained by setting:
$$\hat{\lambda}_{MM} = \bar{X}.$$
We have $E(\bar{X}) = E(X_i) = \lambda$. Hence the method of moments estimator is an unbiased estimator of $\lambda$.
(d) The variance of the estimator $nT$ was found to be $\lambda^2$. The variance of the method of moments estimator is:
$$\mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X_i)}{n} = \frac{\lambda^2}{n}.$$
Since both estimators are unbiased we choose the one with the smaller variance, i.e. the method of moments estimator.
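The comparison in (d) can be illustrated with a short simulation. The values λ = 2 and n = 10 below are illustrative assumptions, not from the question; the simulation estimates the mean and variance of both estimators:

```python
import random

random.seed(2)

lam, n, reps = 2.0, 10, 50_000  # illustrative parameter choices

nT_vals, xbar_vals = [], []
for _ in range(reps):
    xs = [random.expovariate(1.0 / lam) for _ in range(n)]  # mean-lam exponentials
    nT_vals.append(n * min(xs))    # the unbiased estimator from part (b)
    xbar_vals.append(sum(xs) / n)  # the method of moments estimator from part (c)

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

print(mean(nT_vals), var(nT_vals))      # approx lam = 2 and lam^2 = 4
print(mean(xbar_vals), var(xbar_vals))  # approx lam = 2 and lam^2/n = 0.4
```

Both estimators are unbiased, but the variance of $\bar{X}$ shrinks with n while that of nT does not, which is the point of part (d).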
Question 3
Let {X1, . . . , Xn} be a random sample drawn from a distribution with probability
density function:
$$f(x; \theta) = \begin{cases} 2x\theta e^{-\theta x^2} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
where $\theta > 0$ is an unknown parameter.
(a) Find the maximum likelihood estimator of θ. You should check that the second
derivative of the log-likelihood is negative.
(6 marks)
(b) Use derivatives of the log-likelihood function to find the Fisher information I(θ).
(4 marks)
(c) Write down the asymptotic distribution of $\hat{\theta}$ and show that the limiting distribution of:
$$\sqrt{n}\left(\frac{\hat{\theta}}{\theta} - 1\right)$$
is $N(0, 1)$.
(5 marks)
(d) Use the results in (c) to construct an approximate 95% confidence interval for θ.
Hint: It is known that if Z is a standard normal random variable, then
P (Z > 1.96) ≈ 0.025.
(5 marks)
Reading for this question
Section 3.6 of the subject guide.
Approaching the question
(a) The likelihood function is:
$$L(\theta) = \prod_{i=1}^n 2X_i\theta e^{-\theta X_i^2} = 2^n \theta^n e^{-\theta\sum_{i=1}^n X_i^2} \prod_{i=1}^n X_i.$$
The log-likelihood function is:
$$l(\theta) \propto n\log\theta - \theta\sum_{i=1}^n X_i^2.$$
Differentiating and equating to zero, we have:
$$\frac{dl(\theta)}{d\theta} = \frac{n}{\hat{\theta}} - \sum_{i=1}^n X_i^2 = 0 \quad \Rightarrow \quad \hat{\theta} = \frac{n}{\sum_{i=1}^n X_i^2}.$$
Since:
$$\frac{d^2 l(\theta)}{d\theta^2} = -\frac{n}{\theta^2} < 0$$
this is a maximum.
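The MLE can also be checked numerically. Under this density $X^2 \sim \mathrm{Exponential}(\text{rate } \theta)$, which gives an easy sampler; the parameter values in the sketch below are illustrative assumptions:

```python
import math
import random

random.seed(3)

theta, n = 1.5, 5_000  # illustrative true parameter and sample size

# If X has density 2*x*theta*exp(-theta*x^2) for x > 0, then X^2 ~ Exp(rate theta),
# so X can be sampled as the square root of an exponential draw.
xs = [math.sqrt(random.expovariate(theta)) for _ in range(n)]

theta_hat = n / sum(x**2 for x in xs)  # the MLE derived above
print(theta_hat)  # close to 1.5
```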
(b) We have:
$$\frac{d^2 l(\theta)}{d\theta^2} = -\frac{n}{\theta^2}$$
hence:
$$I(\theta) = -E\left(-\frac{n}{\theta^2}\right) = \frac{n}{\theta^2}.$$
(c) As $n \to \infty$, the asymptotic distribution of the maximum likelihood estimator $\hat{\theta}$ is $N(\theta, (I(\theta))^{-1})$. Therefore:
$$\hat{\theta} \sim N\left(\theta, \frac{\theta^2}{n}\right).$$
Transforming:
$$\hat{\theta} - \theta \sim N\left(0, \frac{\theta^2}{n}\right) \quad \Rightarrow \quad \sqrt{n}(\hat{\theta} - \theta) \sim N(0, \theta^2)$$
hence:
$$\frac{\sqrt{n}(\hat{\theta} - \theta)}{\theta} = \sqrt{n}\left(\frac{\hat{\theta}}{\theta} - 1\right) \sim N(0, 1)$$
which does not depend on $\theta$.
(d) We have:
$$P\left(-1.96 \leq \sqrt{n}\left(\frac{\hat{\theta}}{\theta} - 1\right) \leq 1.96\right) \approx 0.95.$$
Rearranging:
$$P\left(\frac{\hat{\theta}}{1 + 1.96/\sqrt{n}} \leq \theta \leq \frac{\hat{\theta}}{1 - 1.96/\sqrt{n}}\right) \approx 0.95$$
hence an approximate 95% confidence interval for $\theta$ is:
$$\left(\frac{\hat{\theta}}{1 + 1.96/\sqrt{n}},\ \frac{\hat{\theta}}{1 - 1.96/\sqrt{n}}\right).$$
For large $n$, $1 - 1.96/\sqrt{n}$ will be positive.
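The coverage of this interval can be checked empirically. The sketch below (with illustrative values θ = 1.5, n = 200, not from the question) samples from the density via the fact that $X^2 \sim \mathrm{Exponential}(\text{rate } \theta)$:

```python
import math
import random

random.seed(4)

theta, n, reps = 1.5, 200, 2_000  # illustrative choices

cover = 0
for _ in range(reps):
    # Sample X with density 2*x*theta*exp(-theta*x^2) via X = sqrt(Exp(theta)).
    xs = [math.sqrt(random.expovariate(theta)) for _ in range(n)]
    theta_hat = n / sum(x * x for x in xs)
    lo = theta_hat / (1 + 1.96 / math.sqrt(n))
    hi = theta_hat / (1 - 1.96 / math.sqrt(n))
    cover += (lo <= theta <= hi)

print(cover / reps)  # empirical coverage, close to 0.95
```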
Question 4
(a) State the Neyman–Pearson lemma.
(7 marks)
Reading for this question
Section 5.2 of the subject guide.
Approaching the question
Theorem (Neyman–Pearson lemma): Consider a test of $H_0: \theta = \theta_0$ vs. $H_1: \theta = \theta_1$, with rejection region $R$ which satisfies:
$$y \in R \quad \text{if} \quad \frac{f(y \mid \theta_1)}{f(y \mid \theta_0)} > k$$
for some $k > 0$, where $\alpha = P_{\theta_0}(Y \in R)$.
A test which satisfies the above is a most powerful test of size $\alpha$.
Note that $f(y \mid \theta_0)$ and $f(y \mid \theta_1)$ denote the pdf (or pmf), i.e. the likelihood, of the sample $Y$. The value $k$ is a constant which depends on $\alpha$.
(b) Let $Y$ be a random variable with a Binomial(12, $\pi$) distribution. In other words:
$$P(Y = y \mid \pi) = \binom{12}{y}\pi^y(1-\pi)^{12-y} \quad \text{for } y = 0, 1, \ldots, 12$$
and 0 otherwise.
i. Consider the test of $H_0: \pi = 0.5$ vs. $H_1: \pi > 0.5$. What values of $Y$ provide evidence against $H_0$? Find the $p$-value if $y = 9$. Would you reject $H_0$ at the 5% significance level?
(5 marks)
ii. Now consider the test of $H_0: \pi = 0.5$ vs. $H_1: \pi \neq 0.5$. Derive the likelihood ratio test statistic for this hypothesis.
(5 marks)
iii. Describe how you would construct an asymptotic test based on the likelihood
ratio test statistic. Comment on the suitability of this test.
(3 marks)
Reading for this question
Section 5.3 of the subject guide.
Approaching the question
i. Large values of $Y$ provide evidence against $H_0$. Hence the $p$-value if $y = 9$ is:
$$P(Y \geq 9 \mid \pi = 0.5) = \sum_{y=9}^{12}\binom{12}{y}(0.5)^y(0.5)^{12-y} = 0.073.$$
Since the $p$-value is larger than the 5% significance level we do not reject $H_0$.
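This exact binomial tail sum is easy to reproduce with a few lines of Python:

```python
from math import comb

# One-sided p-value: P(Y >= 9) under Binomial(12, 0.5).
p_value = sum(comb(12, y) * 0.5**12 for y in range(9, 13))
print(round(p_value, 3))  # 0.073 (exactly 299/4096)
```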
ii. In order to find the LR test statistic we first need to find the MLE of $\pi$. The likelihood is ($c$ is a constant with respect to $\pi$):
$$L(\pi \mid y) = c\,\pi^y(1-\pi)^{12-y}.$$
The log-likelihood is:
$$l(\pi \mid y) = \log L(\pi \mid y) = \log(c) + y\log(\pi) + (12-y)\log(1-\pi).$$
The score function is:
$$s(\pi \mid y) = \frac{\partial l(\pi \mid y)}{\partial\pi} = \frac{y}{\pi} - \frac{12-y}{1-\pi}.$$
Setting $s(\pi \mid y) = 0$ gives $\hat{\pi} = y/12$ as a candidate MLE. Since:
$$\frac{\partial^2 l(\pi \mid y)}{\partial\pi^2} = -\frac{y}{\pi^2} - \frac{12-y}{(1-\pi)^2} < 0$$
we conclude that this is indeed the MLE.
The likelihood ratio statistic $LR$ is the following:
$$LR = \frac{L_Y(\hat{\pi})}{L_Y(0.5)} = \frac{\hat{\pi}^y(1-\hat{\pi})^{12-y}}{(0.5)^{12}}.$$
iii. For large $n$, the distribution of $2\log LR$ is approximately $\chi^2_1$, and since large values of $2\log LR$ provide evidence against $H_0$, a size-$\alpha$ test rejects $H_0$ if:
$$2\log LR > \chi^2_{\alpha,\,1}.$$
Candidates are also expected to comment on the fact that the sample may not be large enough for $\chi^2_1$ to be a good approximation to the distribution of $2\log LR$. This last point is unseen material for candidates.
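For the observed $y = 9$, the statistic $2\log LR$ can be evaluated directly and compared with the 5% critical value of $\chi^2_1$ (approximately 3.841); the conclusion agrees with the exact $p$-value of 0.073 found in part i:

```python
import math

y, n_trials = 9, 12
pi_hat = y / n_trials  # MLE under the alternative: 9/12 = 0.75

# Likelihood ratio statistic from part ii, evaluated at the observed y.
lr = (pi_hat**y * (1 - pi_hat)**(n_trials - y)) / 0.5**n_trials
stat = 2 * math.log(lr)

print(stat)          # about 3.14
print(stat > 3.841)  # False: do not reject H0 at the 5% level
```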