MAST 90082 Practice Questions Solutions
1. Let $X_1, \ldots, X_n$ be a random sample from a population with $E(X_1) = \mu$ and $\mathrm{Var}(X_1) = \sigma^2 < \infty$. Define $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ and $S_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X}_n)^2$.

(a) Show that $S_n^2$ is unbiased for $\sigma^2$, that is, $E(S_n^2) = \sigma^2$.

(b) Is $S_n$ an unbiased estimator of $\sigma$? Justify your answer.

(c) If $X_1, \ldots, X_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2)$, as shown in the lecture notes,
$$E(S_n) = \frac{\sqrt{2}\,\Gamma(n/2)}{\sqrt{n-1}\,\Gamma((n-1)/2)}\,\sigma.$$
Show that $E(S_n) \to \sigma$ as $n \to \infty$. Hint: you may consider using Stirling's formula $\Gamma(x+1) \sim \sqrt{2\pi x}\,(x/e)^x$ as $x \to \infty$, where the symbol $\sim$ means asymptotic equivalence: the ratio of the two sides converges to 1 as $x \to \infty$.
Solution. (a) By simple algebra,
$$\frac{n-1}{n} S_n^2 = \frac{1}{n}\sum_{i=1}^n \{X_i^2 - 2 X_i \bar{X}_n + (\bar{X}_n)^2\}
= \frac{1}{n}\sum_{i=1}^n X_i^2 - 2\bar{X}_n\Big(\frac{1}{n}\sum_{i=1}^n X_i\Big) + (\bar{X}_n)^2
= \frac{1}{n}\sum_{i=1}^n X_i^2 - (\bar{X}_n)^2.$$
Thus, we obtain
$$E(S_n^2) = \frac{n}{n-1}\Big[E\Big(\frac{1}{n}\sum_{i=1}^n X_i^2\Big) - E\{(\bar{X}_n)^2\}\Big]
= \frac{n}{n-1}\Big[E(X_1^2) - \{\mathrm{Var}(\bar{X}_n) + (E\bar{X}_n)^2\}\Big]
= \frac{n}{n-1}\Big\{(\sigma^2 + \mu^2) - \Big(\frac{\sigma^2}{n} + \mu^2\Big)\Big\}
= \frac{n}{n-1}\cdot\frac{n-1}{n}\,\sigma^2 = \sigma^2.$$
(b) By the Cauchy--Schwarz inequality ($|E(XY)|^2 \le E(X^2)E(Y^2)$ with $X = S_n$ and $Y = 1$),
$$\{E(S_n)\}^2 \le E(S_n^2) = \sigma^2,$$
with equality holding only if $P(S_n = c) = 1$ for some real constant $c$. However, when $n > 1$ and the population is not a point mass, $P(S_n = c) = 1$ does not hold. Thus $E(S_n) < \sigma$, which shows that $S_n$ is not an unbiased estimator of $\sigma$.
(c) By Stirling's formula, as $n \to \infty$,
$$\frac{\sqrt{2}\,\Gamma(n/2)}{\sqrt{n-1}\,\Gamma((n-1)/2)}
\sim \frac{\sqrt{2}\,\sqrt{2\pi(n/2-1)}\,(n/2-1)^{n/2-1}e^{-(n/2-1)}}{\sqrt{n-1}\,\sqrt{2\pi(n/2-3/2)}\,(n/2-3/2)^{n/2-3/2}e^{-(n/2-3/2)}}
= \sqrt{\frac{n-2}{n-1}}\; e^{-1/2}\Big(\frac{n-2}{n-3}\Big)^{n/2-1}$$
$$= \sqrt{\frac{n-2}{n-1}}\; e^{-1/2}\Big(1 + \frac{1/2}{n/2-3/2}\Big)^{n/2-3/2}\Big(\frac{n-2}{n-3}\Big)^{1/2}
\sim \sqrt{\frac{n-2}{n-1}}\; e^{-1/2}e^{1/2} \sim 1,$$
where the second last step uses $e^x = \lim_{m\to\infty}(1 + x/m)^m$ for any $x \in \mathbb{R}$. Thus,
$$E(S_n) = \frac{\sqrt{2}\,\Gamma(n/2)}{\sqrt{n-1}\,\Gamma((n-1)/2)}\,\sigma \to \sigma \quad \text{as } n \to \infty,$$
which indicates that $S_n$ is asymptotically unbiased for $\sigma$.
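A quick numerical check (an addition to the solution, using only the Python standard library): evaluate the exact ratio $E(S_n)/\sigma = \sqrt{2}\,\Gamma(n/2)/\{\sqrt{n-1}\,\Gamma((n-1)/2)\}$ on the log scale and confirm that it approaches 1 as $n$ grows.

import math

def ratio(n):
    # log scale avoids overflow of the Gamma function for large n
    log_r = 0.5 * math.log(2) + math.lgamma(n / 2) \
            - 0.5 * math.log(n - 1) - math.lgamma((n - 1) / 2)
    return math.exp(log_r)

for n in (2, 5, 10, 100, 1000, 10000):
    print(n, ratio(n))   # increases towards 1, so E(S_n) -> sigma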
2. Let $x_1, \ldots, x_n$ be an observed sample, where $n$ is the sample size. Find the value of $\theta$ that minimizes $\sum_{i=1}^n |x_i - \theta|$.
Solution. (i) When $n$ is odd, the value of $\theta$ that minimizes $\sum_{i=1}^n |x_i - \theta|$ is the sample median. That is, if $n = 2M+1$ for some positive integer $M$, then $x_{(M+1)}$ is the solution.

(ii) When $n$ is even, any value between the middle two order statistics minimizes $\sum_{i=1}^n |x_i - \theta|$. If $n = 2M$ for some positive integer $M$, then any value in $[x_{(M)}, x_{(M+1)}]$ minimizes $\sum_{i=1}^n |x_i - \theta|$, so the solution is not unique.

I only show the proof for the case when $n$ is odd; the proof is similar when $n$ is even. When $n$ is odd, there exists an integer $M$ such that $n = 2M+1$. Arrange $x_1, \ldots, x_n$ in increasing order, $x_{(1)} \le \cdots \le x_{(M)} \le x_{(M+1)} \le x_{(M+2)} \le \cdots \le x_{(n)}$.

For any $\theta$ and $j = 1, \ldots, M$,
$$|x_{(j)} - \theta| + |x_{(n-j+1)} - \theta| \ge |x_{(j)} - x_{(n-j+1)}|,$$
and the equality holds if and only if $x_{(j)} \le \theta \le x_{(n-j+1)}$. Note that
$$\sum_{i=1}^n |x_i - \theta| = \sum_{i=1}^n |x_{(i)} - \theta|
= \sum_{j=1}^M \big\{|x_{(j)} - \theta| + |x_{(n-j+1)} - \theta|\big\} + |x_{(M+1)} - \theta|
\ge \sum_{j=1}^M |x_{(j)} - x_{(n-j+1)}| + |x_{(M+1)} - \theta|,$$
and the equality holds if $x_{(j)} \le \theta \le x_{(n-j+1)}$ for all $j = 1, \ldots, M$, which is equivalent to $x_{(M)} \le \theta \le x_{(M+2)}$. The last term $|x_{(M+1)} - \theta|$ equals 0 when $\theta = x_{(M+1)}$. As $x_{(M)} \le x_{(M+1)} \le x_{(M+2)}$,
$$\sum_{i=1}^n |x_i - \theta| \ge \sum_{j=1}^M |x_{(j)} - x_{(n-j+1)}|,$$
and the equality holds if and only if $\theta = x_{(M+1)}$. Thus, $x_{(M+1)}$ is the only value of $\theta$ that minimizes $\sum_{i=1}^n |x_i - \theta|$.
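A small numerical illustration (added here, not part of the original solution; the normal sample is an arbitrary choice): for an odd-sized sample, a grid search over $\theta$ recovers the sample median as the minimizer of $\sum_i|x_i - \theta|$.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=11)                                  # n = 2M + 1 = 11 observations
grid = np.linspace(x.min() - 1, x.max() + 1, 20001)
loss = np.abs(x[:, None] - grid[None, :]).sum(axis=0)    # sum_i |x_i - theta| over the grid
print("grid minimiser:", grid[loss.argmin()])
print("sample median :", np.median(x))                   # agrees up to the grid spacing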
3. (Question 7.1 of Casella & Berger) One observation is taken on a discrete random
variable X with pmf f(x|θ), where θ ∈ {1, 2, 3}. Find the MLE of θ.
x    f(x|1)   f(x|2)   f(x|3)
0     1/3      1/4       0
1     1/3      1/4       0
2      0       1/4      1/4
3     1/6      1/4      1/2
4     1/6       0       1/4
Solution. The likelihood function is maximized at $\theta = 1, 1, 2$ (or $3$), $3, 3$ when $x = 0, 1, 2, 3, 4$, respectively. Thus, the MLE of $\theta$ is
$$\hat\theta = \begin{cases} 1, & \text{if } x = 0; \\ 1, & \text{if } x = 1; \\ 2 \text{ (or } 3), & \text{if } x = 2; \\ 3, & \text{if } x = 3; \\ 3, & \text{if } x = 4. \end{cases}$$
4. (Question 7.7 of Casella & Berger) Let $X_1, \ldots, X_n$ be i.i.d. with one of two pdfs. If $\theta = 0$, then
$$f(x|\theta) = \begin{cases} 1 & \text{if } 0 < x < 1, \\ 0 & \text{otherwise}; \end{cases}$$
while if $\theta = 1$, then
$$f(x|\theta) = \begin{cases} 1/(2x^{1/2}) & \text{if } 0 < x < 1, \\ 0 & \text{otherwise}. \end{cases}$$
Find the MLE of $\theta$.
Solution. The likelihood function is
$$L(0 \mid x_1, \ldots, x_n) = \prod_{i=1}^n I(0 < x_i < 1) = I(0 < x_{(1)} \le x_{(n)} < 1) \quad\text{and}$$
$$L(1 \mid x_1, \ldots, x_n) = \prod_{i=1}^n \frac{1}{2 x_i^{1/2}} I(0 < x_i < 1) = 2^{-n}\Big(\prod_{i=1}^n x_i\Big)^{-1/2} I(0 < x_{(1)} \le x_{(n)} < 1).$$
Thus, the MLE of $\theta$ is $\hat\theta = 0$ if $1 \ge 2^{-n}\big(\prod_{i=1}^n x_i\big)^{-1/2}$, and $\hat\theta = 1$ if $1 < 2^{-n}\big(\prod_{i=1}^n x_i\big)^{-1/2}$.
5. (Question 7.12 of Casella & Berger) Let $X_1, \ldots, X_n$ be a random sample from a population with pmf
$$P(X = x \mid \theta) = \theta^x(1-\theta)^{1-x}, \quad x = 0, 1; \quad 0 \le \theta \le 1/2.$$
(a) Find the MME and MLE of $\theta$.
(b) Find the MSEs of each of the estimators.
(c) Which estimator is preferred? Justify your choice.
Solution. (a) First, as $E(X_1) = \theta$, by solving $\mu_1 = \theta = \bar{X}_n = m_1$, we get the MME of $\theta$ as $\tilde\theta = \bar{X}_n$.

Second, the likelihood function is
$$L(\theta) = \prod_{i=1}^n P(X_i = x_i \mid \theta) = \theta^{\sum_{i=1}^n x_i}(1-\theta)^{n - \sum_{i=1}^n x_i}.$$
The log-likelihood function is
$$\log L(\theta) = \Big(\sum_{i=1}^n x_i\Big)\log\theta + \Big(n - \sum_{i=1}^n x_i\Big)\log(1-\theta),$$
and its first derivative is
$$\frac{\partial \log L(\theta)}{\partial \theta} = \frac{\sum_{i=1}^n x_i}{\theta} - \frac{n - \sum_{i=1}^n x_i}{1-\theta},$$
which is non-negative when $\theta \le \bar{x}_n$ and negative when $\theta > \bar{x}_n$. Thus, $\log L(\theta)$ (and equivalently $L(\theta)$) is increasing for $\theta \le \bar{x}_n$ and decreasing for $\theta > \bar{x}_n$.

Incorporating the parameter space $\Theta = [0, 1/2]$: when $\bar{x}_n \le 1/2$, the MLE of $\theta$ is $\hat\theta = \bar{x}_n$, as $\bar{x}_n$ is the overall maximizer of $L(\theta)$. When $\bar{x}_n > 1/2$, $L(\theta)$ is an increasing function of $\theta$ on $\Theta = [0, 1/2]$ and attains its maximum at the upper endpoint of $\Theta$, which is $1/2$, so the MLE is $\hat\theta = 1/2$ in this case. In summary, the MLE of $\theta$ is $\hat\theta = \min\{\bar{X}_n, 1/2\}$.
(b) For the MME $\tilde\theta$, as $E(\tilde\theta) = \theta$ and $\mathrm{Var}(\tilde\theta) = \theta(1-\theta)/n$, we know that $\tilde\theta$ is unbiased and
$$\mathrm{MSE}(\tilde\theta) = \mathrm{Var}(\tilde\theta) = \frac{\theta(1-\theta)}{n}.$$
For the MLE $\hat\theta$, there is no simple formula for $\mathrm{MSE}(\hat\theta)$, but an expression is
$$\mathrm{MSE}(\hat\theta) = E\{(\hat\theta - \theta)^2\} = \sum_{y=0}^n (\hat\theta - \theta)^2\binom{n}{y}\theta^y(1-\theta)^{n-y}
= \sum_{y=0}^{[n/2]} (y/n - \theta)^2\binom{n}{y}\theta^y(1-\theta)^{n-y} + \sum_{y=[n/2]+1}^n (1/2 - \theta)^2\binom{n}{y}\theta^y(1-\theta)^{n-y},$$
where $y = \sum_{i=1}^n x_i$, $[n/2] = n/2$ if $n$ is even, and $[n/2] = (n-1)/2$ if $n$ is odd.
(c) Using the notation in (b),
$$\mathrm{MSE}(\tilde\theta) = E\{(\tilde\theta - \theta)^2\} = \sum_{y=0}^n (y/n - \theta)^2\binom{n}{y}\theta^y(1-\theta)^{n-y},$$
therefore,
$$\mathrm{MSE}(\tilde\theta) - \mathrm{MSE}(\hat\theta)
= \sum_{y=[n/2]+1}^n \big\{(y/n - \theta)^2 - (1/2 - \theta)^2\big\}\binom{n}{y}\theta^y(1-\theta)^{n-y}
= \sum_{y=[n/2]+1}^n (y/n + 1/2 - 2\theta)(y/n - 1/2)\binom{n}{y}\theta^y(1-\theta)^{n-y}.$$
The facts that $y/n > 1/2$ in this sum and $\theta \le 1/2$ imply that every term is non-negative, and strictly positive when $\theta \in (0, 1/2]$. Therefore,
$$\mathrm{MSE}(\hat\theta) < \mathrm{MSE}(\tilde\theta) \quad\text{for every } \theta \in (0, 1/2],$$
so the MLE $\hat\theta$ is preferred. (Note: $\mathrm{MSE}(\hat\theta) = \mathrm{MSE}(\tilde\theta) = 0$ when $\theta = 0$.)
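A Monte Carlo sketch (an added illustration; the sample size and replication count are arbitrary choices): compare the simulated MSEs of the MME $\tilde\theta = \bar{X}_n$ and the MLE $\hat\theta = \min\{\bar{X}_n, 1/2\}$ for a few values of $\theta \in [0, 1/2]$.

import numpy as np

rng = np.random.default_rng(1)
n, reps = 20, 200_000
for theta in (0.1, 0.3, 0.5):
    xbar = rng.binomial(n, theta, size=reps) / n   # sample means of Bernoulli(theta) samples
    mme = xbar
    mle = np.minimum(xbar, 0.5)
    print(theta,
          "MSE(MME) =", round(np.mean((mme - theta) ** 2), 5),
          "MSE(MLE) =", round(np.mean((mle - theta) ** 2), 5))
# MSE(MLE) <= MSE(MME), with the largest gap near theta = 1/2.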
6. (Question 7.19, 7.20 and 7.21 of Casella & Berger) Suppose that the random variables $Y_1, \ldots, Y_n$ satisfy
$$Y_i = \beta x_i + \varepsilon_i, \quad i = 1, \ldots, n,$$
where $x_1, \ldots, x_n$ are fixed constants, and $\varepsilon_1, \ldots, \varepsilon_n$ are i.i.d. $N(0, \sigma^2)$ with $\sigma^2$ unknown.
(a) Find a two-dimensional sufficient statistic for $(\beta, \sigma^2)$.
(b) Find the MLE of $\beta$, and show that it is an unbiased estimator of $\beta$.
(c) Find the distribution of the MLE of $\beta$.
(d) Show that $\sum_{i=1}^n Y_i / \sum_{i=1}^n x_i$ is an unbiased estimator of $\beta$.
(e) Calculate the exact variance of $\sum_{i=1}^n Y_i / \sum_{i=1}^n x_i$ and compare it to the variance of the MLE.
(f) Show that $\{\sum_{i=1}^n (Y_i/x_i)\}/n$ is also an unbiased estimator of $\beta$.
(g) Calculate the exact variance of $\{\sum_{i=1}^n (Y_i/x_i)\}/n$ and compare it to the variances of the previous two estimators.
Solution. (a) The joint pdf of $Y_1, \ldots, Y_n$ is
$$f(y_1, \ldots, y_n \mid \beta, \sigma^2) = \prod_{i=1}^n f(y_i \mid \beta, \sigma^2)
= \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big\{-\frac{(y_i - \beta x_i)^2}{2\sigma^2}\Big\}
= (2\pi)^{-n/2}(\sigma^2)^{-n/2}\exp\Big\{-\frac{1}{2\sigma^2}\sum_{i=1}^n\big(y_i^2 - 2\beta x_i y_i + \beta^2 x_i^2\big)\Big\}$$
$$= (2\pi)^{-n/2}(\sigma^2)^{-n/2}\exp\Big(-\frac{\beta^2\sum_{i=1}^n x_i^2}{2\sigma^2}\Big)\exp\Big\{-\frac{1}{2\sigma^2}\sum_{i=1}^n y_i^2 + \frac{\beta}{\sigma^2}\sum_{i=1}^n x_i y_i\Big\}.$$
By the Factorization Theorem, $\big(\sum_{i=1}^n Y_i^2, \sum_{i=1}^n x_i Y_i\big)$ is a sufficient statistic for $(\beta, \sigma^2)$.
(b) The likelihood function is
$$L(\beta, \sigma^2) = f(y_1, \ldots, y_n \mid \beta, \sigma^2)
= (2\pi)^{-n/2}(\sigma^2)^{-n/2}\exp\Big\{-\frac{1}{2\sigma^2}\sum_{i=1}^n y_i^2 + \frac{\beta}{\sigma^2}\sum_{i=1}^n x_i y_i - \frac{\beta^2}{2\sigma^2}\sum_{i=1}^n x_i^2\Big\}.$$
Then the log-likelihood function is
$$\log L(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n y_i^2 + \frac{\beta}{\sigma^2}\sum_{i=1}^n x_i y_i - \frac{\beta^2}{2\sigma^2}\sum_{i=1}^n x_i^2.$$
For a fixed $\sigma^2$ (we are considering the MLE of $\beta$ here), the likelihood equation is
$$\frac{\partial \log L(\beta, \sigma^2)}{\partial \beta} = \frac{1}{\sigma^2}\sum_{i=1}^n x_i y_i - \frac{\beta}{\sigma^2}\sum_{i=1}^n x_i^2 = 0,$$
and the solution is $\hat\beta = \sum_{i=1}^n x_i y_i \big/ \sum_{i=1}^n x_i^2$. Also,
$$\frac{\partial^2 \log L(\beta, \sigma^2)}{\partial \beta^2} = -\frac{1}{\sigma^2}\sum_{i=1}^n x_i^2 < 0,$$
so the MLE of $\beta$ is
$$\hat\beta = \frac{\sum_{i=1}^n x_i Y_i}{\sum_{i=1}^n x_i^2}.$$
In addition,
$$E(\hat\beta) = \frac{\sum_{i=1}^n x_i E(Y_i)}{\sum_{i=1}^n x_i^2} = \frac{\sum_{i=1}^n x_i(\beta x_i)}{\sum_{i=1}^n x_i^2} = \beta,$$
which indicates that $\hat\beta$ is unbiased for $\beta$.
(c) Since $\hat\beta = \sum_{i=1}^n x_i Y_i \big/ \sum_{i=1}^n x_i^2 = \sum_{i=1}^n\big(x_i\big/\sum_{j=1}^n x_j^2\big)Y_i$, $\hat\beta$ is a weighted sum of the independent normal random variables $Y_1, \ldots, Y_n$. Thus, $\hat\beta$ is normally distributed with mean $\beta$ and variance
$$\mathrm{Var}(\hat\beta) = \sum_{i=1}^n\Big(\frac{x_i}{\sum_{j=1}^n x_j^2}\Big)^2\mathrm{Var}(Y_i) = \frac{\sum_{i=1}^n x_i^2}{\big(\sum_{j=1}^n x_j^2\big)^2}\,\sigma^2 = \frac{\sigma^2}{\sum_{i=1}^n x_i^2}.$$
Thus,
$$\hat\beta \sim N\Big(\beta,\ \frac{\sigma^2}{\sum_{i=1}^n x_i^2}\Big).$$
(d) As
$$E\Big(\frac{\sum_{i=1}^n Y_i}{\sum_{i=1}^n x_i}\Big) = \frac{\sum_{i=1}^n E(Y_i)}{\sum_{i=1}^n x_i} = \frac{\sum_{i=1}^n \beta x_i}{\sum_{i=1}^n x_i} = \beta,$$
it is clear that $\sum_{i=1}^n Y_i \big/ \sum_{i=1}^n x_i$ is an unbiased estimator of $\beta$.
(e) We have
$$\mathrm{Var}\Big(\frac{\sum_{i=1}^n Y_i}{\sum_{i=1}^n x_i}\Big) = \frac{\sum_{i=1}^n \mathrm{Var}(Y_i)}{\big(\sum_{i=1}^n x_i\big)^2} = \frac{n\sigma^2}{\big(\sum_{i=1}^n x_i\big)^2} = \frac{\sigma^2}{n\bar{x}_n^2},$$
where $\bar{x}_n = n^{-1}\sum_{i=1}^n x_i$. As $\sum_{i=1}^n x_i^2 - n\bar{x}_n^2 = \sum_{i=1}^n (x_i - \bar{x}_n)^2 \ge 0$, we have $\sum_{i=1}^n x_i^2 \ge n\bar{x}_n^2$, hence
$$\mathrm{Var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^n x_i^2} \le \frac{\sigma^2}{n\bar{x}_n^2} = \mathrm{Var}\Big(\frac{\sum_{i=1}^n Y_i}{\sum_{i=1}^n x_i}\Big).$$
(f) As
$$E\Big(\frac{\sum_{i=1}^n Y_i/x_i}{n}\Big) = \frac{\sum_{i=1}^n E(Y_i)/x_i}{n} = \frac{\sum_{i=1}^n \beta x_i/x_i}{n} = \beta,$$
it is clear that $\{\sum_{i=1}^n (Y_i/x_i)\}/n$ is an unbiased estimator of $\beta$.
(g) We have
$$\mathrm{Var}\Big(\frac{\sum_{i=1}^n Y_i/x_i}{n}\Big) = \frac{\sum_{i=1}^n \mathrm{Var}(Y_i)/x_i^2}{n^2} = \frac{\sum_{i=1}^n \sigma^2/x_i^2}{n^2} = \frac{\sigma^2}{n^2}\Big(\sum_{i=1}^n \frac{1}{x_i^2}\Big).$$
Using the fact that the arithmetic mean is greater than or equal to the harmonic mean (Example 4.7.8 of Casella & Berger), we have
$$\frac{1}{n}\sum_{i=1}^n \frac{1}{x_i^2} \ge \frac{n}{\sum_{i=1}^n x_i^2},$$
which implies
$$\mathrm{Var}(\hat\beta) = \frac{\sigma^2}{\sum_{i=1}^n x_i^2} \le \frac{\sigma^2}{n^2}\Big(\sum_{i=1}^n \frac{1}{x_i^2}\Big) = \mathrm{Var}\Big(\frac{\sum_{i=1}^n Y_i/x_i}{n}\Big).$$
In addition, as $g(u) = 1/u^2$ is convex, using Jensen's inequality (Theorem 4.7.7 of Casella & Berger),
$$\frac{1}{\bar{x}_n^2} \le \frac{1}{n}\sum_{i=1}^n \frac{1}{x_i^2},$$
which implies
$$\mathrm{Var}\Big(\frac{\sum_{i=1}^n Y_i}{\sum_{i=1}^n x_i}\Big) = \frac{\sigma^2}{n\bar{x}_n^2} \le \frac{\sigma^2}{n^2}\Big(\sum_{i=1}^n \frac{1}{x_i^2}\Big) = \mathrm{Var}\Big(\frac{\sum_{i=1}^n Y_i/x_i}{n}\Big).$$
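A simulation sketch (added; the design points, $\beta$ and $\sigma$ are arbitrary choices): compare the three unbiased estimators of $\beta$ from parts (b), (d) and (f) under $Y_i = \beta x_i + \varepsilon_i$ with $\varepsilon_i \sim N(0, \sigma^2)$.

import numpy as np

rng = np.random.default_rng(2)
x = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 4.0])      # fixed design points
beta, sigma, reps = 2.0, 1.0, 200_000
y = beta * x + sigma * rng.standard_normal((reps, x.size))

b_mle  = y @ x / np.sum(x ** 2)                   # sum x_i Y_i / sum x_i^2
b_sum  = y.sum(axis=1) / x.sum()                  # sum Y_i / sum x_i
b_mean = (y / x).mean(axis=1)                     # (1/n) sum Y_i / x_i

for name, b in [("MLE", b_mle), ("sumY/sumx", b_sum), ("mean(Y/x)", b_mean)]:
    print(name, "mean =", round(b.mean(), 4), "var =", round(b.var(), 5))
# All three means are close to beta = 2; the variances are ordered
# Var(MLE) <= Var(sumY/sumx) <= Var(mean(Y/x)), matching (e) and (g).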
7. Let $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ be independently distributed as $N(\mu, \sigma^2)$ and $N(\mu, \tau^2)$, respectively. Here $\sigma^2 > 0$ and $\tau^2 > 0$ are known. Find the MLE of $\mu$.
Solution. The likelihood function of $\mu$ is the joint density of $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$, that is,
$$L(\mu) = f(x_1, \ldots, x_n; y_1, \ldots, y_m \mid \mu) = f(x_1, \ldots, x_n \mid \mu)\times f(y_1, \ldots, y_m \mid \mu)
= \Big\{\prod_{i=1}^n f(x_i \mid \mu)\Big\}\times\Big\{\prod_{j=1}^m f(y_j \mid \mu)\Big\}$$
$$= \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big\{-\frac{(x_i-\mu)^2}{2\sigma^2}\Big\}\times\prod_{j=1}^m \frac{1}{\sqrt{2\pi\tau^2}}\exp\Big\{-\frac{(y_j-\mu)^2}{2\tau^2}\Big\}
= (2\pi)^{-(n+m)/2}(\sigma^2)^{-n/2}(\tau^2)^{-m/2}\exp\Big\{-\frac{\sum_{i=1}^n(x_i-\mu)^2}{2\sigma^2} - \frac{\sum_{j=1}^m(y_j-\mu)^2}{2\tau^2}\Big\}.$$
The log-likelihood function is
$$\log L(\mu) = -\frac{n+m}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{m}{2}\log(\tau^2) - \frac{\sum_{i=1}^n(x_i-\mu)^2}{2\sigma^2} - \frac{\sum_{j=1}^m(y_j-\mu)^2}{2\tau^2}.$$
Taking the derivative with respect to $\mu$ gives the likelihood equation
$$\frac{\partial \log L(\mu)}{\partial \mu} = \frac{\sum_{i=1}^n(x_i-\mu)}{\sigma^2} + \frac{\sum_{j=1}^m(y_j-\mu)}{\tau^2} = 0,$$
and its solution is
$$\hat\mu = \frac{\tau^2\sum_{i=1}^n x_i + \sigma^2\sum_{j=1}^m y_j}{\tau^2 n + \sigma^2 m}
= \frac{\bar{x}_n/\sigma^2 + (m/n)\bar{y}_m/\tau^2}{1/\sigma^2 + (m/n)(1/\tau^2)}
= \frac{1/\sigma^2}{1/\sigma^2 + (m/n)(1/\tau^2)}\,\bar{x}_n + \frac{1/\tau^2}{(n/m)(1/\sigma^2) + 1/\tau^2}\,\bar{y}_m,$$
which is a weighted average of $\bar{x}_n$ and $\bar{y}_m$, where
$$\bar{x}_n = n^{-1}\sum_{i=1}^n x_i \quad\text{and}\quad \bar{y}_m = m^{-1}\sum_{j=1}^m y_j.$$
In addition, as
$$\frac{\partial^2 \log L(\mu)}{\partial \mu^2} = -\frac{n}{\sigma^2} - \frac{m}{\tau^2} < 0,$$
the MLE of $\mu$ is
$$\hat\mu = \frac{\bar{x}_n/\sigma^2 + (m/n)\bar{y}_m/\tau^2}{1/\sigma^2 + (m/n)(1/\tau^2)}.$$
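A quick check (added; assumes SciPy is available, and the parameter values are arbitrary): the closed-form MLE above agrees with a direct numerical maximisation of the log-likelihood.

import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
mu, sigma2, tau2, n, m = 1.5, 2.0, 0.5, 30, 20
x = rng.normal(mu, np.sqrt(sigma2), n)
y = rng.normal(mu, np.sqrt(tau2), m)

closed = (x.sum() / sigma2 + y.sum() / tau2) / (n / sigma2 + m / tau2)
negloglik = lambda t: ((x - t) ** 2).sum() / (2 * sigma2) + ((y - t) ** 2).sum() / (2 * tau2)
numeric = minimize_scalar(negloglik).x
print(closed, numeric)   # the two values agree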
8. Let X1, . . . , Xn be a random sample from the uniform distribution on the interval
(θ, θ + |θ|). Find the MLE of θ when
(a) θ ∈ Θ = (0,∞);
(b) θ ∈ Θ = (−∞, 0).
Solution. (a) When $\theta > 0$, $\theta + |\theta| = 2\theta$, and the likelihood function is
$$L(\theta) = \prod_{i=1}^n\Big\{\frac{1}{\theta}I(\theta < x_i < 2\theta)\Big\} = \frac{1}{\theta^n}I(\theta < x_{(1)} \le x_{(n)} < 2\theta) = \frac{1}{\theta^n}I(x_{(n)}/2 < \theta < x_{(1)}).$$
Since $1/\theta^n$ is decreasing for $\theta \in (x_{(n)}/2, x_{(1)})$, $L(\theta)$ attains its maximum at $\theta = x_{(n)}/2$. Thus, the MLE of $\theta$ is $\hat\theta = X_{(n)}/2$.
(b) When $\theta < 0$, $\theta + |\theta| = 0$, and the likelihood function is
$$L(\theta) = \prod_{i=1}^n\Big\{-\frac{1}{\theta}I(\theta < x_i < 0)\Big\} = \frac{1}{(-\theta)^n}I(\theta < x_{(1)} \le x_{(n)} < 0) = \frac{1}{(-\theta)^n}I(\theta < x_{(1)}).$$
Since $1/(-\theta)^n$ is increasing for $\theta \in (-\infty, x_{(1)})$, $L(\theta)$ attains its maximum at $\theta = x_{(1)}$. Thus, the MLE of $\theta$ is $\hat\theta = X_{(1)}$.
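A short simulation for part (a) (an added check; $\theta$ is an arbitrary choice): the MLE $X_{(n)}/2$ approaches $\theta$ from below as $n$ grows.

import numpy as np

rng = np.random.default_rng(4)
theta = 3.0
for n in (10, 100, 10_000):
    x = rng.uniform(theta, 2 * theta, n)   # sample from Uniform(theta, 2*theta)
    print(n, "MLE =", x.max() / 2)         # approaches theta = 3 from below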
9. Let $X_1, \ldots, X_n$ be a random sample from Uniform$[0, \theta]$, $\theta > 0$. Let $T_2 = X_{(n)}$ be the maximum order statistic, and define $T_3 = \frac{n+1}{n}T_2$. Find the MSE of $T_3$ and compare it to that of $T_2$.

Solution. From Example 1.23 of the Lecture Notes, $E(T_2) = \frac{n}{n+1}\theta$, $\mathrm{Var}(T_2) = \frac{n}{(n+1)^2(n+2)}\theta^2$, and
$$\mathrm{MSE}(T_2) = \frac{2}{(n+1)(n+2)}\theta^2.$$
Thus, $E(T_3) = \frac{n+1}{n}E(T_2) = \theta$ and $\mathrm{Var}(T_3) = \frac{(n+1)^2}{n^2}\mathrm{Var}(T_2) = \frac{1}{n(n+2)}\theta^2$, from which we get
$$\mathrm{MSE}(T_3) = \frac{1}{n(n+2)}\theta^2.$$
It is clear that $\mathrm{MSE}(T_3) < \mathrm{MSE}(T_2)$ when $n > 1$.
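A Monte Carlo sketch (added; $\theta$, $n$ and the replication count are arbitrary): the simulated MSEs of $T_2$ and $T_3$ match the exact formulas above.

import numpy as np

rng = np.random.default_rng(5)
theta, n, reps = 2.0, 10, 500_000
t2 = rng.uniform(0, theta, (reps, n)).max(axis=1)
t3 = (n + 1) / n * t2
print("MSE(T2):", np.mean((t2 - theta) ** 2),
      "exact:", 2 * theta**2 / ((n + 1) * (n + 2)))
print("MSE(T3):", np.mean((t3 - theta) ** 2),
      "exact:", theta**2 / (n * (n + 2)))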
10. Check that
$$\frac{1}{n(n-1)}\sum_{1\le i\ne j\le n} X_i X_j = (\bar{X}_n)^2 - \frac{S_n^2}{n},$$
where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ and $S_n^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \bar{X}_n)^2$.
Solution. According to the calculation in the solution of Question 1(a), $\frac{n-1}{n}S_n^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 - (\bar{X}_n)^2$, so
$$(\bar{X}_n)^2 - \frac{S_n^2}{n}
= (\bar{X}_n)^2 - \frac{1}{n}\cdot\frac{n}{n-1}\Big\{\frac{1}{n}\sum_{i=1}^n X_i^2 - (\bar{X}_n)^2\Big\}
= (\bar{X}_n)^2 - \frac{1}{n(n-1)}\sum_{i=1}^n X_i^2 + \frac{1}{n-1}(\bar{X}_n)^2$$
$$= \frac{n}{n-1}(\bar{X}_n)^2 - \frac{1}{n(n-1)}\sum_{i=1}^n X_i^2
= \frac{n}{n-1}\Big(\frac{1}{n}\sum_{i=1}^n X_i\Big)^2 - \frac{1}{n(n-1)}\sum_{i=1}^n X_i^2
= \frac{1}{n(n-1)}\sum_{i=1}^n\sum_{j=1}^n X_i X_j - \frac{1}{n(n-1)}\sum_{i=1}^n X_i^2$$
$$= \Big\{\frac{1}{n(n-1)}\sum_{1\le i\ne j\le n} X_i X_j + \frac{1}{n(n-1)}\sum_{i=1}^n X_i^2\Big\} - \frac{1}{n(n-1)}\sum_{i=1}^n X_i^2
= \frac{1}{n(n-1)}\sum_{1\le i\ne j\le n} X_i X_j.$$
11. Let $X_1, \ldots, X_n$ be i.i.d. from $N(\mu, \sigma^2)$, where $\mu \in \mathbb{R}$ and $\sigma^2 > 0$ are both unknown. Find the CRLB for estimating $E(X_1^2) = \mu^2 + \sigma^2$.
Solution. The likelihood function is
$$L(\mu, \sigma^2) = (2\pi)^{-n/2}(\sigma^2)^{-n/2}\exp\Big\{-\frac{\sum_{i=1}^n(x_i-\mu)^2}{2\sigma^2}\Big\}.$$
Then the log-likelihood function is
$$\log L(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{\sum_{i=1}^n(x_i-\mu)^2}{2\sigma^2}.$$
The first-order partial derivatives are
$$\frac{\partial \log L(\mu, \sigma^2)}{\partial \mu} = \frac{\sum_{i=1}^n(x_i-\mu)}{\sigma^2}
\quad\text{and}\quad
\frac{\partial \log L(\mu, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum_{i=1}^n(x_i-\mu)^2}{2(\sigma^2)^2}.$$
The second-order partial derivatives are
$$\frac{\partial^2 \log L(\mu, \sigma^2)}{\partial \mu^2} = -\frac{n}{\sigma^2}, \qquad
\frac{\partial^2 \log L(\mu, \sigma^2)}{\partial (\sigma^2)^2} = \frac{n}{2(\sigma^2)^2} - \frac{\sum_{i=1}^n(x_i-\mu)^2}{(\sigma^2)^3},$$
$$\frac{\partial^2 \log L(\mu, \sigma^2)}{\partial \mu\,\partial \sigma^2} = \frac{\partial^2 \log L(\mu, \sigma^2)}{\partial \sigma^2\,\partial \mu} = -\frac{\sum_{i=1}^n(x_i-\mu)}{(\sigma^2)^2}.$$
As $E\big\{-\frac{\sum_{i=1}^n(X_i-\mu)}{(\sigma^2)^2}\big\} = 0$ and $E\big\{\frac{n}{2(\sigma^2)^2} - \frac{\sum_{i=1}^n(X_i-\mu)^2}{(\sigma^2)^3}\big\} = -\frac{n}{2\sigma^4}$, we obtain the Fisher information matrix
$$I_n(\mu, \sigma^2) = \begin{pmatrix} \frac{n}{\sigma^2} & 0 \\ 0 & \frac{n}{2\sigma^4} \end{pmatrix}.$$
Finally, let $\gamma(\mu, \sigma^2) = \mu^2 + \sigma^2$. By the multivariate CRLB and the fact that $\frac{\partial \gamma(\mu,\sigma^2)}{\partial \mu} = 2\mu$ and $\frac{\partial \gamma(\mu,\sigma^2)}{\partial \sigma^2} = 1$, the CRLB for estimating $E(X_1^2) = \mu^2 + \sigma^2$ is
$$(2\mu,\ 1)\begin{pmatrix} \frac{n}{\sigma^2} & 0 \\ 0 & \frac{n}{2\sigma^4} \end{pmatrix}^{-1}\begin{pmatrix} 2\mu \\ 1 \end{pmatrix}
= \frac{4\mu^2\sigma^2}{n} + \frac{2\sigma^4}{n}.$$
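A numerical illustration (added; the estimator used for the check is an assumption, not part of the question): the unbiased estimator $n^{-1}\sum_{i=1}^n X_i^2$ of $E(X_1^2) = \mu^2 + \sigma^2$ has variance $(4\mu^2\sigma^2 + 2\sigma^4)/n$ under normality, so it attains the CRLB derived above.

import numpy as np

rng = np.random.default_rng(6)
mu, sigma2, n, reps = 1.0, 2.0, 25, 200_000
x = rng.normal(mu, np.sqrt(sigma2), (reps, n))
est = (x ** 2).mean(axis=1)                        # (1/n) sum X_i^2 for each replication
crlb = (4 * mu**2 * sigma2 + 2 * sigma2**2) / n
print("simulated variance:", est.var(), "  CRLB:", crlb)   # the two agree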
12. Let $X_1, \ldots, X_n$ be an i.i.d. sample from a population with pdf (or pmf) $f(x|\theta)$. Suppose that the regularity conditions are satisfied. Denote $I_1(\theta) = E_\theta\big[\big\{\frac{\partial}{\partial\theta}\log f(X_1|\theta)\big\}^2\big]$ and $I_n(\theta) = E_\theta\big[\big\{\frac{\partial}{\partial\theta}\log f(X_1, \ldots, X_n|\theta)\big\}^2\big]$, where $f(x_1, \ldots, x_n|\theta)$ is the joint pdf (or pmf) of $X_1, \ldots, X_n$. Show that $I_n(\theta) = nI_1(\theta)$.
Solution. As $X_1, \ldots, X_n$ are i.i.d., $f(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^n f(x_i \mid \theta)$. It follows that
$$I_n(\theta) = E_\theta\Big[\Big\{\frac{\partial}{\partial\theta}\log f(X_1, \ldots, X_n|\theta)\Big\}^2\Big]
= E_\theta\Big[\frac{\partial}{\partial\theta}\log\Big\{\prod_{i=1}^n f(X_i \mid \theta)\Big\}\Big]^2
= E_\theta\Big\{\sum_{i=1}^n \frac{\partial}{\partial\theta}\log f(X_i \mid \theta)\Big\}^2$$
$$= E_\theta\Big[\sum_{i=1}^n\Big\{\frac{\partial}{\partial\theta}\log f(X_i|\theta)\Big\}^2\Big]
+ E_\theta\Big[\sum_{1\le i\ne j\le n}\Big\{\frac{\partial}{\partial\theta}\log f(X_i \mid \theta)\Big\}\Big\{\frac{\partial}{\partial\theta}\log f(X_j \mid \theta)\Big\}\Big]$$
$$= \sum_{i=1}^n E_\theta\Big[\Big\{\frac{\partial}{\partial\theta}\log f(X_i|\theta)\Big\}^2\Big]
+ \sum_{1\le i\ne j\le n} E_\theta\Big[\Big\{\frac{\partial}{\partial\theta}\log f(X_i \mid \theta)\Big\}\Big\{\frac{\partial}{\partial\theta}\log f(X_j \mid \theta)\Big\}\Big],$$
where the first term in the last expression equals $nE_\theta\big[\big\{\frac{\partial}{\partial\theta}\log f(X_1|\theta)\big\}^2\big] = nI_1(\theta)$ as $X_1, \ldots, X_n$ are i.i.d. In addition, for any $(i, j)$ with $1 \le i \ne j \le n$, $X_i$ and $X_j$ are independent, so
$$E_\theta\Big[\Big\{\frac{\partial}{\partial\theta}\log f(X_i \mid \theta)\Big\}\Big\{\frac{\partial}{\partial\theta}\log f(X_j \mid \theta)\Big\}\Big]
= E_\theta\Big\{\frac{\partial}{\partial\theta}\log f(X_i \mid \theta)\Big\}\,E_\theta\Big\{\frac{\partial}{\partial\theta}\log f(X_j \mid \theta)\Big\} = 0,$$
since the expected score is zero under the regularity conditions. Finally, $I_n(\theta) = nI_1(\theta)$.
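A simulation sketch (added; the Bernoulli model is an assumed example): for $X_i \sim \text{Bernoulli}(\theta)$, $I_1(\theta) = 1/\{\theta(1-\theta)\}$, and the second moment of the full-sample score is approximately $nI_1(\theta)$.

import numpy as np

rng = np.random.default_rng(7)
theta, n, reps = 0.3, 8, 300_000
x = rng.binomial(1, theta, (reps, n))
score = (x / theta - (1 - x) / (1 - theta)).sum(axis=1)   # d/dtheta of the log-likelihood
print("E(score^2):", np.mean(score ** 2), "  n * I_1(theta):", n / (theta * (1 - theta)))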
13. Let $X_1, \ldots, X_n$ be an i.i.d. sample from a population with pdf (or pmf) $f(x|\theta)$. Suppose that the regularity conditions are satisfied. Show that
$$E_\theta\Big[\Big\{\frac{\partial}{\partial\theta}\log f(X_1|\theta)\Big\}^2\Big] = -E_\theta\Big\{\frac{\partial^2}{\partial\theta^2}\log f(X_1|\theta)\Big\}.$$
Solution. First, by simple algebra,
$$E_\theta\Big\{\frac{\partial^2}{\partial\theta^2}\log f(X_1|\theta)\Big\}
= E_\theta\Big[\frac{\partial}{\partial\theta}\Big\{\frac{\partial}{\partial\theta}\log f(X_1|\theta)\Big\}\Big]
= E_\theta\Big[\frac{\partial}{\partial\theta}\Big\{\frac{\frac{\partial}{\partial\theta} f(X_1|\theta)}{f(X_1|\theta)}\Big\}\Big]$$
$$= E_\theta\Big[\frac{1}{f(X_1|\theta)}\Big\{\frac{\partial^2}{\partial\theta^2} f(X_1|\theta)\Big\}
- \Big\{\frac{\partial}{\partial\theta} f(X_1|\theta)\Big\}\frac{\frac{\partial}{\partial\theta} f(X_1|\theta)}{\{f(X_1|\theta)\}^2}\Big]
= E_\theta\Big[\frac{1}{f(X_1|\theta)}\Big\{\frac{\partial^2}{\partial\theta^2} f(X_1|\theta)\Big\}\Big]
- E_\theta\Big[\Big\{\frac{\partial}{\partial\theta}\log f(X_1|\theta)\Big\}^2\Big].$$
Now consider the first term:
$$E_\theta\Big[\frac{1}{f(X_1|\theta)}\Big\{\frac{\partial^2}{\partial\theta^2} f(X_1|\theta)\Big\}\Big]
= \int \frac{1}{f(x_1|\theta)}\Big\{\frac{\partial^2}{\partial\theta^2} f(x_1|\theta)\Big\}f(x_1|\theta)\,dx_1
= \int \frac{\partial^2}{\partial\theta^2} f(x_1|\theta)\,dx_1
= \frac{\partial}{\partial\theta}\Big\{\int \frac{\partial}{\partial\theta} f(x_1|\theta)\,dx_1\Big\}
= \frac{\partial}{\partial\theta}\Big[\frac{\partial}{\partial\theta}\Big\{\int f(x_1|\theta)\,dx_1\Big\}\Big] = 0,$$
where the interchange of differentiation and integration is justified by the regularity conditions, and the identity is proved.
14. Complete the proof of Theorem 1.7 on Page 81 of the lecture notes.

Solution. Suppose $T'$ is another UMVUE for $\gamma(\theta)$; then $\mathrm{Var}(T') = \mathrm{Var}(T)$. Consider the estimator $T^* = \frac{T + T'}{2}$. It is clear that $E(T^*) = \gamma(\theta)$, so $T^*$ is an unbiased estimator of $\gamma(\theta)$. In addition,
$$\mathrm{Var}(T^*) = \frac{1}{4}\mathrm{Var}(T) + \frac{1}{4}\mathrm{Var}(T') + \frac{1}{2}\mathrm{Cov}(T, T')
\le \frac{1}{4}\mathrm{Var}(T) + \frac{1}{4}\mathrm{Var}(T') + \frac{1}{2}\{\mathrm{Var}(T)\mathrm{Var}(T')\}^{1/2} = \mathrm{Var}(T) \qquad (1)$$
for all $\theta \in \Theta$. As $T$ is the UMVUE, we must have equality for all $\theta \in \Theta$. Since the above inequality is an application of the Cauchy--Schwarz inequality, equality holds only if $T' = a(\theta)T + b(\theta)$. Now, using the properties of covariance,
$$\mathrm{Cov}(T, T') = \mathrm{Cov}(T, a(\theta)T + b(\theta)) = a(\theta)\mathrm{Var}(T),$$
but $\mathrm{Cov}(T, T') = \mathrm{Var}(T)$ since we had equality in (1). Hence $a(\theta) = 1$ and, since $E(T') = \gamma(\theta)$, we must have $b(\theta) = 0$ and $T = T'$, showing that $T$ is unique.
15. If $X$ is a random variable with pdf (or pmf) of the form
$$f(x|\eta) = c^*(\eta)h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\},$$
show that
$$\mathrm{Var}\{t_j(X)\} = -\frac{\partial^2}{\partial\eta_j^2}\log c^*(\eta), \qquad
\mathrm{Cov}\{t_i(X), t_j(X)\} = -\frac{\partial^2}{\partial\eta_i\,\partial\eta_j}\log c^*(\eta).$$
Solution. From the proof of Theorem 1.6 of the lecture notes, we know that
$$c^*(\eta) = \Big[\int_{\mathcal{S}} h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx\Big]^{-1}, \qquad
c^*(\eta)\,\frac{\partial}{\partial\eta_j}\int_{\mathcal{S}} h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx = E\{t_j(X)\},$$
and $E\{t_j(X)\} = -\frac{\partial}{\partial\eta_j}\log c^*(\eta) = -\frac{1}{c^*(\eta)}\frac{\partial}{\partial\eta_j}c^*(\eta)$. By some algebra,
$$-\frac{\partial^2}{\partial\eta_j^2}\log c^*(\eta) = \frac{\partial^2}{\partial\eta_j^2}\log\frac{1}{c^*(\eta)}
= \frac{\partial}{\partial\eta_j}\Big\{\frac{\partial}{\partial\eta_j}\log\frac{1}{c^*(\eta)}\Big\}
= \frac{\partial}{\partial\eta_j}\Big[\frac{\partial}{\partial\eta_j}\log\int_{\mathcal{S}} h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx\Big]
= \frac{\partial}{\partial\eta_j}\Big[c^*(\eta)\frac{\partial}{\partial\eta_j}\int_{\mathcal{S}} h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx\Big]$$
$$= \Big\{\frac{\partial}{\partial\eta_j}c^*(\eta)\Big\}\Big[\frac{\partial}{\partial\eta_j}\int_{\mathcal{S}} h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx\Big]
+ c^*(\eta)\frac{\partial^2}{\partial\eta_j^2}\int_{\mathcal{S}} h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx$$
$$= \frac{1}{c^*(\eta)}\Big\{\frac{\partial}{\partial\eta_j}c^*(\eta)\Big\}\Big[c^*(\eta)\frac{\partial}{\partial\eta_j}\int_{\mathcal{S}} h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx\Big]
+ c^*(\eta)\int_{\mathcal{S}} h(x)\frac{\partial^2}{\partial\eta_j^2}\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}dx$$
$$= -E\{t_j(X)\}\,E\{t_j(X)\} + \int_{\mathcal{S}} c^*(\eta)h(x)\exp\Big\{\sum_{i=1}^k \eta_i t_i(x)\Big\}\{t_j(x)\}^2 dx
= -[E\{t_j(X)\}]^2 + E[\{t_j(X)\}^2] = \mathrm{Var}\{t_j(X)\},$$
which shows the first identity. The second identity can be proved similarly and is thus omitted here.
16. Let X be a random variable having the Gamma distribution with shape parameter α
and scale parameter γ, where α is known and γ is unknown. Let Y = σ logX. Show
that
(a) If σ > 0 is unknown, then the distribution of Y is in a location-scale family.
(b) If σ > 0 is known, then the distribution of Y is in an exponential family.
Solution. The pdf of $Y$ is given by
$$f_Y(y) = \frac{1}{\sigma\Gamma(\alpha)\gamma^\alpha}\,e^{\alpha y/\sigma}\,e^{-\frac{1}{\gamma}e^{y/\sigma}}
= \frac{1}{\sigma\Gamma(\alpha)}\,e^{\alpha(y - \sigma\log\gamma)/\sigma}\,e^{-e^{(y - \sigma\log\gamma)/\sigma}}.$$
(a) When $\sigma > 0$ is unknown, it is clear that the distribution of $Y$ is in a location-scale family with location parameter $\sigma\log\gamma$ and scale parameter $\sigma$.
(b) When $\sigma > 0$ is known, we can rewrite the pdf of $Y$ as $\frac{1}{\sigma\Gamma(\alpha)\gamma^\alpha}\,e^{\alpha y/\sigma}\exp\big\{-e^{y/\sigma}\times\frac{1}{\gamma}\big\}$. Therefore, the distribution of $Y$ belongs to an exponential family with $t(y) = -e^{y/\sigma}$ and $\omega(\gamma) = 1/\gamma$.
17. Does $\{\mathrm{Uniform}[a, b] : a \in \mathbb{R}, b \in \mathbb{R}, a \le b\}$ belong to a location family, a scale family, or a location-scale family? Justify your answer.

Solution. Assume $a < b$; otherwise the distribution degenerates to a point mass. The pdf of Uniform$[a, b]$ is given by
$$f(x \mid a, b) = \frac{1}{b-a}I(a \le x \le b) = \frac{1}{b-a}I\Big(0 \le \frac{x-a}{b-a} \le 1\Big).$$
Thus, $\{\mathrm{Uniform}[a, b] : a \in \mathbb{R}, b \in \mathbb{R}, a < b\}$ belongs to a location-scale family with location parameter $a$ and scale parameter $b - a$.
18. Prove the following statement: If $T$ is sufficient for $\theta$ and $\gamma$ is a one-to-one function, then $\gamma(T)$ is also sufficient for $\theta$.

Solution. By the Factorization Theorem, there exist functions $g$ and $h$ such that $f(\vec{x}_n \mid \theta) = g(T(\vec{x}_n) \mid \theta)h(\vec{x}_n)$. As $\gamma$ is a one-to-one function, its inverse $\gamma^{-1}(\cdot)$ exists. Then we have
$$f(\vec{x}_n \mid \theta) = g(T(\vec{x}_n) \mid \theta)h(\vec{x}_n) = g(\gamma^{-1}(\gamma(T(\vec{x}_n))) \mid \theta)h(\vec{x}_n) = g_1(\gamma(T(\vec{x}_n)) \mid \theta)h(\vec{x}_n),$$
where $g_1(t' \mid \theta) = g(\gamma^{-1}(t') \mid \theta)$. Thus, $\gamma(T)$ is also sufficient for $\theta$.
19. Prove the following statement: Suppose $T$ is a sufficient statistic for $\theta$ and $T_0$ is another arbitrary statistic; then $T' = (T_0, T)$ is also sufficient for $\theta$.

Solution. By the Factorization Theorem, there exist functions $g$ and $h$ such that $f(\vec{x}_n \mid \theta) = g(T(\vec{x}_n) \mid \theta)h(\vec{x}_n)$. As $T' = (T_0, T)$, $T$ is a function of $T'$, say $T = \alpha(T')$. Then we have
$$f(\vec{x}_n \mid \theta) = g(T(\vec{x}_n) \mid \theta)h(\vec{x}_n) = g(\alpha(T'(\vec{x}_n)) \mid \theta)h(\vec{x}_n) = g_1(T'(\vec{x}_n) \mid \theta)h(\vec{x}_n),$$
where $g_1(t' \mid \theta) = g(\alpha(t') \mid \theta)$. Thus, $T'$ is also sufficient for $\theta$.
20. (Question 6.2 of Casella & Berger) Let $X_1, \ldots, X_n$ be independent random variables with densities
$$f_{X_i}(x|\theta) = \begin{cases} e^{i\theta - x} & \text{if } x \ge i\theta, \\ 0 & \text{otherwise}. \end{cases}$$
Prove that $T = \min_i(X_i/i)$ is a sufficient statistic for $\theta$.
Solution. The joint pdf of $X_1, \ldots, X_n$ is
$$f(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^n e^{i\theta - x_i}I(x_i \ge i\theta)
= \prod_{i=1}^n e^{i\theta - x_i}I(x_i/i \ge \theta)
= e^{\sum_{i=1}^n i\theta - \sum_{i=1}^n x_i}I(x_1 \ge \theta, x_2/2 \ge \theta, \ldots, x_n/n \ge \theta)
= e^{n(n+1)\theta/2}\,I\big(\min_i(x_i/i) \ge \theta\big)\,e^{-\sum_{i=1}^n x_i}.$$
By the Factorization Theorem, $T = \min_i(X_i/i)$ is a sufficient statistic for $\theta$.
21. (Question 6.3 of Casella & Berger) Let $X_1, \ldots, X_n$ be a random sample from the pdf
$$f(x|\mu, \sigma) = \sigma^{-1}e^{-(x-\mu)/\sigma}, \quad x > \mu,\ \sigma > 0.$$
Find a two-dimensional sufficient statistic for $(\mu, \sigma)$.
Solution. The joint pdf of $X_1, \ldots, X_n$ is
$$f(x_1, \ldots, x_n \mid \mu, \sigma) = \prod_{i=1}^n f(x_i \mid \mu, \sigma)
= \prod_{i=1}^n \sigma^{-1}e^{-(x_i-\mu)/\sigma}I(x_i > \mu)
= \sigma^{-n}e^{n\mu/\sigma}e^{-\sum_{i=1}^n x_i/\sigma}I(x_{(1)} > \mu),$$
where $x_{(1)} = \min\{x_1, \ldots, x_n\}$. By the Factorization Theorem, $(X_{(1)}, \sum_{i=1}^n X_i)$ is a sufficient statistic for $(\mu, \sigma)$.
22. (Question 6.13 of Casella & Berger) Suppose $X_1$ and $X_2$ are i.i.d. observations from the pdf $f(x|\alpha) = \alpha x^{\alpha-1}e^{-x^\alpha}$, $x > 0$, $\alpha > 0$. Show that $(\log X_1)/(\log X_2)$ is an ancillary statistic.
Solution. Let $Y_1 = \log X_1$ and $Y_2 = \log X_2$. Then $Y_1$ and $Y_2$ are i.i.d., and the pdf of each is
$$f(y \mid \alpha) = \alpha\exp\{\alpha y - e^{\alpha y}\} = \frac{1}{1/\alpha}\exp\Big\{\frac{y}{1/\alpha} - e^{y/(1/\alpha)}\Big\}, \quad -\infty < y < \infty.$$
We see that the family of distributions of $Y_i$ is a scale family with scale parameter $1/\alpha$. Thus, we can write $Y_i = \frac{1}{\alpha}Z_i$, where $Z_1$ and $Z_2$ are a random sample from $f(z \mid 1)$. Then
$$\frac{\log X_1}{\log X_2} = \frac{Y_1}{Y_2} = \frac{(1/\alpha)Z_1}{(1/\alpha)Z_2} = \frac{Z_1}{Z_2}.$$
Because the distribution of $Z_1/Z_2$ does not depend on $\alpha$, $(\log X_1)/(\log X_2)$ is an ancillary statistic.
23. (Question 6.17 of Casella & Berger) Let $X_1, \ldots, X_n$ be i.i.d. with geometric distribution
$$P_\theta(X = x) = \theta(1-\theta)^{x-1}, \quad x = 1, 2, \ldots, \quad 0 < \theta < 1.$$
Show that $T = \sum_{i=1}^n X_i$ is sufficient for $\theta$, and find the family of distributions of $T$. Is the family complete?

Solution. The population pmf can be rewritten as
$$f(x \mid \theta) = \theta(1-\theta)^{x-1} = \frac{\theta}{1-\theta}\,e^{\{\log(1-\theta)\}\times x},$$
so it belongs to an exponential family with $t(x) = x$ and $\omega(\theta) = \log(1-\theta)$. As $\{\omega(\theta) : \theta \in (0, 1)\} = (-\infty, 0)$ is an open set in $\mathbb{R}$, $T = \sum_{i=1}^n X_i$ is complete and sufficient for $\theta$. The distribution of $T$ is negative binomial (the number of Bernoulli trials needed to obtain $n$ successes): $P_\theta(T = t) = \binom{t-1}{n-1}\theta^n(1-\theta)^{t-n}$ for $t = n, n+1, \ldots$, and by the exponential-family argument above this family is complete.
24. (Question 6.20 of Casella & Berger) For each of the following pdfs, let $X_1, \ldots, X_n$ be i.i.d. observations. Find a complete and sufficient statistic, or show that one does not exist.
(a) $f(x|\theta) = 2x/\theta^2$, $0 < x < \theta$, $\theta > 0$;
(b) $f(x|\theta) = \theta/(1+x)^{1+\theta}$, $0 < x < \infty$, $\theta > 0$;
(c) $f(x|\theta) = (\log\theta)\theta^x/(\theta-1)$, $0 < x < 1$, $\theta > 1$;
(d) $f(x|\theta) = e^{-(x-\theta)}\exp\{-e^{-(x-\theta)}\}$, $-\infty < x < \infty$, $-\infty < \theta < \infty$;
(e) $f(x|\theta) = \binom{2}{x}\theta^x(1-\theta)^{2-x}$, $x = 0, 1, 2$, $0 \le \theta \le 1$.
Solution. (a) We show that $Y = \max_i X_i$ is sufficient and complete for $\theta$. The joint pdf of $X_1, \ldots, X_n$ is
$$f(x_1, \ldots, x_n \mid \theta) = 2^n\theta^{-2n}\prod_{i=1}^n x_i\,I\big(0 < \max_i x_i < \theta\big).$$
By the Factorization Theorem, $Y$ is sufficient for $\theta$. In addition, the pdf of $Y$ is
$$f_Y(y) = \frac{2n}{\theta^{2n}}y^{2n-1}, \quad 0 < y < \theta.$$
Suppose $g(y)$ is a function such that
$$E\{g(Y)\} = \int_0^\theta g(y)\frac{2n}{\theta^{2n}}y^{2n-1}\,dy = 0$$
for all $\theta > 0$. Taking derivatives with respect to $\theta$, we obtain $g(\theta)\frac{2n}{\theta^{2n}}\theta^{2n-1} = 0$ for all $\theta > 0$. Thus $P(g(Y) = 0) = 1$, so $Y$ is complete.
(b) The pdf can be rewritten as
$$f(x \mid \theta) = \frac{\theta}{(1+x)^{1+\theta}} = \theta\exp\{-(1+\theta)\log(1+x)\},$$
which shows that the distribution belongs to an exponential family with $\omega(\theta) = -(1+\theta)$ and $t(x) = \log(1+x)$. In addition, $\{\omega(\theta) : \theta > 0\} = (-\infty, -1)$ is an open set in $\mathbb{R}$. Thus, $\sum_{i=1}^n \log(1+X_i)$ is complete and sufficient for $\theta$.
(c) The pdf can be rewritten as
$$f(x \mid \theta) = \frac{(\log\theta)\theta^x}{\theta - 1} = \frac{\log\theta}{\theta - 1}\exp\{(\log\theta)\times x\},$$
which shows that the distribution belongs to an exponential family with $\omega(\theta) = \log\theta$ and $t(x) = x$. In addition, $\{\omega(\theta) : \theta > 1\} = (0, \infty)$ is an open set in $\mathbb{R}$. Thus, $\sum_{i=1}^n X_i$ is complete and sufficient for $\theta$.
(d) The pdf can be rewritten as
$$f(x \mid \theta) = e^{-(x-\theta)}\exp\{-e^{-(x-\theta)}\} = e^\theta e^{-x}\exp\{-e^\theta\times e^{-x}\},$$
which shows that the distribution belongs to an exponential family with $\omega(\theta) = -e^\theta$ and $t(x) = e^{-x}$. In addition, $\{\omega(\theta) : \theta \in \mathbb{R}\} = (-\infty, 0)$ is an open set in $\mathbb{R}$. Thus, $\sum_{i=1}^n e^{-X_i}$ is complete and sufficient for $\theta$.
(e) The pmf can be rewritten as
$$f(x \mid \theta) = \binom{2}{x}\theta^x(1-\theta)^{2-x} = (1-\theta)^2\binom{2}{x}\exp\Big\{\log\Big(\frac{\theta}{1-\theta}\Big)\times x\Big\},$$
which shows that the distribution belongs to an exponential family with $\omega(\theta) = \log\big(\frac{\theta}{1-\theta}\big)$ and $t(x) = x$. In addition, $\{\omega(\theta) : \theta \in (0, 1)\} = (-\infty, \infty)$ contains an open set in $\mathbb{R}$. Thus, $\sum_{i=1}^n X_i$ is complete and sufficient for $\theta$.
25. (Question 6.30 of Casella & Berger) Let $X_1, \ldots, X_n$ be a random sample from the population with pdf $f(x|\mu) = e^{-(x-\mu)}$, where $-\infty < \mu < x < \infty$.
(a) Show that $X_{(1)} = \min_i X_i$ is a complete and sufficient statistic for $\mu$.
(b) Use Basu's Theorem to show that $X_{(1)}$ and $S_n^2$ are independent, where $S_n^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \bar{X}_n)^2$ is the sample variance.
Solution. (a) The joint pdf of $X_1, \ldots, X_n$ is
$$f(x_1, \ldots, x_n \mid \mu) = e^{n\mu - \sum_{i=1}^n x_i}I(x_{(1)} > \mu).$$
By the Factorization Theorem, $Y = X_{(1)} = \min_i X_i$ is sufficient for $\mu$. In addition, the pdf of $Y$ is
$$f_Y(y) = ne^{-n(y-\mu)}, \quad -\infty < \mu < y < \infty.$$
Suppose $g(y)$ is a function such that
$$E\{g(Y)\} = \int_\mu^\infty g(y)ne^{-n(y-\mu)}\,dy = 0$$
for all $\mu$. This is equivalent to $\int_\mu^\infty g(y)ne^{-ny}\,dy = 0$ for all $\mu$. Taking derivatives with respect to $\mu$, we obtain $-g(\mu)ne^{-n\mu} = 0$ for all $\mu$. Thus $P(g(Y) = 0) = 1$, so $Y$ is complete and sufficient for $\mu$.
(b) It is clear that $f(x \mid \mu)$ forms a location family with location parameter $\mu$, so we can write $X_i = \mu + Z_i$, where $Z_1, \ldots, Z_n$ are i.i.d. from $f(x \mid 0)$. Then
$$S_n^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \bar{X}_n)^2 = \frac{1}{n-1}\sum_{i=1}^n\{(\mu + Z_i) - (\mu + \bar{Z}_n)\}^2 = \frac{1}{n-1}\sum_{i=1}^n(Z_i - \bar{Z}_n)^2.$$
Because $S_n^2$ is a function of $Z_1, \ldots, Z_n$ only, the distribution of $S_n^2$ does not depend on $\mu$; that is, $S_n^2$ is ancillary. Therefore, by Basu's Theorem and the fact that $X_{(1)}$ is complete and sufficient for $\mu$, $X_{(1)}$ and $S_n^2$ are independent.
26. (Question 6.40 of Casella & Berger) Let $X_1, \ldots, X_n$ be i.i.d. observations from a location-scale family. Let $T_1(X_1, \ldots, X_n)$ and $T_2(X_1, \ldots, X_n)$ be two statistics that both satisfy
$$T_i(ax_1 + b, \ldots, ax_n + b) = aT_i(x_1, \ldots, x_n), \quad i = 1, 2,$$
for all values of $x_1, \ldots, x_n$ and $b$ and for any $a > 0$.
(a) Show that $T_1/T_2$ is an ancillary statistic.
(b) Let $R_n = X_{(n)} - X_{(1)}$ be the sample range and $S_n$ be the sample standard deviation. Verify that $R_n$ and $S_n$ satisfy the above condition, so that $R_n/S_n$ is an ancillary statistic.
Solution. (a) Because $X_1, \ldots, X_n$ is from a location-scale family (say with location parameter $\mu$ and scale parameter $\sigma$), we can write $X_i = \sigma Z_i + \mu$ for $i = 1, \ldots, n$, where $Z_1, \ldots, Z_n$ is a random sample from the standard pdf, which is free of $\mu$ and $\sigma$. Then
$$\frac{T_1}{T_2} = \frac{T_1(X_1, \ldots, X_n)}{T_2(X_1, \ldots, X_n)} = \frac{T_1(\sigma Z_1 + \mu, \ldots, \sigma Z_n + \mu)}{T_2(\sigma Z_1 + \mu, \ldots, \sigma Z_n + \mu)} = \frac{\sigma T_1(Z_1, \ldots, Z_n)}{\sigma T_2(Z_1, \ldots, Z_n)} = \frac{T_1(Z_1, \ldots, Z_n)}{T_2(Z_1, \ldots, Z_n)}.$$
Because $T_1/T_2$ is a function of $Z_1, \ldots, Z_n$ only, the distribution of $T_1/T_2$ does not depend on $\mu$ or $\sigma$; that is, $T_1/T_2$ is an ancillary statistic.
(b) Let $R(x_1, \ldots, x_n) = x_{(n)} - x_{(1)}$. If $a > 0$, $\max\{ax_1 + b, \ldots, ax_n + b\} = ax_{(n)} + b$ and $\min\{ax_1 + b, \ldots, ax_n + b\} = ax_{(1)} + b$. Thus, $R(ax_1 + b, \ldots, ax_n + b) = (ax_{(n)} + b) - (ax_{(1)} + b) = a(x_{(n)} - x_{(1)}) = aR(x_1, \ldots, x_n)$. For the sample variance we have
$$S^2(ax_1 + b, \ldots, ax_n + b) = \frac{1}{n-1}\sum_{i=1}^n\Big\{(ax_i + b) - n^{-1}\sum_{j=1}^n(ax_j + b)\Big\}^2 = a^2\frac{1}{n-1}\sum_{i=1}^n\Big(x_i - n^{-1}\sum_{j=1}^n x_j\Big)^2 = a^2 S^2(x_1, \ldots, x_n).$$
Thus, $S(ax_1 + b, \ldots, ax_n + b) = aS(x_1, \ldots, x_n)$. Therefore, $R$ and $S$ both satisfy the above condition, and $R_n/S_n$ is an ancillary statistic.
27. Suppose $X_1, \ldots, X_n$ is a random sample from the Exponential($\theta$) distribution, where $\theta > 0$ is unknown. Let $T = \sum_{i=1}^n X_i$. Show that
(a) $E\big(\frac{X_{(i)}}{T}\big) = \frac{E(X_{(i)})}{E(T)}$.
(b) $E(X_{(i)} \mid T) = E\big(\frac{X_{(i)}}{T}\,T \mid T\big) = T\,\frac{E(X_{(i)})}{E(T)}$.
Solution. We use two facts directly in this question: (1) $T = \sum_{i=1}^n X_i$ is complete and sufficient for $\theta$, since Exponential($\theta$) is an exponential family; (2) $X_{(i)}/T$ is ancillary, since Exponential($\theta$) is a scale family. Then, by Basu's Theorem, $T$ is independent of $X_{(i)}/T$.
(a) As $E(X_{(i)}) = E\big(T\times\frac{X_{(i)}}{T}\big) = E(T)\times E(X_{(i)}/T)$ by independence, we have $E\big(\frac{X_{(i)}}{T}\big) = \frac{E(X_{(i)})}{E(T)}$.
(b) $E(X_{(i)} \mid T) = E\big(\frac{X_{(i)}}{T}\,T \mid T\big) = T\,E\big(\frac{X_{(i)}}{T} \mid T\big) = T\,E\big(\frac{X_{(i)}}{T}\big) = T\,\frac{E(X_{(i)})}{E(T)}$, where the third equality holds because $T$ and $X_{(i)}/T$ are independent, and the last equality follows from (a).
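A quick Monte Carlo illustration (added; here Exponential($\theta$) is taken to have mean $\theta$, which is an assumption about the parameterisation): the two sides of (a) agree numerically for $X_{(1)}$.

import numpy as np

rng = np.random.default_rng(8)
theta, n, reps = 2.0, 6, 400_000
x = rng.exponential(theta, (reps, n))      # scale (mean) parameter theta
xmin, t = x.min(axis=1), x.sum(axis=1)     # X_(1) and T for each replication
print("E(X_(1)/T)    :", np.mean(xmin / t))
print("E(X_(1))/E(T) :", xmin.mean() / t.mean())   # both close to 1/n^2 = 1/36 here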
28. (Question 6.36 of Casella & Berger) One advantage of using a minimal sufficient statis-
tic is that unbiased estimators will have smaller variance. Suppose that T1 is a suffi-
cient statistic and T2 is minimal sufficient, U is an unbiased estimator of θ, and define
U1 = E(U | T1) and U2 = E(U | T2).
(a) Show that U2 = E(U1 | T2).
(b) Show that Var(U2) ≤ Var(U1).
Solution. (a) As T2 is minimal sufficient, T2 is a function of T1. Then E(U1 | T2) =
E(E(U | T1) | T2) = E(U | T2) = U2.
(b) By the conditional variance identity (Theorem 4.4.7 of Casella & Berger),
Var(U1) = E{Var(U1 | T2)}+ Var{E(U1 | T2)} = E{Var(U1 | T2)}+ Var(U2),
which implies Var(U2) ≤ Var(U1) immediately as E{Var(U1 | T2)} ≥ 0.
29. (Question 7.37 of Casella & Berger) Let $X_1, \ldots, X_n$ be a random sample from the Uniform distribution on the interval $[-\theta, \theta]$, where $\theta > 0$. Find the UMVUE of $\theta$.

Solution. To find the UMVUE of $\theta$, we first find a complete and sufficient statistic. The joint pdf of $X_1, \ldots, X_n$ is
$$f(x_1, \ldots, x_n \mid \theta) = (2\theta)^{-n}\,I\Big(0 \le \max_i|x_i| < \theta\Big).$$
By the Factorization Theorem, $\max_i|X_i|$ is a sufficient statistic. To check that it is also complete, let $Y = \max_i|X_i|$. Note that the pdf of $Y$ is $f_Y(y) = ny^{n-1}/\theta^n$ for $0 < y < \theta$. Suppose $g(y)$ is a function such that
$$E\{g(Y)\} = \int_0^\theta g(y)ny^{n-1}/\theta^n\,dy = 0$$
for all $\theta > 0$. Taking derivatives on both sides (after multiplying by $\theta^n$) shows that $n\theta^{n-1}g(\theta) = 0$ for all $\theta > 0$. So $g(\theta) = 0$ for all $\theta > 0$, and $P(g(Y) = 0) = 1$. Thus, $Y = \max_i|X_i|$ is a complete and sufficient statistic. Now
$$E(Y) = \int_0^\theta y\,ny^{n-1}/\theta^n\,dy = \frac{n}{n+1}\theta,$$
which implies that $E\big(\frac{n+1}{n}Y\big) = \theta$. By the Lehmann--Scheffé Theorem, $\frac{n+1}{n}\max_i|X_i|$ is the UMVUE of $\theta$.
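A simulation check (added; $\theta$ and $n$ are arbitrary, and the comparison estimator is an extra illustration): $\frac{n+1}{n}\max_i|X_i|$ is unbiased for $\theta$ and has much smaller variance than the moment-type estimator $2\,\overline{|X|}$, which is also unbiased since $E|X_1| = \theta/2$.

import numpy as np

rng = np.random.default_rng(9)
theta, n, reps = 1.5, 15, 300_000
x = rng.uniform(-theta, theta, (reps, n))
umvue  = (n + 1) / n * np.abs(x).max(axis=1)
moment = 2 * np.abs(x).mean(axis=1)
print("UMVUE : mean", umvue.mean(), " var", umvue.var())
print("moment: mean", moment.mean(), " var", moment.var())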
30. Let $X_1, \ldots, X_n$ be a random sample from the Bernoulli($p$) distribution. Find the UMVUE of $p^4$. Hint: consider $T_0 = X_1X_2X_3X_4$ as the starting unbiased estimator.
Solution. Here we show a more general case: find the UMVUE of $p^m$, where $m$ is a positive integer and $m \le n$ (the question is the special case $m = 4$).

Let $T = \sum_{i=1}^n X_i$. Then $T$ is a complete and sufficient statistic for $p$.

Method 1: As $E\big(\prod_{i=1}^m X_i\big) = p^m$, $\prod_{i=1}^m X_i$ is an unbiased estimator of $\gamma(p) = p^m$. By the Rao--Blackwell Theorem and the completeness of $T$, the UMVUE of $\gamma(p) = p^m$ is $E\big(\prod_{i=1}^m X_i \mid T\big)$. For $t \ge m$,
$$E\Big(\prod_{i=1}^m X_i \,\Big|\, T = t\Big) = P\Big(\prod_{i=1}^m X_i = 1 \,\Big|\, T = t\Big) \quad\Big(\text{as } \prod_{i=1}^m X_i \text{ can only be 0 or 1}\Big)$$
$$= P\Big(X_1 = 1, \ldots, X_m = 1 \,\Big|\, \sum_{i=1}^n X_i = t\Big)
= \frac{P\big(X_1 = 1, \ldots, X_m = 1, \sum_{i=1}^n X_i = t\big)}{P(T = t)}
= \frac{P\big(X_1 = 1, \ldots, X_m = 1, \sum_{i=m+1}^n X_i = t - m\big)}{P(T = t)}$$
$$= \frac{p^m\times\binom{n-m}{t-m}p^{t-m}(1-p)^{n-t}}{\binom{n}{t}p^t(1-p)^{n-t}} = \frac{\binom{n-m}{t-m}}{\binom{n}{t}}.$$
In addition, $E\big(\prod_{i=1}^m X_i \mid T = t\big) = 0$ if $t < m$. So the UMVUE of $\gamma(p) = p^m$ is
$$\begin{cases} \binom{n-m}{T-m}\big/\binom{n}{T}, & T = m, \ldots, n, \\ 0, & T = 0, \ldots, m-1. \end{cases}$$
Method 2: By the Lehmann--Scheffé Theorem, if $h$ is a real function satisfying $E\{h(T)\} = \gamma(p) = p^m$ for all $p \in (0, 1)$, then $h(T)$ is the UMVUE of $\gamma(p)$. Note that
$$E\{h(T)\} = p^m
\iff \sum_{k=0}^n\binom{n}{k}h(k)p^k(1-p)^{n-k} = p^m
\iff \sum_{k=0}^n\binom{n}{k}h(k)p^{k-m}(1-p)^{(n-m)-(k-m)} = 1.$$
If $k < m$, then $p^{k-m} \to \infty$ as $p \to 0$. Hence, we must have $h(k) = 0$ for $k = 0, 1, \ldots, m-1$. Then
$$\sum_{k=m}^n\binom{n}{k}h(k)p^{k-m}(1-p)^{(n-m)-(k-m)} = 1$$
for all $p \in (0, 1)$. On the other hand,
$$\sum_{k=m}^n\binom{n-m}{k-m}p^{k-m}(1-p)^{(n-m)-(k-m)} = 1$$
for all $p \in (0, 1)$. Then $\binom{n}{k}h(k) = \binom{n-m}{k-m}$ for $k = m, \ldots, n$. So the UMVUE of $\gamma(p) = p^m$ is
$$h(T) = \begin{cases} \binom{n-m}{T-m}\big/\binom{n}{T}, & T = m, \ldots, n, \\ 0, & T = 0, \ldots, m-1. \end{cases}$$
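An exact check (added; the values of $n$ and $m$ are illustrative): verify that $h(T) = \binom{n-m}{T-m}/\binom{n}{T}$ is unbiased for $p^m$ when $T \sim \text{Binomial}(n, p)$.

from math import comb

def expectation_h(n, m, p):
    total = 0.0
    for t in range(n + 1):
        h = comb(n - m, t - m) / comb(n, t) if t >= m else 0.0
        total += h * comb(n, t) * p**t * (1 - p) ** (n - t)
    return total

n, m = 10, 4
for p in (0.2, 0.5, 0.8):
    print(p, expectation_h(n, m, p), p**m)   # the last two columns agree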
31. Suppose that $T$ is a UMVUE of an unknown parameter $\theta$. Show that $T^k$ is a UMVUE of $E(T^k)$, where $k$ is any positive integer for which $E(T^{2k}) < \infty$.

Solution. We use the following theorem:

Theorem 1.1. Let $\mathcal{U}$ be the set of all unbiased estimators of 0 with finite variance, and let $T$ be an unbiased estimator of $\theta$ with $E(T^2) < \infty$. A necessary and sufficient condition for $T$ to be a UMVUE of $\theta$ is that $E(TU) = 0$ for any $\theta$ and any $U \in \mathcal{U}$.

Let $U$ be an unbiased estimator of 0. Since $T$ is a UMVUE of $\theta$, $E(TU) = 0$ for any $\theta$, which means that $TU$ is an unbiased estimator of 0. Then
$$E(T^2 U) = E\{T(TU)\} = 0$$
if $E(T^4) < \infty$. By the above theorem, $T^2$ is a UMVUE of $E(T^2)$. Similarly, we can show that $T^3$ is a UMVUE of $E(T^3)$. By mathematical induction, $T^k$ is a UMVUE of $E(T^k)$, provided that $E(T^{2k}) < \infty$.
32. (Question 7.50 of Casella & Berger) Let $X_1, \ldots, X_n$ be i.i.d. from $N(\theta, \theta^2)$. Denote $T_1 = \bar{X}_n$ and $T_2 = cS_n$ with $c = \frac{\sqrt{n-1}\,\Gamma((n-1)/2)}{\sqrt{2}\,\Gamma(n/2)}$, where $S_n$ is the sample standard deviation.
(a) Show that both $T_1$ and $T_2$ are unbiased estimators of $\theta$.
(b) Prove that for any number $a$, the estimator $aT_1 + (1-a)T_2$ is an unbiased estimator of $\theta$.
(c) Find the value of $a$ that produces the estimator with minimum variance.
(d) Show that $(\bar{X}_n, S_n^2)$ is a sufficient statistic for $\theta$ but not a complete statistic.
Solution. (a) First, $E(\bar{X}_n) = \theta$, which indicates that $T_1$ is unbiased for $\theta$. As $(n-1)S_n^2/\theta^2 \sim \chi^2_{n-1}$, we have
$$E(S_n) = E\Big[\frac{\theta}{\sqrt{n-1}}\Big\{\frac{(n-1)S_n^2}{\theta^2}\Big\}^{1/2}\Big]
= \frac{\theta}{\sqrt{n-1}}\int_0^\infty x^{1/2}\frac{1}{2^{(n-1)/2}\Gamma((n-1)/2)}x^{(n-1)/2-1}e^{-x/2}\,dx
= \frac{\sqrt{2}\,\Gamma(n/2)}{\sqrt{n-1}\,\Gamma((n-1)/2)}\theta,$$
which implies that $E(cS_n) = \theta$ with $c = \frac{\sqrt{n-1}\,\Gamma((n-1)/2)}{\sqrt{2}\,\Gamma(n/2)}$. Thus, $T_2$ is also unbiased for $\theta$.
(b) As
$$E\{aT_1 + (1-a)T_2\} = aE(T_1) + (1-a)E(T_2) = a\theta + (1-a)\theta = \theta,$$
$aT_1 + (1-a)T_2$ is an unbiased estimator of $\theta$.
(c) First, $\bar{X}_n$ and $S_n^2$ are independent for this normal model. In addition, $\mathrm{Var}(T_1) = \theta^2/n$ and $\mathrm{Var}(T_2) = c^2E(S_n^2) - \theta^2 = c^2\theta^2 - \theta^2 = (c^2 - 1)\theta^2$. Therefore,
$$\mathrm{Var}\{aT_1 + (1-a)T_2\} = a^2\mathrm{Var}(T_1) + (1-a)^2\mathrm{Var}(T_2) = a^2\theta^2/n + (1-a)^2(c^2 - 1)\theta^2.$$
We can use calculus to show that this quadratic function of $a$ is minimized at
$$a = \frac{c^2 - 1}{c^2 - 1 + 1/n}.$$
(d) Similar to Question 9 of Assignment 1; the solution will be provided later.
33. (Question 7.59 of Casella & Berger) Let $X_1, \ldots, X_n$ be i.i.d. from $N(\theta, \sigma^2)$. Find the UMVUE of $\sigma^p$, where $p$ is a known positive constant, not necessarily an integer.

Solution. First, we know $T = (n-1)S_n^2/\sigma^2 \sim \chi^2_{n-1}$, where $S_n^2 = (n-1)^{-1}\sum_{i=1}^n(X_i - \bar{X}_n)^2$ is the sample variance. Then
$$E(T^{p/2}) = \int_0^\infty x^{p/2}\frac{1}{2^{(n-1)/2}\Gamma((n-1)/2)}x^{(n-1)/2-1}e^{-x/2}\,dx
= \frac{1}{2^{(n-1)/2}\Gamma((n-1)/2)}\int_0^\infty x^{(p+n-1)/2-1}e^{-x/2}\,dx
= \frac{2^{p/2}\Gamma\big(\frac{p+n-1}{2}\big)}{\Gamma\big(\frac{n-1}{2}\big)}.$$
Thus,
$$E\Big\{\frac{\Gamma\big(\frac{n-1}{2}\big)}{2^{p/2}\Gamma\big(\frac{p+n-1}{2}\big)}\big((n-1)S_n^2\big)^{p/2}\Big\} = \sigma^p.$$
As $(\bar{X}_n, S_n^2)$ is a complete and sufficient statistic, by the Lehmann--Scheffé Theorem,
$$\frac{\Gamma\big(\frac{n-1}{2}\big)}{2^{p/2}\Gamma\big(\frac{p+n-1}{2}\big)}(n-1)^{p/2}S_n^p$$
is the UMVUE of $\sigma^p$.
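A Monte Carlo check (added; $p$ and the other parameter values are illustrative): the estimator above is unbiased for $\sigma^p$.

import math
import numpy as np

rng = np.random.default_rng(10)
theta, sigma, n, p, reps = 0.0, 2.0, 12, 1.5, 300_000
x = rng.normal(theta, sigma, (reps, n))
s2 = x.var(axis=1, ddof=1)                 # sample variance S_n^2 for each replication
log_const = math.lgamma((n - 1) / 2) - (p / 2) * math.log(2) - math.lgamma((p + n - 1) / 2)
est = math.exp(log_const) * ((n - 1) * s2) ** (p / 2)
print("mean of estimator:", est.mean(), "  sigma^p:", sigma**p)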

