THE UNIVERSITY OF NEW SOUTH WALES
DEPARTMENT OF STATISTICS
MID SESSION TEST - 2021 - Wednesday, 31st March (Week 7)
Solutions
MATH5905
Time allowed: 135 minutes
1. Let X = (X_1, X_2, \dots, X_n) be a sample of i.i.d. Geometric(θ) random variables with
probability mass function
f(x; \theta) = \theta (1-\theta)^x, \quad x \in \{0, 1, 2, \dots\}, \quad \theta \in (0, 1),
with
E(X) = \frac{1-\theta}{\theta} \quad \text{and} \quad \mathrm{Var}(X) = \frac{1-\theta}{\theta^2}.
a) Provide justification why the statistic T(X) = \sum_{i=1}^n X_i is complete and sufficient for θ.
The Geometric distribution belongs to the one-parameter exponential family since
f(x; \theta) = \theta \exp\{x \log(1-\theta)\}
with d(x) = x, and hence T(X) = \sum_{i=1}^n X_i is complete and (minimal) sufficient for θ.
b) Derive the UMVUE of h(θ) = θ. You must justify each step in your answer.
Hint: Use the interpretation that P(X_1 = 0) = θ and the fact that
T = \sum_{i=1}^n X_i \sim \text{Negative Binomial}(n, \theta)
with probability mass function
P(T = t) = \binom{n+t-1}{t} \theta^n (1-\theta)^t, \quad t = 0, 1, 2, \dots
1) T is sufficient and complete for θ.
2) Let W = I_{\{X_1 = 0\}}(X). Then, using the fact that P(X_1 = 0) = θ, we have
E(W) = E\big( I_{\{X_1 = 0\}}(X) \big) = P(X_1 = 0) = \theta,
so W is an unbiased estimator of h(θ) = θ.
3) Now apply the Lehmann-Scheffe theorem to obtain
\hat{\tau}(t) = E(W \mid T = t)
= E\Big( I_{\{X_1 = 0\}}(X) \,\Big|\, \sum_{i=1}^n X_i = t \Big)
= P\Big( X_1 = 0 \,\Big|\, \sum_{i=1}^n X_i = t \Big)
= \frac{P\big( X_1 = 0, \, \sum_{i=1}^n X_i = t \big)}{P\big( \sum_{i=1}^n X_i = t \big)}.
On the event \{X_1 = 0\} the condition \sum_{i=1}^n X_i = t is the same as \sum_{i=2}^n X_i = t. Hence, using the independence of X_1 from the remaining observations and the fact that \sum_{i=2}^n X_i \sim \text{Negative Binomial}(n-1, \theta), this reduces to
\hat{\tau}(t) = \frac{P\big( X_1 = 0, \, \sum_{i=2}^n X_i = t \big)}{P\big( \sum_{i=1}^n X_i = t \big)}
= \frac{P(X_1 = 0) \, P\big( \sum_{i=2}^n X_i = t \big)}{P\big( \sum_{i=1}^n X_i = t \big)}
= \frac{\theta \binom{n-1+t-1}{t} \theta^{n-1} (1-\theta)^t}{\binom{n+t-1}{t} \theta^n (1-\theta)^t}
= \frac{(n+t-2)!}{t! \, (n-2)!} \cdot \frac{t! \, (n-1)!}{(n+t-1)!}
= \frac{n-1}{n+t-1}.
Therefore the UMVUE of h(θ) = θ is
\hat{h}_{\mathrm{UMVUE}}(\theta) = \frac{n-1}{n-1+\sum_{i=1}^n X_i}.
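As an optional sanity check (not part of the required working), the unbiasedness of this estimator can be verified by simulation. The sketch below assumes numpy is available; θ = 0.4, n = 10 and the number of replications are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)
    theta, n, reps = 0.4, 10, 200_000

    # numpy's geometric counts the number of trials (support 1, 2, ...), so subtract 1
    # to match the "number of failures before the first success" convention used here.
    X = rng.geometric(theta, size=(reps, n)) - 1
    T = X.sum(axis=1)

    umvue = (n - 1) / (n - 1 + T)
    print(umvue.mean())   # should be close to theta = 0.4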
c) Calculate the Cramer-Rao lower bound for the minimal variance of an unbiased
estimator of h(θ) = θ.
First we calculate the Fisher information as follows:
\log f(x; \theta) = \log\theta + x \log(1-\theta)
\frac{\partial}{\partial\theta} \log f(x; \theta) = \frac{1}{\theta} - \frac{x}{1-\theta}
\frac{\partial^2}{\partial\theta^2} \log f(x; \theta) = -\frac{1}{\theta^2} - \frac{x}{(1-\theta)^2}
The Fisher information in a single observation is:
I_{X_1}(\theta) = -E\Big[ \frac{\partial^2}{\partial\theta^2} \log f(X_1; \theta) \Big]
= \frac{1}{\theta^2} + \frac{E(X_1)}{(1-\theta)^2}
= \frac{1}{\theta^2} + \frac{1-\theta}{\theta(1-\theta)^2}
= \frac{1}{\theta^2} + \frac{1}{\theta(1-\theta)}
= \frac{1-\theta}{\theta^2(1-\theta)} + \frac{\theta}{\theta^2(1-\theta)}
= \frac{1}{\theta^2(1-\theta)}.
Hence the Fisher information for the whole sample is
I_X(\theta) = n I_{X_1}(\theta) = \frac{n}{\theta^2(1-\theta)}.
Then notice that
\frac{\partial}{\partial\theta} h(\theta) = 1.
Hence the Cramer-Rao lower bound is
\frac{\big( \frac{\partial}{\partial\theta} h(\theta) \big)^2}{I_X(\theta)} = \frac{\theta^2(1-\theta)}{n}.
d) Show that the variance of the UMVUE of h(θ) does not attain the Cramer-Rao lower bound found in part (c). Also, show that for the parameter τ(θ) = 1/θ the bound is attainable.
To show whether or not the Cramer-Rao bound is attainable we look at the score function. First, the likelihood is:
L(X, \theta) = \prod_{i=1}^n \theta (1-\theta)^{X_i} = \theta^n (1-\theta)^{\sum_{i=1}^n X_i}
with log-likelihood
\ell(X, \theta) = \log L(X, \theta) = n \log\theta + \sum_{i=1}^n X_i \log(1-\theta).
Therefore, we can write the score as:
V(X, \theta) = \frac{\partial}{\partial\theta} \log L(X, \theta)
= \frac{n}{\theta} - \frac{\sum_{i=1}^n X_i}{1-\theta}
= \frac{n}{\theta} - \frac{n\bar{X}}{1-\theta}
= \frac{1}{\theta(1-\theta)} \big[ n(1-\theta) - n\bar{X}\theta \big].
For τ(θ) = 1/θ:
V(X, \theta) = \frac{n}{\theta(1-\theta)} \big[ 1 - \theta(1+\bar{X}) \big] = -\frac{n}{1-\theta} \Big[ (1+\bar{X}) - \frac{1}{\theta} \Big],
which is of the form k(\theta)\big[ W(X) - \tau(\theta) \big] with W(X) = 1 + \bar{X}, an unbiased estimator of 1/θ, so the bound is attained for τ(θ) = 1/θ.
For h(θ) = θ, on the other hand, multiplying throughout by θ/θ gives
V(X, \theta) = \frac{n}{\theta(1-\theta)} \big[ 1 - \bar{X}\theta - \theta \big],
which cannot be written in the form k(\theta)\big[ W(X) - \theta \big] for any statistic W(X) not depending on θ; hence the bound is not attained for h(θ) = θ.
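An optional simulation sketch of this conclusion (numpy assumed; θ, n and the replication count are illustrative): the empirical variance of the UMVUE should sit strictly above the bound θ²(1−θ)/n from part (c), consistent with the bound not being attained for h(θ) = θ.

    import numpy as np

    rng = np.random.default_rng(1)
    theta, n, reps = 0.4, 10, 500_000

    X = rng.geometric(theta, size=(reps, n)) - 1     # failures-before-first-success convention
    umvue = (n - 1) / (n - 1 + X.sum(axis=1))

    crlb = theta**2 * (1 - theta) / n
    print(umvue.var(), crlb)   # empirical variance exceeds the Cramer-Rao bound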
e) Determine the MLE \hat{h} of h(θ).
The score was shown to be:
V(X, \theta) = \frac{n}{\theta} - \frac{n\bar{X}}{1-\theta}.
Setting the score equal to zero leads to
n(1-\theta) - n\bar{X}\theta = 0
1 - \theta - \bar{X}\theta = 0
1 = \theta(\bar{X} + 1).
Hence
\hat{\theta} = \frac{1}{1 + \bar{X}}.
f) Suppose that the sample size is n = 6 and the sample mean is x̄ = 3. Compute the numerical value of the UMVUE in part (b) and the MLE in part (e). What would happen to these values as n → ∞ if the sample mean x̄ remained the same? Explain your answer.
The UMVUE is
\hat{h}_{\mathrm{UMVUE}}(\theta) = \frac{6-1}{6-1+3 \times 6} = \frac{5}{23} \approx 0.217.
The MLE is
\hat{h}_{\mathrm{MLE}}(\theta) = \frac{1}{1+3} = \frac{1}{4} = 0.25.
As n → ∞ the UMVUE would converge to the MLE value 1/(1+3) = 0.25, since
\lim_{n\to\infty} \frac{n-1}{n-1+\sum_{i=1}^n X_i} = \lim_{n\to\infty} \frac{1-\frac{1}{n}}{1-\frac{1}{n}+\bar{X}} = \frac{1}{1+\bar{X}}.
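These calculations can be reproduced with a few lines of plain Python (the additional values of n are illustrative and only meant to show the convergence):

    xbar = 3
    mle = 1 / (1 + xbar)                          # 0.25
    for n in (6, 60, 600, 6000):
        umvue = (n - 1) / (n - 1 + n * xbar)      # 5/23 ≈ 0.217 when n = 6
        print(n, umvue)
    print(mle)                                    # the UMVUE values approach this limit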
g) Consider testing H_0: θ ≥ 0.7 versus H_1: θ < 0.7 with a 0-1 loss in the Bayesian setting with the prior τ(θ) = 12θ^2(1-θ). What is your decision when n = 3 and \sum_{i=1}^3 x_i = 2? You may use:
\int_{0.7}^1 x^5 (1-x)^3 \, dx = 0.000536
Note: The continuous random variable X has a beta density f with parameters α > 0 and β > 0 if
f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha-1} (1-x)^{\beta-1}, \quad x \in (0, 1),
where
B(\alpha, \beta) = \int_0^1 x^{\alpha-1} (1-x)^{\beta-1} \, dx = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)},
and
\Gamma(\alpha+1) = \alpha\Gamma(\alpha), \quad \Gamma(k+1) = k! \text{ for integer } k.
First we need to compute the posterior. With n = 3 and \sum_{i=1}^3 x_i = 2 the likelihood is θ^3(1-θ)^2, so
h(\theta \mid x) \propto 12\theta^2(1-\theta) \times \theta^3(1-\theta)^2 \propto \theta^5(1-\theta)^3,
which implies that
\theta \mid X \sim \mathrm{Beta}(6, 4).
Hence we are interested in computing the posterior probability
P(\theta \ge 0.7 \mid X) = \int_{0.7}^1 \frac{1}{B(6,4)} \theta^5(1-\theta)^3 \, d\theta
= \frac{\Gamma(6+4)}{\Gamma(6)\Gamma(4)} \times \int_{0.7}^1 \theta^5(1-\theta)^3 \, d\theta
= \frac{9!}{5! \times 3!} \times 0.000536
= 504 \times 0.000536
= 0.270.
We compare this posterior probability with 0.5 since we are dealing with a 0-1 loss. Since this probability is smaller than 0.5, we must reject H_0.
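As an optional check (assuming scipy is available), the posterior tail probability of the Beta(6, 4) distribution can be computed directly:

    from scipy.stats import beta

    # Survival function of the Beta(6, 4) posterior at 0.7.
    print(beta.sf(0.7, 6, 4))   # ≈ 0.270, below 0.5, so H0 is rejected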
2. Let X_1, X_2, \dots, X_n be independent random variables with density
f(x; \theta) = \frac{\alpha}{\theta^\alpha} x^{\alpha-1}, \quad 0 \le x \le \theta,
where α is a known constant and θ > 0 is an unknown parameter. Let T = \max\{X_1, \dots, X_n\} = X_{(n)} be the maximum of the n observations.
a) Show that T = X_{(n)} is a sufficient statistic for the parameter θ.
First calculate the likelihood:
L(X, \theta) = \prod_{i=1}^n \frac{\alpha}{\theta^\alpha} X_i^{\alpha-1} I_{(X_i, \infty)}(\theta) = \frac{\alpha^n}{\theta^{n\alpha}} \Big( \prod_{i=1}^n X_i \Big)^{\alpha-1} I_{(X_{(n)}, \infty)}(\theta).
This factorises as g(X_{(n)}, \theta) \, h(X) with g(t, \theta) = \alpha^n \theta^{-n\alpha} I_{(t, \infty)}(\theta) and h(X) = \big( \prod_{i=1}^n X_i \big)^{\alpha-1}, so T = X_{(n)} is sufficient by the Neyman-Fisher Factorization Criterion.
b) Show that the density of T is
f_T(t; \theta) = \frac{n\alpha t^{n\alpha-1}}{\theta^{n\alpha}}, \quad 0 \le t \le \theta.
Hint: Compute the CDF first by using
P(T < t) = P(X_1 < t, X_2 < t, \dots, X_n < t).
For 0 ≤ x ≤ θ we have
F_X(x; \theta) = P(X \le x) = \int_0^x \frac{\alpha}{\theta^\alpha} y^{\alpha-1} \, dy = \Big[ \frac{\alpha}{\theta^\alpha} \frac{y^\alpha}{\alpha} \Big]_{y=0}^{y=x} = \Big( \frac{x}{\theta} \Big)^\alpha.
Hence, by independence,
F_T(t; \theta) = P(T \le t) = P(X_1 \le t, X_2 \le t, \dots, X_n \le t) = P(X_1 \le t)^n = \Big( \frac{t}{\theta} \Big)^{n\alpha}
for 0 ≤ t ≤ θ. Then by differentiation we obtain the PDF
f_T(t; \theta) = \frac{d}{dt} \Big( \frac{t}{\theta} \Big)^{n\alpha} = \frac{n\alpha t^{n\alpha-1}}{\theta^{n\alpha}}, \quad 0 \le t \le \theta,
and zero otherwise.
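An optional simulation sketch (numpy assumed; the values of α, θ and n are illustrative) that checks this CDF: samples are drawn by inverse-CDF sampling from F(x) = (x/θ)^α, and the empirical probability P(T ≤ t₀) is compared with (t₀/θ)^{nα}.

    import numpy as np

    rng = np.random.default_rng(2)
    alpha, theta, n, reps = 2.0, 5.0, 4, 200_000

    # Inverse-CDF sampling: F(x) = (x / theta)^alpha on [0, theta], so X = theta * U^(1/alpha).
    U = rng.uniform(size=(reps, n))
    X = theta * U ** (1.0 / alpha)
    T = X.max(axis=1)

    t0 = 4.0
    print((T <= t0).mean())               # empirical P(T <= t0)
    print((t0 / theta) ** (n * alpha))    # derived CDF value (t0 / theta)^(n*alpha)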
c) Find the MLE of θ and provide justification. Hint: Sketch the Likelihood
Function.
The MLE of θ is found by maximizing the likelihood function over all values of θ:
L(X, \theta) = \frac{\alpha^n}{\theta^{n\alpha}} \Big( \prod_{i=1}^n X_i \Big)^{\alpha-1} I_{(X_{(n)}, \infty)}(\theta).
As a function of θ this is zero until X_{(n)}, at which point it jumps to the value
\frac{\alpha^n}{X_{(n)}^{n\alpha}} \Big( \prod_{i=1}^n X_i \Big)^{\alpha-1},
and it is strictly decreasing thereafter (it is proportional to \theta^{-n\alpha}). Therefore the likelihood is maximised at θ = X_{(n)}, so X_{(n)} is the MLE. A sketch of the likelihood (see the plotting sketch below) makes this clear.
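A plotting sketch along these lines (numpy and matplotlib assumed; the data vector is a made-up illustrative sample) is given below.

    import numpy as np
    import matplotlib.pyplot as plt

    alpha = 2.0
    x = np.array([1.2, 0.7, 2.9, 1.8])   # illustrative observations
    n, xmax = len(x), x.max()

    theta = np.linspace(0.1, 6.0, 500)
    # Likelihood: zero for theta < X_(n), proportional to theta^(-n*alpha) afterwards.
    L = (alpha**n / theta**(n * alpha)) * np.prod(x)**(alpha - 1) * (theta >= xmax)

    plt.plot(theta, L)
    plt.axvline(xmax, linestyle="--")     # the maximiser sits at theta = X_(n)
    plt.xlabel("theta")
    plt.ylabel("L(x, theta)")
    plt.show()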
d) Show that the MLE is a biased estimator.
We calculate
E(X_{(n)}) = \int_0^\theta t \cdot \frac{n\alpha t^{n\alpha-1}}{\theta^{n\alpha}} \, dt
= \frac{n\alpha}{\theta^{n\alpha}} \int_0^\theta t^{n\alpha} \, dt
= \frac{n\alpha}{\theta^{n\alpha}} \Big[ \frac{t^{n\alpha+1}}{n\alpha+1} \Big]_{t=0}^{t=\theta}
= \frac{n\alpha}{n\alpha+1} \theta,
which does not equal θ, and therefore the MLE is a biased estimator.
e) Show that T = X_{(n)} is complete for θ.
The density of T is
f_T(t; \theta) = \frac{n\alpha t^{n\alpha-1}}{\theta^{n\alpha}}, \quad 0 \le t \le \theta.
Suppose that E_\theta[g(T)] = 0 for all θ > 0. This implies that
\int_0^\theta g(t) \frac{n\alpha t^{n\alpha-1}}{\theta^{n\alpha}} \, dt = \frac{n\alpha}{\theta^{n\alpha}} \int_0^\theta g(t) t^{n\alpha-1} \, dt = 0
for all θ > 0. Since \frac{n\alpha}{\theta^{n\alpha}} \ne 0, we get
\int_0^\theta g(t) t^{n\alpha-1} \, dt = 0
for all θ > 0. Differentiating both sides with respect to θ we get
g(\theta) \theta^{n\alpha-1} = 0
for all θ > 0, which implies that g(θ) = 0 for all θ > 0. This also means that P_\theta(g(T) = 0) = 1, and hence T = X_{(n)} is complete.
f) Hence or otherwise, determine the UMVUE of θ.
From part d) we know that
W = \frac{n\alpha+1}{n\alpha} X_{(n)}
is an unbiased estimator of θ. Since X_{(n)} is sufficient (part a)) and complete (part e)), and W is already a function of X_{(n)}, the Lehmann-Scheffe theorem gives
\hat{\theta} = E(W \mid X_{(n)}) = \frac{n\alpha+1}{n\alpha} X_{(n)}
as the unique UMVUE of θ.
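An optional simulation sketch (numpy assumed; α, θ and n are illustrative) checking both part d) and part f): the mean of X_{(n)} is close to nαθ/(nα+1), while the corrected estimator has mean approximately θ.

    import numpy as np

    rng = np.random.default_rng(3)
    alpha, theta, n, reps = 2.0, 5.0, 4, 200_000

    # Inverse-CDF sampling from f(x; theta) = (alpha / theta^alpha) x^(alpha - 1) on [0, theta].
    X = theta * rng.uniform(size=(reps, n)) ** (1.0 / alpha)
    T = X.max(axis=1)

    print(T.mean(), n * alpha * theta / (n * alpha + 1))   # MLE is biased downwards
    print(((n * alpha + 1) / (n * alpha) * T).mean())      # corrected estimator: mean ≈ theta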