ASSIGNMENT 2
For each part of each question below, state whether the answer given by ChatGPT is
correct or incorrect. To be correct, both the final answer and all workings/reasoning used
to obtain it must be provided and correct. Otherwise the answer is incorrect.
If you decide that ChatGPT’s answer is correct, simply write “Correct”. For example, if you
think question 1 part (a) is correct, write
1.(a) Correct.
If you decide that ChatGPT’s answer is incorrect, identify the exact location where
ChatGPT makes its first error and write 1-2 sentences to explain why. Then provide the
correct answer, including all workings/reasoning. For example, if you think question 1 part
(a) is incorrect, write
1.(a) Incorrect. In the first equality ($E[X] = \int_{-\infty}^{\infty} x F_X(x)\,dx$), ChatGPT’s answer incorrectly
uses the distribution function $F_X(x)$ instead of the density function $f_X(x)$. From that point,
we obtain the correct answer as follows:
$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^1 x \times 1\,dx = \left[x^2/2\right]_0^1 = 1/2 - 0 = 1/2.$$
To submit your assignment you should scan/photograph your answers and then upload to
Blackboard using the link provided. Answers may be handwritten or typed. Please make sure
that your answers are legible prior to submission.
1. Suppose that we have a random sample $X_1, X_2, \ldots, X_n$ from a distribution with mean $\mu$ and variance $\sigma^2$,
and let $n > 5$, $m = n - 5$. Given the estimators
$$\hat\mu_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \hat\mu_m = \frac{1}{m}\sum_{i=1}^{m} X_i,$$
we have
$$\frac{n(\hat\mu_n - \mu)^2}{\sigma^2} + \frac{m(\hat\mu_m - \mu)^2}{\sigma^2} \xrightarrow{d} \chi^2_2.$$
True or false? Why? (10 marks)
This statement is false.
To see why, let’s first consider the fact that the sum of squares of independent standard
normal random variables follows a chi-squared distribution with the degrees of freedom
equal to the number of variables being summed.
Given the estimators:
$$\hat\mu_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \hat\mu_m = \frac{1}{m}\sum_{i=1}^{m} X_i$$
We want to evaluate the expression:
$$\frac{n(\hat\mu_n - \mu)^2}{\sigma^2} + \frac{m(\hat\mu_m - \mu)^2}{\sigma^2}$$
Notice that this expression resembles the sum of two terms, each representing the square
of the difference between the sample mean and the population mean, scaled by $n$ and $m$
respectively, and then normalized by the variance $\sigma^2$.
However, the issue here is that $\hat\mu_n$ and $\hat\mu_m$ are not independent, as they share some
common observations. Thus, we cannot directly apply the result for sums of squares of
independent standard normal random variables.
Therefore, the expression does not converge in distribution to a chi-squared distribution
with 2 degrees of freedom ($\chi^2_2$).
In summary, the statement is false because the provided expression does not have the
convergence property stated.
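For intuition about the quantity discussed above, here is a minimal simulation sketch (assuming, purely for illustration, that $X_i \sim N(\mu, \sigma^2)$ with $\mu = 0$ and $\sigma^2 = 1$; the sample size and number of replications are arbitrary choices) that draws the statistic many times and compares its empirical quantiles with those of $\chi^2_2$.
```python
import numpy as np
from scipy import stats

# Illustrative simulation of the statistic in Question 1 (assumed setup:
# X_i ~ N(0, 1), so mu = 0 and sigma^2 = 1; n and reps are arbitrary).
rng = np.random.default_rng(0)
n, reps = 100, 50_000
m = n - 5
mu, sigma2 = 0.0, 1.0

X = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
mu_hat_n = X.mean(axis=1)          # mean of all n observations
mu_hat_m = X[:, :m].mean(axis=1)   # mean of the first m observations

stat = n * (mu_hat_n - mu) ** 2 / sigma2 + m * (mu_hat_m - mu) ** 2 / sigma2

# Compare empirical quantiles of the statistic with chi-squared(2) quantiles.
for q in (0.5, 0.9, 0.95):
    print(q, np.quantile(stat, q), stats.chi2.ppf(q, df=2))
```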
2. Given a random sample $X_1, X_2, \ldots, X_n$ from a uniform distribution with mean $\mu$ and
variance $\sigma^2$ and
$$\bar X = \frac{1}{n}\sum_{i=1}^{n} X_i,$$
we have $2\sqrt{n}(\bar X - \mu) \xrightarrow{d} N(0, 4\sigma^2)$. True or false? Why? (10 marks)
This statement is true.
Given a random sample $X_1, X_2, \ldots, X_n$ from a uniform distribution with mean $\mu$ and
variance $\sigma^2$, the central limit theorem (CLT) tells us that the sample mean $\bar X$ converges
in distribution to a normal distribution as $n$ approaches infinity.
Specifically, we have:
$$\sqrt{n}(\bar X - \mu) \xrightarrow{d} N(0, \sigma^2)$$
Multiplying both sides by 2 yields:
$$2\sqrt{n}(\bar X - \mu) \xrightarrow{d} N(0, 4\sigma^2)$$
Therefore, the statement is true. The sample mean $\bar X$ converges in distribution to a
normal distribution with mean 0 and variance $4\sigma^2$, as the sample size $n$ grows.
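The scaling being discussed can be inspected with a small simulation sketch (assuming, for concreteness, $X_i \sim \mathrm{Uniform}(0,1)$, so $\mu = 1/2$ and $\sigma^2 = 1/12$; the sample size and number of replications are arbitrary).
```python
import numpy as np

# Illustrative simulation for Question 2 (assumed setup: X_i ~ Uniform(0, 1),
# so mu = 1/2 and sigma^2 = 1/12; n and reps are arbitrary).
rng = np.random.default_rng(1)
n, reps = 500, 20_000
mu, sigma2 = 0.5, 1.0 / 12.0

X = rng.uniform(0.0, 1.0, size=(reps, n))
T = 2 * np.sqrt(n) * (X.mean(axis=1) - mu)

# If T is approximately N(0, 4*sigma^2), its sample variance should be near 4*sigma^2.
print("sample variance of T:", T.var())
print("4 * sigma^2:         ", 4 * sigma2)
```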
3. Consider the regression model
$$Y_i = \beta_1 + \beta_2 X_i + U_i, \qquad E[U_i \mid X_i] = 0, \qquad E[U_i^2 \mid X_i] = \sigma^2$$
and suppose that you have a bivariate random sample of $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$.
(a) The OLS estimator of $\beta_2$ is
$$\hat\beta_2 = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2}.$$
Let
$$\tilde\beta_2 = \frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}.$$
Show that $\hat\beta_2 = \tilde\beta_2$ (3 marks).
To show that $\hat\beta_2 = \tilde\beta_2$, we’ll start by expressing both estimators and then show they
are equivalent.
1. **OLS estimator $\hat\beta_2$**:
$$\hat\beta_2 = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2}$$
2. **Alternative estimator $\tilde\beta_2$**:
$$\tilde\beta_2 = \frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}$$
Now, let’s manipulate $\tilde\beta_2$ to match $\hat\beta_2$:
$$\begin{aligned}
\tilde\beta_2 &= \frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2} \\
&= \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y + \bar Y)}{\sum_i (X_i - \bar X)^2} \qquad \text{(adding and subtracting } \bar Y\text{)} \\
&= \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y) + \sum_i (X_i - \bar X)\bar Y}{\sum_i (X_i - \bar X)^2} \\
&= \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y) + \bar Y \sum_i (X_i - \bar X)}{\sum_i (X_i - \bar X)^2} \qquad \text{(pulling out } \bar Y\text{)} \\
&= \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2} + \bar Y \frac{\sum_i (X_i - \bar X)}{\sum_i (X_i - \bar X)^2} \qquad \text{(dividing by } \textstyle\sum_i (X_i - \bar X)^2\text{)} \\
&= \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2} + \bar Y \frac{0}{\sum_i (X_i - \bar X)^2} \qquad \text{(since } \textstyle\sum_i (X_i - \bar X) = 0\text{)} \\
&= \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2}
\end{aligned}$$
Hence, we have shown that $\tilde\beta_2 = \hat\beta_2$. Therefore, both estimators are equivalent.
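As a quick numerical sanity check of this algebra, the following sketch (with arbitrary simulated data; the coefficients and sample size are made-up illustrative choices) computes both formulas and confirms they agree to floating-point precision.
```python
import numpy as np

# Numerical check that beta2_hat and beta2_tilde coincide (illustrative data;
# the true coefficients and the sample size below are arbitrary).
rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=n)
Y = 1.0 + 2.0 * X + rng.normal(size=n)

dx = X - X.mean()
beta2_hat = np.sum(dx * (Y - Y.mean())) / np.sum(dx ** 2)
beta2_tilde = np.sum(dx * Y) / np.sum(dx ** 2)

print(beta2_hat, beta2_tilde)            # identical up to rounding error
assert np.isclose(beta2_hat, beta2_tilde)
```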
(b) Show that $\hat\beta_1$ is BLUE (12 marks)
To show that $\hat\beta_1$ is the Best Linear Unbiased Estimator (BLUE), we need to demonstrate
two properties:
1. **Linearity**: $\hat\beta_1$ is a linear function of the observed values $Y_i$. 2. **Unbiasedness**:
$\hat\beta_1$ has an expected value equal to the true parameter $\beta_1$.
Let’s address each property:
1. Linearity: The OLS estimator $\hat\beta_1$ is given by:
$$\hat\beta_1 = \bar Y - \hat\beta_2 \bar X$$
Both $\bar Y$ and $\hat\beta_2$ are linear functions of the observed values $Y_i$ and $X_i$ respectively.
Therefore, $\hat\beta_1$ is also a linear function of the observed values $Y_i$, satisfying the linearity
property.
2. Unbiasedness: We want to show that $E[\hat\beta_1] = \beta_1$.
Given that:
- $\hat\beta_2 = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2}$,
- $\bar Y = \frac{1}{n}\sum_i Y_i$,
- $\bar X = \frac{1}{n}\sum_i X_i$,
we can express $\hat\beta_1$ as:
$$\hat\beta_1 = \frac{1}{n}\sum_i Y_i - \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2} \cdot \frac{1}{n}\sum_i X_i$$
Expanding this expression, we get:
$$\hat\beta_1 = \frac{1}{n}\sum_i Y_i - \frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}$$
Now, let’s take the expected value of $\hat\beta_1$:
$$E[\hat\beta_1] = E\left[\frac{1}{n}\sum_i Y_i - \frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}\right] = \frac{1}{n}\sum_i E[Y_i] - E\left[\frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}\right]$$
Since $E[U_i \mid X_i] = 0$, $E[Y_i] = \beta_1 + \beta_2 X_i$. Therefore:
$$E[\hat\beta_1] = \frac{1}{n}\sum_i (\beta_1 + \beta_2 X_i) - E\left[\frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}\right]$$
Now, consider $E\left[\frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}\right]$:
$$\begin{aligned}
E\left[\frac{\sum_i (X_i - \bar X) Y_i}{\sum_i (X_i - \bar X)^2}\right] &= \frac{1}{\sum_i (X_i - \bar X)^2} \sum_i (X_i - \bar X) E[Y_i] \\
&= \frac{1}{\sum_i (X_i - \bar X)^2} \sum_i (X_i - \bar X)(\beta_1 + \beta_2 X_i) \\
&= \frac{\beta_1 \sum_i (X_i - \bar X) + \beta_2 \sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2} \\
&= \beta_1 \frac{\sum_i (X_i - \bar X)}{\sum_i (X_i - \bar X)^2} + \beta_2 \frac{\sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2} \\
&= \beta_1 \frac{0}{\sum_i (X_i - \bar X)^2} + \beta_2 \frac{\sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2} \\
&= 0 + \beta_2 \frac{\sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2} \\
&= \beta_2 \frac{\sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2}
\end{aligned}$$
Therefore:
$$\begin{aligned}
E[\hat\beta_1] &= \frac{1}{n}\sum_i (\beta_1 + \beta_2 X_i) - \beta_2 \frac{\sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2} \\
&= \frac{1}{n}\sum_i \beta_1 + \frac{\beta_2}{n}\sum_i X_i - \beta_2 \frac{\sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2} \\
&= \beta_1 + \beta_2 \bar X - \beta_2 \frac{\sum_i (X_i - \bar X) X_i}{\sum_i (X_i - \bar X)^2} \\
&= \beta_1
\end{aligned}$$
Hence, $\hat\beta_1$ is unbiased.
Therefore, $\hat\beta_1$ is BLUE.
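To make the unbiasedness claim above concrete, here is a small Monte Carlo sketch with an assumed data-generating process (the true $\beta_1$, $\beta_2$, error distribution, sample size, and number of replications are all arbitrary illustrative choices) that averages the OLS intercept over many simulated samples.
```python
import numpy as np

# Monte Carlo illustration of E[beta1_hat] (assumed DGP; beta1, beta2, the
# error distribution, n, and reps are arbitrary illustrative choices).
rng = np.random.default_rng(3)
beta1, beta2 = 1.5, -0.7
n, reps = 100, 20_000

estimates = np.empty(reps)
for r in range(reps):
    X = rng.normal(size=n)
    U = rng.normal(size=n)                      # E[U | X] = 0, E[U^2 | X] = 1
    Y = beta1 + beta2 * X + U
    dx = X - X.mean()
    b2 = np.sum(dx * (Y - Y.mean())) / np.sum(dx ** 2)
    estimates[r] = Y.mean() - b2 * X.mean()     # beta1_hat = Ybar - beta2_hat * Xbar

print("average beta1_hat:", estimates.mean())   # should be close to beta1
print("true beta1:       ", beta1)
```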
4. Consider the regression model
$$Y_i = \beta X_i + U_i, \qquad E[U_i \mid X_i] = 0, \qquad E[U_i^2 \mid X_i] = \sigma^2$$
and suppose that you have a bivariate random sample of $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$.
(a) Derive the OLS estimator and show that
$$\hat\beta = \beta + \frac{\frac{1}{n}\sum_i X_i U_i}{\frac{1}{n}\sum_i X_i^2}.$$
(10 marks)
To derive the OLS estimator for the regression model $Y_i = \beta X_i + U_i$, let’s minimize
the sum of squared residuals, which is given by:
$$Q(\beta) = \sum_{i=1}^{n} (Y_i - \beta X_i)^2$$
To minimize $Q(\beta)$, we take the derivative with respect to $\beta$ and set it equal to zero:
$$\frac{dQ(\beta)}{d\beta} = -2\sum_{i=1}^{n} X_i (Y_i - \beta X_i) = 0$$
Expanding this expression, we get:
$$\sum_{i=1}^{n} X_i Y_i - \beta \sum_{i=1}^{n} X_i^2 = 0$$
Now, solving for $\beta$:
$$\beta \sum_{i=1}^{n} X_i^2 = \sum_{i=1}^{n} X_i Y_i$$
$$\hat\beta = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}$$
Now, let’s express $\hat\beta$ in terms of $U_i$ to show the desired expression:
$$\begin{aligned}
\hat\beta &= \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2} \\
&= \frac{\sum_{i=1}^{n} X_i (\beta X_i + U_i)}{\sum_{i=1}^{n} X_i^2} \qquad \text{(using the regression model)} \\
&= \frac{\beta \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} X_i U_i}{\sum_{i=1}^{n} X_i^2} \\
&= \beta \frac{\sum_{i=1}^{n} X_i^2}{\sum_{i=1}^{n} X_i^2} + \frac{\sum_{i=1}^{n} X_i U_i}{\sum_{i=1}^{n} X_i^2} \\
&= \beta + \frac{\frac{1}{n}\sum_{i=1}^{n} X_i U_i}{\frac{1}{n}\sum_{i=1}^{n} X_i^2}
\end{aligned}$$
Hence, we have shown that $\hat\beta = \beta + \frac{\frac{1}{n}\sum_{i=1}^{n} X_i U_i}{\frac{1}{n}\sum_{i=1}^{n} X_i^2}$, as desired.
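The following sketch (again with an arbitrary simulated sample; the true $\beta$ and the distributions are illustrative assumptions) checks numerically that the closed-form OLS estimator and the decomposition above give the same number.
```python
import numpy as np

# Numerical check of beta_hat = beta + mean(X*U) / mean(X^2)
# (illustrative data; beta and the sample size are arbitrary assumptions).
rng = np.random.default_rng(4)
n, beta = 300, 2.5
X = rng.normal(size=n)
U = rng.normal(size=n)
Y = beta * X + U

beta_hat = np.sum(X * Y) / np.sum(X ** 2)
decomposition = beta + np.mean(X * U) / np.mean(X ** 2)

print(beta_hat, decomposition)
assert np.isclose(beta_hat, decomposition)
```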
(b) Show that the OLS estimator is consistent (5 marks)
To show that the OLS estimator $\hat\beta$ is consistent, we need to demonstrate that as the
sample size $n$ approaches infinity, $\hat\beta$ converges in probability to the true parameter
$\beta$. In other words, we need to show that:
$$\lim_{n\to\infty} \Pr\left(|\hat\beta - \beta| > \epsilon\right) = 0$$
for any positive value of $\epsilon$.
Recall that the OLS estimator $\hat\beta$ is given by:
$$\hat\beta = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}$$
First, let’s express the numerator and the denominator in terms of the true parameter
$\beta$ and the error term $U_i$:
$$\sum_{i=1}^{n} X_i Y_i = \sum_{i=1}^{n} X_i (\beta X_i + U_i) = \beta \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} X_i U_i$$
Now, let’s rewrite $\hat\beta$ using these expressions:
$$\hat\beta = \frac{\beta \sum_{i=1}^{n} X_i^2 + \sum_{i=1}^{n} X_i U_i}{\sum_{i=1}^{n} X_i^2}$$
Dividing both the numerator and the denominator by $n$, we get:
$$\hat\beta = \beta + \frac{\frac{1}{n}\sum_{i=1}^{n} X_i U_i}{\frac{1}{n}\sum_{i=1}^{n} X_i^2}$$
Since $E[U_i \mid X_i] = 0$ and $E[U_i^2 \mid X_i] = \sigma^2$, by the law of large numbers, $\frac{1}{n}\sum_{i=1}^{n} X_i U_i$
and $\frac{1}{n}\sum_{i=1}^{n} X_i^2$ converge in probability to 0 and $E[X_i^2]$ respectively as $n$ approaches
infinity.
Therefore, $\hat\beta$ converges in probability to $\beta$ as $n$ approaches infinity, which implies
that the OLS estimator $\hat\beta$ is consistent.
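As an illustration of this consistency argument, here is a brief simulation sketch (the data-generating process, the value of $\beta$, and the sample sizes are arbitrary illustrative assumptions) showing how the estimator tightens around $\beta$ as $n$ grows.
```python
import numpy as np

# Illustration of consistency: beta_hat concentrates around beta as n grows
# (assumed DGP; beta, the distributions, and the sample sizes are arbitrary).
rng = np.random.default_rng(5)
beta, reps = 2.5, 1_000

for n in (10, 100, 1_000, 10_000):
    X = rng.normal(size=(reps, n))
    U = rng.normal(size=(reps, n))
    Y = beta * X + U
    beta_hat = np.sum(X * Y, axis=1) / np.sum(X ** 2, axis=1)
    print(f"n={n:>6}: mean={beta_hat.mean():.4f}, std={beta_hat.std():.4f}")
```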
(c) Show that
$$\sqrt{n}(\hat\beta - \beta) = \frac{A}{B}$$
where $A = \sqrt{n}(\bar W - E[W_i])$, $W_i = X_i U_i$ and $B = \frac{1}{n}\sum_i X_i^2$. (5 marks)
To show that
$$\sqrt{n}(\hat\beta - \beta) = \frac{A}{B}$$
where $A = \sqrt{n}(\bar W - E[W_i])$, $W_i = X_i U_i$, and $B = \frac{1}{n}\sum_i X_i^2$, let’s first express $\hat\beta$ in
terms of $W_i$ and $B$:
$$\hat\beta = \beta + \frac{\frac{1}{n}\sum_{i=1}^{n} X_i U_i}{\frac{1}{n}\sum_{i=1}^{n} X_i^2} = \beta + \frac{\frac{1}{n}\sum_{i=1}^{n} W_i}{\frac{1}{n}\sum_{i=1}^{n} X_i^2}$$
Now, let’s express $\bar W$ in terms of $W_i$:
$$\bar W = \frac{1}{n}\sum_{i=1}^{n} W_i$$
Now, we’ll compute $\bar W$ and $E[W_i]$ to find $A$:
$$\bar W = \frac{1}{n}\sum_{i=1}^{n} W_i = \frac{1}{n}\sum_{i=1}^{n} X_i U_i$$
$$E[W_i] = E[X_i U_i]$$
Substituting $E[W_i]$ and $\bar W$ into $A$:
$$A = \sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} X_i U_i - E[X_i U_i]\right)$$
Now, let’s express $B$ in terms of $X_i$ to find $B$:
$$B = \frac{1}{n}\sum_{i=1}^{n} X_i^2$$
Now, we’ll divide $A$ by $B$ to see if it matches $\sqrt{n}(\hat\beta - \beta)$:
$$\frac{A}{B} = \frac{\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} X_i U_i - E[X_i U_i]\right)}{\frac{1}{n}\sum_{i=1}^{n} X_i^2} = \frac{\frac{1}{\sqrt{n}}\left(\sum_{i=1}^{n} X_i U_i - n E[X_i U_i]\right)}{\frac{1}{n}\sum_{i=1}^{n} X_i^2} = \frac{\sum_{i=1}^{n} X_i U_i - n E[X_i U_i]}{\sqrt{n}\sum_{i=1}^{n} X_i^2}$$
Now, comparing with $\sqrt{n}(\hat\beta - \beta)$, we can see that:
$$\sqrt{n}(\hat\beta - \beta) = \frac{\sum_{i=1}^{n} X_i U_i - n E[X_i U_i]}{\sqrt{n}\sum_{i=1}^{n} X_i^2}$$
Thus, we’ve shown that $\sqrt{n}(\hat\beta - \beta) = \frac{A}{B}$, where $A = \sqrt{n}(\bar W - E[W_i])$ and $B = \frac{1}{n}\sum_i X_i^2$.
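A quick numerical sketch of this identity (using the same kind of illustrative simulated data as above, and taking $E[X_i U_i] = 0$ since $E[U_i \mid X_i] = 0$) compares $\sqrt{n}(\hat\beta - \beta)$ with $A/B$ on a single sample.
```python
import numpy as np

# Numerical check that sqrt(n) * (beta_hat - beta) equals A / B
# (illustrative data; beta and n are arbitrary; E[X_i U_i] = 0 is used).
rng = np.random.default_rng(6)
n, beta = 400, 2.5
X = rng.normal(size=n)
U = rng.normal(size=n)
Y = beta * X + U

beta_hat = np.sum(X * Y) / np.sum(X ** 2)
W = X * U
A = np.sqrt(n) * (W.mean() - 0.0)     # E[W_i] = E[X_i U_i] = 0 here
B = np.mean(X ** 2)

print(np.sqrt(n) * (beta_hat - beta), A / B)
assert np.isclose(np.sqrt(n) * (beta_hat - beta), A / B)
```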
(d) A converges in distribution to the random variable Z. What is the distribution of
Z? (10 marks)
Given that $A$ converges in distribution to the random variable $Z$, we need to determine
the distribution of $Z$.
Recall that $A$ is defined as:
$$A = \sqrt{n}(\bar W - E[W_i])$$
where $\bar W$ is the sample mean of $W_i = X_i U_i$, and $E[W_i] = E[X_i U_i]$.
By the central limit theorem (CLT), if $A$ converges in distribution to $Z$, then $Z$
follows a normal distribution as $n$ approaches infinity.
The sample mean $\bar W$ can be expressed as:
$$\bar W = \frac{1}{n}\sum_{i=1}^{n} X_i U_i$$
The expected value of $W_i = X_i U_i$ is $E[X_i U_i] = E[X_i]E[U_i]$. Since $E[U_i] = 0$ (given),
$E[X_i U_i] = 0$.
Therefore, $E[W_i] = 0$, and $E[\bar W] = 0$.
So,
$$A = \sqrt{n}(\bar W - E[W_i]) = \sqrt{n}\,\bar W$$
Now, applying the CLT, since $\bar W$ is the sample mean, as $n$ approaches infinity, $\bar W$
will approach a normal distribution.
Therefore, $Z$ follows a normal distribution with mean 0 and variance equal to the
variance of $\bar W$, which is $\mathrm{Var}(\bar W)$.
Thus, the distribution of $Z$ is a normal distribution with mean 0 and variance
$\mathrm{Var}(\bar W)$.
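For intuition about the behaviour of $A$, the sketch below (same kind of illustrative assumptions as in the earlier sketches: $X_i$ and $U_i$ independent standard normal, with an arbitrary $n$ and number of replications) simulates $A = \sqrt{n}\,\bar W$ across many samples and inspects its mean, spread, and approximate normality.
```python
import numpy as np
from scipy import stats

# Simulate A = sqrt(n) * mean(W_i), W_i = X_i * U_i, across many samples
# (assumed DGP: X_i, U_i independent standard normal; n and reps arbitrary).
rng = np.random.default_rng(7)
n, reps = 1_000, 10_000
X = rng.normal(size=(reps, n))
U = rng.normal(size=(reps, n))
W = X * U

A = np.sqrt(n) * W.mean(axis=1)

print("mean of A:        ", A.mean())    # close to 0
print("variance of A:    ", A.var())     # close to Var(W_i) = 1 under these assumptions
print("normality p-value:", stats.normaltest(A).pvalue)
```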
(e) B converges in probability to the constant c. Find c. (5 marks)
Given that B converges in probability to the constant c, we need to find the value
of c.
Recall that $B$ is defined as:
$$B = \frac{1}{n}\sum_{i=1}^{n} X_i^2$$
To find $c$, we need to find the limit of $B$ as $n$ approaches infinity.
$$\lim_{n\to\infty} B = \lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} X_i^2$$
By the law of large numbers (LLN), as $n$ approaches infinity, the sample mean of
$X_i^2$ converges in probability to its expected value, $E[X_i^2]$.
Therefore, $c = E[X_i^2]$.
Hence, $c$ is the expected value of $X_i^2$.
So, $c = E[X_i^2]$.
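A one-line illustration of this LLN step is sketched below (assuming, for illustration only, $X_i \sim N(0,1)$, so $E[X_i^2] = 1$; the sample sizes are arbitrary).
```python
import numpy as np

# Illustration of B = mean(X_i^2) approaching E[X_i^2] as n grows
# (assumed X_i ~ N(0, 1), so E[X_i^2] = 1; the sample sizes are arbitrary).
rng = np.random.default_rng(8)
for n in (100, 10_000, 1_000_000):
    X = rng.normal(size=n)
    print(f"n={n:>9}: B = {np.mean(X ** 2):.4f}")
```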
(f) Hence find the asymptotic distribution of $\hat\beta$. (10 marks)
To find the asymptotic distribution of $\hat\beta$, we have:
$$\sqrt{n}(\hat\beta - \beta) = \frac{A}{B}$$
where $A = \sqrt{n}(\bar W - E[W_i])$ and $B = \frac{1}{n}\sum_i X_i^2$.
Given that $A$ converges in distribution to a random variable $Z$ and $B$ converges in
probability to a constant $c$, we can use Slutsky’s theorem, which states that if
$X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{p} c$, where $X$ is a random variable and $c$ is a constant, then:
$$X_n Y_n \xrightarrow{d} cX.$$
Therefore, we can write:
$$\sqrt{n}(\hat\beta - \beta) = \frac{A}{B} \xrightarrow{d} \frac{Z}{c}.$$
So, the asymptotic distribution of $\hat\beta$ is the same as the distribution of $\frac{Z}{c}$, where
$Z$ follows a normal distribution with mean 0 and variance $\mathrm{Var}(\bar W)$, and $c$ is the
expected value of $X_i^2$.
Therefore, the asymptotic distribution of $\hat\beta$ is a normal distribution with mean 0
and variance $\frac{\mathrm{Var}(\bar W)}{c}$.
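To visualise what such an asymptotic approximation describes, here is a simulation sketch (same illustrative assumptions as in the earlier sketches; $\beta$, $n$, and the number of replications are arbitrary) of $\sqrt{n}(\hat\beta - \beta)$ across many replications, reporting its mean and variance.
```python
import numpy as np

# Simulate sqrt(n) * (beta_hat - beta) across replications (assumed DGP:
# X_i, U_i independent standard normal; beta, n, and reps are arbitrary).
rng = np.random.default_rng(9)
n, reps, beta = 1_000, 10_000, 2.5

X = rng.normal(size=(reps, n))
U = rng.normal(size=(reps, n))
Y = beta * X + U
beta_hat = np.sum(X * Y, axis=1) / np.sum(X ** 2, axis=1)

T = np.sqrt(n) * (beta_hat - beta)
print("mean:", T.mean())   # close to 0
print("var: ", T.var())    # empirical variance of the normalised estimator
```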
5. Consider the probability function for the discrete random variable X given by
$$\text{Prob}[X = x] = (1-\theta)^{x-1}\theta, \qquad x = 1, 2, 3, \ldots$$
for $0 < \theta < 1$ and a random sample $X_1, X_2, \ldots, X_n$. Note also that the random variable
$X$ satisfies $E[X] = 1/\theta$.
(a) Derive the maximum likelihood estimator of θ (10 marks)
To derive the maximum likelihood estimator (MLE) of θ from the given probability
function for the discrete random variable X, we’ll construct the likelihood function
and find the value of θ that maximizes it.
The likelihood function $L(\theta)$ is the product of the probabilities of the observed
outcomes:
$$L(\theta) = \prod_{i=1}^{n} \text{Prob}[X_i]$$
where $X_i$ is the $i$th observed value in the sample.
Substituting the given probability function, we have:
$$L(\theta) = \prod_{i=1}^{n} (1-\theta)^{X_i - 1}\theta$$
Taking the logarithm of the likelihood function (log-likelihood) simplifies the calculations:
$$\ln L(\theta) = \sum_{i=1}^{n} \left[(X_i - 1)\ln(1-\theta) + \ln(\theta)\right]$$
Now, to find the maximum likelihood estimator $\hat\theta$, we differentiate the log-likelihood
function with respect to $\theta$, set the derivative equal to zero, and solve for $\theta$:
$$\frac{d}{d\theta}\ln L(\theta) = \sum_{i=1}^{n} \left[\frac{X_i - 1}{1-\theta} - \frac{1}{\theta}\right] = 0$$
Multiplying through by $(1-\theta)\theta$ to clear the fractions:
$$\sum_{i=1}^{n} (X_i - 1)\theta - \sum_{i=1}^{n} (1-\theta) = 0$$
Rearranging terms:
$$\sum_{i=1}^{n} X_i\theta - n\theta = \sum_{i=1}^{n} (X_i - 1)$$
Factoring out $\theta$ and solving for $\hat\theta$:
$$\hat\theta = \frac{\sum_{i=1}^{n} (X_i - 1)}{\sum_{i=1}^{n} X_i - n}$$
Therefore, the maximum likelihood estimator of $\theta$ is given by:
$$\hat\theta = \frac{\sum_{i=1}^{n} (X_i - 1)}{\sum_{i=1}^{n} X_i - n}$$
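For readers who want to check this algebra against a direct numerical maximisation, the following sketch (with an arbitrary simulated geometric sample; the true $\theta$ and sample size are illustrative assumptions) evaluates the log-likelihood written above on a grid of $\theta$ values and reports the grid maximiser.
```python
import numpy as np

# Numerically maximise ln L(theta) = sum[(X_i - 1) ln(1 - theta) + ln(theta)]
# on a grid (illustrative sample; theta_true and n are arbitrary assumptions).
rng = np.random.default_rng(10)
theta_true, n = 0.3, 5_000
X = rng.geometric(theta_true, size=n)   # support {1, 2, 3, ...}, matching the question

thetas = np.linspace(0.001, 0.999, 999)
loglik = np.array([np.sum((X - 1) * np.log(1 - t) + np.log(t)) for t in thetas])

print("grid maximiser of the log-likelihood:", thetas[np.argmax(loglik)])
```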
(b) Derive the asymptotic distribution of the maximum likelihood estimator of θ. (10
marks)
To derive the asymptotic distribution of the maximum likelihood estimator (MLE)
of $\theta$, denoted $\hat\theta$, we can use the asymptotic properties of the MLE. Under certain
regularity conditions, the MLE is asymptotically normal with mean equal to the true
parameter value and variance equal to the inverse of the Fisher information.
Given the MLE $\hat\theta$ derived previously:
$$\hat\theta = \frac{\sum_{i=1}^{n} (X_i - 1)}{\sum_{i=1}^{n} X_i - n}$$
We can rewrite it as:
$$\hat\theta = \frac{\sum_{i=1}^{n} X_i - n}{\sum_{i=1}^{n} X_i - n} - \frac{n}{\sum_{i=1}^{n} X_i - n}$$
This simplifies to:
$$\hat\theta = 1 - \frac{n}{\sum_{i=1}^{n} X_i}$$
Now, let’s denote $\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i$, the sample mean. Then we have:
$$\hat\theta = 1 - \frac{1}{\bar X_n}$$
To find the asymptotic distribution of $\hat\theta$, we need to consider the asymptotic behavior
of $\bar X_n$ as $n$ approaches infinity. By the law of large numbers (LLN), $\bar X_n$ converges in
probability to the true expected value of $X$, which is $1/\theta$.
Therefore, as $n$ approaches infinity, $\hat\theta$ converges in probability to:
$$\hat\theta \xrightarrow{p} 1 - \frac{1}{1/\theta} = 1 - \frac{\theta}{1} = 1 - \theta$$
Hence, the asymptotic distribution of $\hat\theta$ is a point mass at $1 - \theta$.
In summary, the asymptotic distribution of the maximum likelihood estimator $\hat\theta$ of
$\theta$ is a point mass at $1 - \theta$.
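As a final illustrative sketch, the code below simulates the sampling distribution of the quantity $1 - 1/\bar X_n$ that the answer above arrives at, across many geometric samples (the true $\theta$, the sample size, and the number of replications are arbitrary assumptions), so its large-sample behaviour can be inspected directly.
```python
import numpy as np

# Sampling distribution of 1 - 1 / Xbar_n across replications (illustrative;
# theta_true, n, and reps are arbitrary assumptions).
rng = np.random.default_rng(11)
theta_true, n, reps = 0.3, 1_000, 10_000

X = rng.geometric(theta_true, size=(reps, n))
est = 1.0 - 1.0 / X.mean(axis=1)

print("mean of the estimates:", est.mean())
print("standard deviation:   ", est.std())
```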