
Econometrics: L3
OLS Asymptotics
Sung Y. Park
Chung-Ang Univ.
Motivation
◮ So far we have considered the small-sample (finite-sample, or
exact) properties of the OLS estimator.
◮ The Gauss-Markov theorem is a finite-sample property: it holds for
any sample size n.
◮ It is also very useful to study the asymptotic, or large-sample,
properties.
◮ For example, we know that the t-stat does not follow Student's t
distribution when A6 (normality of the error term) is violated.
◮ However, we can show that the t and F statistics approximately
follow t and F distributions in large samples.
Consistency
◮ Consistency is a minimal requirement for an estimator.
◮ Consider the OLS estimator βˆn.
◮ For each sample size n, βˆn has a probability distribution.
◮ βˆn is a consistent estimator ⇒ as the sample size n increases,
βˆn becomes more and more tightly distributed around β. As
n → ∞, the distribution of βˆn degenerates to the single
point β:

plim βˆn = β, or βˆn →p β.

This means Prob(|βˆn − β| > ε) → 0 as n → ∞ for every
ε > 0.
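This definition can be checked by simulation. A minimal sketch (all numbers here, β0 = 1, β1 = 2, ε = 0.1, are assumed for illustration): estimate Prob(|βˆ1 − β1| > ε) by Monte Carlo and watch it shrink toward zero as n grows.

```r
# Estimate Prob(|beta1_hat - beta1| > eps) by simulation for the
# simple regression y = b0 + b1*x + u (assumed illustrative values).
set.seed(1)
b0 <- 1; b1 <- 2; eps <- 0.1; reps <- 500

tail_prob <- function(n) {
  dev <- replicate(reps, {
    x <- rnorm(n); u <- rnorm(n)
    y <- b0 + b1 * x + u
    abs(coef(lm(y ~ x))[["x"]] - b1)   # |beta1_hat - beta1|
  })
  mean(dev > eps)   # fraction of estimates farther than eps from b1
}

sapply(c(25, 100, 400, 1600), tail_prob)   # shrinks toward 0
```

The tail probability falls steadily with n, exactly as the definition of consistency requires.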
Consistency
Theorem
Under A1-A4, the OLS estimator βˆj is consistent for βj, for all
j = 1, 2, · · · , k.
Sketch of proof: Consider the slope estimator βˆ1 in the simple
regression model yi = β0 + β1xi1 + ui:

βˆ1 = (Σ_{i=1}^n (xi1 − x̄1)yi) / (Σ_{i=1}^n (xi1 − x̄1)²)
    = β1 + (n⁻¹ Σ_{i=1}^n (xi1 − x̄1)ui) / (n⁻¹ Σ_{i=1}^n (xi1 − x̄1)²)
Consistency
By the Law of Large Numbers, the numerator and denominator converge
in probability to Cov(x1, u) and Var(x1), respectively. Then
plim βˆ1 = β1 + Cov(x1, u)/Var(x1)
= β1 (why?)
• Note that A4 (E(u|x1) = 0) implies that x1 and u are
uncorrelated, so Cov(x1, u) = 0.
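The two sample averages in the proof can be watched converging directly. A minimal sketch, assuming independent x1 ~ N(0, 1) and u ~ N(0, 1), so that Cov(x1, u) = 0 and Var(x1) = 1:

```r
# The two averages from the proof, evaluated at increasing n.
conv <- function(n) {
  x1 <- rnorm(n); u <- rnorm(n)
  c(num = mean((x1 - mean(x1)) * u),   # -> Cov(x1, u) = 0
    den = mean((x1 - mean(x1))^2))     # -> Var(x1) = 1
}
set.seed(1)
t(sapply(c(100, 10000, 1000000), conv))
```

As n grows, the numerator average heads to 0 and the denominator average to 1, so the slope estimator heads to β1.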
Consistency
[Figure: 3×3 grid of histograms of Sample.Mean (Frequency on the
vertical axis, Sample.Mean from −1 to 1 on the horizontal axis) for
n = 10, 50, 100, 300, 500, 1000, 1500, 2000, 5000; the sample means
concentrate ever more tightly around 0 as n grows.]
Consistency
# Simulate the Law of Large Numbers: draw b samples of size n from
# N(mu, sigma^2) and return the b sample means.
LLN <- function(mu, sigma, n, b) {
  data <- matrix(rnorm(n * b, mean = mu, sd = sigma), ncol = b)
  colMeans(data)
}

# Here we make a graph: histograms of the sample means for increasing n
mu <- 0
sigma <- 1
nseq <- c(10, 50, 100, 300, 500, 1000, 1500, 2000, 5000)
par(mfrow = c(3, 3))
for (i in 1:9) {
  Sample.Mean <- LLN(mu, sigma, nseq[i], 500)
  hist(Sample.Mean, xlim = c(-1, 1), main = paste("Histogram: n =", nseq[i]))
}
Inconsistency in OLS
Consider the omitted variable problem. The true model,
y = β0 + β1x1 + β2x2 + v,
satisfies A1-A4. If we omit x2 and run the simple regression of y on
x1, the error of that regression is u = β2x2 + v. Then
plim β˜1 = β1 + β2δ1,
where
δ1 = Cov(x1, x2)/Var(x1).
If x1 and x2 are uncorrelated ⇒ δ1 = 0 ⇒ β˜1 is consistent (though
not necessarily unbiased). (why?)
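The formula plim β˜1 = β1 + β2δ1 can be verified numerically. A sketch with assumed values (β1 = 1, β2 = 0.5, and x2 built so that Cov(x1, x2) = 0.8):

```r
# Omitted-variable inconsistency: the short-regression slope converges
# to beta1 + beta2 * Cov(x1, x2) / Var(x1), not to beta1.
set.seed(1)
n  <- 100000
b1 <- 1; b2 <- 0.5
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n)       # Cov(x1, x2) = 0.8, Var(x1) = 1
v  <- rnorm(n)
y  <- b1 * x1 + b2 * x2 + v

b1_tilde <- coef(lm(y ~ x1))[["x1"]]   # short regression of y on x1 only
delta1   <- cov(x1, x2) / var(x1)
c(b1_tilde, b1 + b2 * delta1)          # the two numbers agree closely
```

With these assumed numbers the probability limit is β1 + β2·0.8, well away from β1.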
Inconsistency in OLS
Consider the following case:
y = β0 + β1x1 + β2x2 + u,
where x2 and u are uncorrelated but x1 and u are correlated. Which
estimator is then inconsistent?
• In general, both OLS estimators, βˆ1 and βˆ2, are inconsistent.
• If x1 and x2 are uncorrelated, however, then βˆ2 is consistent.
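A quick simulation illustrates both claims. The design below is an assumed one: x1 is built to be correlated with u, while x2 is correlated with x1 but not with u; even with a very large n, neither slope estimate settles at its true value of 1.

```r
# One endogenous regressor contaminates both slope estimates when the
# regressors are correlated with each other (assumed design).
set.seed(1)
n  <- 100000
u  <- rnorm(n)
w  <- rnorm(n)
x1 <- 0.5 * u + w              # x1 correlated with u (endogenous)
x2 <- 0.5 * w + rnorm(n)       # x2 correlated with x1, not with u
y  <- 1 * x1 + 1 * x2 + u      # true slopes are both 1

round(coef(lm(y ~ x1 + x2)), 2)   # both slopes sit away from 1
```

If instead x2 were built independently of w (so Cov(x1, x2) = 0), the estimate of β2 would return to 1 while β̂1 would remain inconsistent.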
Asymptotic normality
◮ We need to know the sampling distribution of the OLS estimator
for statistical inference.
◮ Exact normality of the OLS estimator ⇐ normality of the
distribution of the error u.
◮ What if u is not normally distributed? ⇒ the t-stat and F-stat
no longer follow the Student's t and F distributions.
◮ So if u is non-normal, can we not perform statistical tests? ⇒
We can, provided the sample is large enough.
◮ As the sample size n goes to infinity, the distribution of the
suitably standardized OLS estimator converges to a normal
distribution (asymptotic normality).
Asymptotic Normality
Theorem
Under A1-A5,
1. √n(βˆj − βj) ∼a N(0, σ²/a²j), where σ²/a²j > 0 is the
asymptotic variance of √n(βˆj − βj); for the slope coefficients,
a²j = plim(n⁻¹ Σ_{i=1}^n rˆ²ij), where the rˆij are the residuals
from regressing xj on the other independent variables.
2. σˆ² is a consistent estimator of σ² = Var(u).
3. For each j,
(βˆj − βj)/se(βˆj) ∼a N(0, 1)
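Part 3 of the theorem can be illustrated with deliberately non-normal errors. A sketch with an assumed design (centered χ²(2) errors, n = 500): across replications, the standardized slope behaves approximately like a N(0, 1) draw.

```r
# Standardized OLS slope under skewed, non-normal errors: its
# distribution across replications is still approximately N(0, 1).
set.seed(1)
n <- 500; b1 <- 2; reps <- 2000
tstat <- replicate(reps, {
  x <- rnorm(n)
  u <- rchisq(n, df = 2) - 2          # mean-zero but strongly skewed
  y <- 1 + b1 * x + u
  fit <- summary(lm(y ~ x))$coefficients
  (fit["x", "Estimate"] - b1) / fit["x", "Std. Error"]
})
c(mean(tstat), sd(tstat))             # close to 0 and 1
```

A histogram or QQ-plot of `tstat` against the standard normal makes the approximation visible.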
Asymptotic normality
[Figure: 3×3 grid of histograms of the standardized sample mean
(Frequency vs. Sample.Mean) of chi-square data for
n = 10, 50, 100, 300, 500, 1000, 1500, 2000, 5000; the distribution is
visibly skewed for small n and approaches the standard normal as n
grows.]
Asymptotic normality
# Simulate the Central Limit Theorem: draw b samples of size n from a
# chi-square(r) distribution (mean r, variance 2r) and return the b
# standardized sample means sqrt(n) * (xbar - r) / sqrt(2r).
CLT <- function(r, n, b) {
  data <- matrix(rchisq(n * b, df = r), ncol = b)
  sqrt(n) * (colMeans(data) - r) / sqrt(2 * r)
}

# Here we make a graph: histograms of the standardized means for increasing n
r <- 2
nseq <- c(10, 50, 100, 300, 500, 1000, 1500, 2000, 5000)
par(mfrow = c(3, 3))
for (i in 1:9) {
  Sample.Mean <- CLT(r, nseq[i], 500)
  hist(Sample.Mean, main = paste("Histogram: n =", nseq[i]))
}
Asymptotic normality
◮ Note that the normality assumption A6 has been dropped.
◮ One may wonder why (3) of the theorem above differs from the
earlier (exact) tn−k−1 result. The difference is asymptotically
irrelevant: since tn−k−1 converges to N(0, 1) as n → ∞, one can
also write (why?)
(βˆj − βj)/se(βˆj) ∼a tn−k−1
◮ Note that when the sample size is not large enough, the t
distribution can be a poor approximation to the distribution of
the t-stat if u is non-normal.
◮ How large, then, should the sample be?
◮ Note that the theorem above requires homoskedasticity (A5);
without homoskedasticity the theorem does not hold.
Asymptotic Efficiency
Theorem
Under A1-A5, let β˜j be estimators that solve equations of the form

Σ_{i=1}^n gj(xi)(yi − β˜0 − β˜1xi1 − β˜2xi2 − · · · − β˜kxik) = 0,  j = 0, 1, · · · , k,

where gj(xi) is any function of xi, and let βˆj be the OLS
estimators. Then for j = 0, 1, 2, · · · , k, the OLS estimators have
the smallest asymptotic variances, i.e.,

Avar √n(βˆj − βj) ≤ Avar √n(β˜j − βj).
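The theorem can be illustrated by comparing OLS with one particular alternative from this class. A sketch with an assumed choice g(x) = x³, which yields an IV-type slope estimator Cov(x³, y)/Cov(x³, x): across replications, the OLS slope has the smaller sampling standard deviation.

```r
# Compare the OLS slope with a method-of-moments alternative that
# solves sum(x_i^3 * residual_i) = 0 (assumed choice g(x) = x^3).
set.seed(1)
n <- 200; reps <- 2000; b1 <- 1
est <- replicate(reps, {
  x <- rnorm(n); u <- rnorm(n)
  y <- b1 * x + u
  c(ols = cov(x, y) / var(x),            # solves sum(x_i * resid_i) = 0
    alt = cov(x^3, y) / cov(x^3, x))     # solves sum(x_i^3 * resid_i) = 0
})
apply(est, 1, sd)   # the OLS standard deviation is the smaller one
```

Both estimators are consistent for β1 here, but the alternative pays a variance penalty, in line with the theorem.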