ECONOMETRICS
LECTURE 5A
Trang Le
LECTURE SLIDES TOPICS
OLS asymptotics
Data scaling
More on functional form
Average Partial Effects
Bad Controls
Key references
Wooldridge Chs 5.1-5.2 (excluding 5.2a), Ch 6.1 and 6.2
OLS PROPERTIES - OVERVIEW
So far properties derived for OLS hold for any sample
size – finite sample properties
Expected values/unbiasedness under MLR.1 – MLR.4
Variance formulas under MLR.1 – MLR.5
Gauss-Markov Theorem under MLR.1 – MLR.5
Exact sampling distributions/tests under MLR.1 – MLR.6
Properties of OLS that hold in large samples –
asymptotic properties
Consistency under MLR.1 – MLR.4
Asymptotic normality/tests under MLR.1 – MLR.5
MLR.6 no longer required
WHAT ARE ASYMPTOTIC PROPERTIES?
Recall the population regression line (imagine that the plot contains
infinitely many data points)
[Figure: population regression line, $\beta_1 = -1.5$]
WHAT ARE ASYMPTOTIC PROPERTIES?
With very few observations an "unlucky" draw can lead
to very poor estimates
WHAT ARE ASYMPTOTIC PROPERTIES?
As the sample gets bigger, even an unlucky draw is unlikely to leave us
far away from the true parameters
WHAT ARE ASYMPTOTIC PROPERTIES?
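In statistical terms this means that as the sample gets
bigger the variance of the OLS estimates gets smaller
$\mathrm{Var}(\hat\beta_1)_{n=3} > \mathrm{Var}(\hat\beta_1)_{n=6}$
We can see it mathematically:
$\mathrm{Var}(\hat\beta_1) = \dfrac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
The noise ($\sigma^2$) remains the same but the signal (the total variation in $x$) increases with $n$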
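The variance formula can be checked numerically. The following is a minimal sketch, not part of the original slides: the data-generating values (intercept 1.0, slope -1.5, error standard deviation 2.0, uniformly drawn regressor) and the helper name `simulate_slope_variance` are assumptions chosen only for illustration. It holds the regressor values fixed, redraws the errors many times, and compares the simulated variance of the slope with $\sigma^2 / \sum (x_i - \bar{x})^2$.

```python
import numpy as np

def simulate_slope_variance(n, reps=4000, beta0=1.0, beta1=-1.5, sigma=2.0, seed=0):
    """Return (simulated variance of the OLS slope, sigma^2 / SST_x) for sample size n."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n)      # regressor values, held fixed across replications
    xd = x - x.mean()                       # deviations from the sample mean
    sst_x = (xd ** 2).sum()                 # total variation in x (the "signal")
    u = rng.normal(scale=sigma, size=(reps, n))   # homoskedastic errors (the "noise")
    y = beta0 + beta1 * x + u               # one simulated sample per row
    slopes = (y * xd).sum(axis=1) / sst_x   # OLS slope in each replication
    return slopes.var(), sigma ** 2 / sst_x

results = {n: simulate_slope_variance(n) for n in (10, 100, 1000)}
for n, (simulated, theoretical) in results.items():
    print(f"n={n:5d}  simulated Var={simulated:.5f}  formula={theoretical:.5f}")
```

In this sketch $\sigma^2$ is the same for every $n$; only the denominator grows, which is why the variance falls as observations are added.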
CONSISTENCY
Informal definition:
If we keep increasing the sample size, can we get to the point
where $\mathrm{Var}(\hat\beta_1)_{n=\infty} = 0$, so that $\hat\beta_1$ is no longer a random
variable but is just equal to $\beta_1$?
Formal definition: $\Pr(|\hat\theta_n - \theta| < \varepsilon) \to 1$ for every $\varepsilon > 0$ as $n \to \infty$
Alternative shorthand notation: $\mathrm{plim}\,\hat\theta_n = \theta$
Estimator converges in probability to the true value
Interpretation:
Consistency means that the probability that the estimate is arbitrarily close to
the true population value can be made arbitrarily high by
increasing the sample size
CONSISTENCY...
CONSISTENCY OF OLS
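Theorem 5.1 (Consistency of OLS)
MLR.1 – MLR.4 $\Rightarrow \mathrm{plim}\,\hat\beta_j = \beta_j,\quad j = 0, 1, \dots, k$
Special case of simple regression model (can show)
$\mathrm{plim}\,\hat\beta_1 = \beta_1 + \dfrac{\mathrm{Cov}(x_1, u)}{\mathrm{Var}(x_1)}$
$\mathrm{Cov}(x_1, u)$ & $\mathrm{Var}(x_1)$ are population quantities
As $E(u \mid x_1) = 0 \Rightarrow \mathrm{Cov}(x_1, u) = 0 \Rightarrow \hat\beta_1$ is a consistent
estimator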
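The "can show" step for the simple regression case can be sketched as follows. This is a standard argument written out under the assumptions that $y_i = \beta_0 + \beta_1 x_{i1} + u_i$, that $\mathrm{Var}(x_1) > 0$, and that the law of large numbers applies to the sample averages; the subscript notation $x_{i1}$ is introduced here for illustration.

```latex
\begin{align*}
\hat\beta_1
  &= \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)\, y_i}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}
   = \beta_1 + \frac{n^{-1}\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)\, u_i}{n^{-1}\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2} \\
\operatorname{plim}\, \hat\beta_1
  &= \beta_1 + \frac{\mathrm{Cov}(x_1, u)}{\mathrm{Var}(x_1)}
\end{align*}
```

The second line follows because the sample average in the numerator converges in probability to $\mathrm{Cov}(x_1, u)$ and the one in the denominator converges to $\mathrm{Var}(x_1)$.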
ASYMPTOTIC NORMALITY
In practice, normality assumption MLR.6 often
questionable
If MLR.6 does not hold results of t or F tests may be wrong
Fortunately OLS estimators are approximately normally distributed in large samples
even without MLR.6
Usual t or F tests still work approximately if sample size is
large enough
Theorem 5.2 (Asymptotic normality of OLS)
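Under MLR.1-MLR.5
$\dfrac{\hat\beta_j - \beta_j}{\mathrm{se}(\hat\beta_j)} \overset{a}{\sim} N(0, 1)$
$\mathrm{plim}\,\hat\sigma^2 = \sigma^2$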
ASYMPTOTIC NORMALITY...
Practical consequences
Construct t-statistics as before
In large samples t-tests are valid irrespective of whether
MLR.6 holds or not
Even when MLR.6 holds t-distribution is close to N(0,1)
distribution in large samples
Similarly proceed as before with confidence intervals & F-tests
Note: MLR.1 – MLR.5 are still necessary
Especially homoskedasticity
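A small simulation can illustrate these practical consequences. This is a sketch under assumed settings that are not from the slides: skewed (centred exponential) errors that violate MLR.6 but are homoskedastic, a true slope of 0.5, and a hypothetical helper `rejection_rate`. It records how often the usual t-test rejects a true null at the 5% level when the sample is large.

```python
import numpy as np

def rejection_rate(n, reps=4000, beta1=0.5, seed=1):
    """Share of replications in which |t| > 1.96 when testing the true slope value."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(reps, n))
    u = rng.exponential(1.0, size=(reps, n)) - 1.0   # mean zero, skewed: MLR.6 fails
    y = 1.0 + beta1 * x + u
    xd = x - x.mean(axis=1, keepdims=True)
    sst_x = (xd ** 2).sum(axis=1)
    b1 = (xd * y).sum(axis=1) / sst_x                # OLS slope per replication
    b0 = y.mean(axis=1) - b1 * x.mean(axis=1)        # OLS intercept per replication
    resid = y - b0[:, None] - b1[:, None] * x
    sigma2_hat = (resid ** 2).sum(axis=1) / (n - 2)  # usual error variance estimator
    se_b1 = np.sqrt(sigma2_hat / sst_x)              # usual (homoskedastic) standard error
    t_stat = (b1 - beta1) / se_b1
    return np.mean(np.abs(t_stat) > 1.96)

rate_large = rejection_rate(500)
print(f"rejection rate with n=500: {rate_large:.3f}")
```

If the asymptotic approximation works, the printed rate should be close to the nominal 5% even though the errors are far from normal.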
MODELLING INFANT BIRTHWEIGHT – SMALL SAMPLE
Relate birthweight to cigarette smoking & family
income
$\widehat{bwght} = \underset{(1.05)}{116.97} - \underset{(.092)}{.463}\,cigs + \underset{(.029)}{.093}\,faminc$
$n = 1{,}388,\quad R^2 = .0298$
Does the distribution of $bwght$ look normal?
Does it matter for inference here?
What happens if we use only the first 694 observations?
�= 116.65(1.5)− .58.39 + .09.02
= 69,2 = .039
[Figure: density histogram of birth weight in ounces; horizontal axis from 0 to 300, density axis from 0 to .02]
MODELLING INFANT BIRTHWEIGHT - UNITS
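What happens if birthweight is converted from ounces to
grams? (Hint: 1 oz ≈ 28 g)
$\widehat{bwght} = \underset{(1.05)}{116.97} - \underset{(.092)}{.463}\,cigs + \underset{(.029)}{.093}\,faminc$
$n = 1{,}388,\quad R^2 = .0298$
$t_{cigs} = -5.06$
Compared with (birthweight in grams)
$\widehat{bwght} = \underset{(29.4)}{3275.3} - \underset{(2.56)}{12.98}\,cigs + \underset{(.82)}{2.60}\,faminc$
$n = 1{,}388,\quad R^2 = .0298$
$t_{cigs} = -5.06$
What if the change of units occurred when we had specified $\log(bwght) = \beta_0 + \beta_1 cigs + \beta_2 faminc + u$?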
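The effect of rescaling the dependent variable can be sketched on simulated data. Everything below is an assumption for illustration: the data are made up (they are not the birthweight data set), and `ols` is a small hypothetical helper using the usual homoskedastic standard error formula. The sketch multiplies the dependent variable by 28 and compares the level and log specifications.

```python
import numpy as np

def ols(X, y):
    """OLS with an intercept; returns coefficients and usual standard errors."""
    X = np.column_stack([np.ones(len(y)), X])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return b, np.sqrt(sigma2 * np.diag(XtX_inv))

# Made-up data loosely shaped like the birthweight example (assumed values).
rng = np.random.default_rng(0)
n = 1000
cigs = rng.poisson(2.0, size=n).astype(float)
faminc = rng.uniform(5.0, 65.0, size=n)
bwght_oz = 117.0 - 0.5 * cigs + 0.1 * faminc + rng.normal(scale=15.0, size=n)
X = np.column_stack([cigs, faminc])

b_oz, se_oz = ols(X, bwght_oz)             # dependent variable in ounces
b_g, se_g = ols(X, 28.0 * bwght_oz)        # dependent variable in grams
b_log_oz, _ = ols(X, np.log(bwght_oz))     # log of ounces
b_log_g, _ = ols(X, np.log(28.0 * bwght_oz))  # log of grams

print("t-statistics (ounces):", b_oz / se_oz)
print("t-statistics (grams): ", b_g / se_g)
print("log-model slopes (ounces, grams):", b_log_oz[1:], b_log_g[1:])
```

In the level model every coefficient and standard error is multiplied by 28, so t-statistics are unchanged. In the log model only the intercept shifts (by $\log 28$), since $\log(28 \cdot bwght) = \log 28 + \log(bwght)$.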
MORE ON FUNCTIONAL FORM: LOGS
Convenient percentage/elasticity interpretation
Slope coefficients of logged variables are invariant to
rescalings
Taking logs often useful
Eliminates/mitigates problems with outliers
Helps to secure normality & homoskedasticity
Variables that should not be logged
Those measured in units such as years or percentage points
Those that take on zero or negative values
MORE ON FUNCTIONAL FORM: QUADRATICS
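Consider wage equation
$\widehat{wage} = \underset{(.35)}{3.73} + \underset{(.041)}{.298}\,exper - \underset{(.0009)}{.0061}\,exper^2$
$n = 526,\quad R^2 = .093$
Marginal effect of experience
$\dfrac{\Delta wage}{\Delta exper} \approx .298 - 2(.0061)\,exper$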
Marginal effect depends on level of experience
It makes no sense to evaluate each of the estimates
separately - what would be the ceteris paribus assumption?
MORE ON FUNCTIONAL FORM: QUADRATICS...
Specification allows either convex
(U shaped) or concave (inverted U)
relationship
Estimates indicate concavity
Wage maximum (turning point) wrt
experience
$exper^{*} = \left|\dfrac{\hat\beta_1}{2\hat\beta_2}\right| = \dfrac{.298}{2(.0061)} \approx 24.4$
Does this mean return to
experience becomes negative
after 24.4 years?
Reasonable approximation or
indication of mis-specification?
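The turning-point arithmetic can be reproduced directly from the reported estimates (.298 on experience and -.0061 on experience squared). The function and variable names below are illustrative only.

```python
# Reported estimates from the quadratic wage equation.
b_exper = 0.298
b_exper_sq = -0.0061

def marginal_effect(exper):
    """Approximate change in wage from one more year, at a given level of experience."""
    return b_exper + 2 * b_exper_sq * exper

# Experience level at which the estimated marginal effect is zero.
turning_point = -b_exper / (2 * b_exper_sq)

print(f"marginal effect at 1 year:   {marginal_effect(1):.3f}")
print(f"marginal effect at 10 years: {marginal_effect(10):.3f}")
print(f"turning point:               {turning_point:.1f} years")
```

Evaluating `marginal_effect` at several experience levels shows how the estimated return declines and changes sign at the turning point.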
MORE ON FUNCTIONAL FORM: INTERACTIONS
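Consider the following housing price model
$price = \beta_0 + \beta_1\, sqrft + \beta_2\, bdrms + \beta_3\, sqrft \cdot bdrms + \beta_4\, bthrms + u$
$\Rightarrow \dfrac{\Delta price}{\Delta bdrms} = \beta_2 + \beta_3\, sqrft$
Similar to quadratics, interaction terms complicate
interpretation of parameters
Effect of number of bedrooms depends on level of square
footage
$\beta_2$ represents effect of $bdrms$ when $sqrft = 0$
$\beta_1$ represents effect of $sqrft$ when $bdrms = 0$
$\beta_3$ is the extra effect of $bdrms$ on price for each added square foot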
MORE ON FUNCTIONAL FORM: INTERACTIONS
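Similarly you can say that: $\dfrac{\Delta price}{\Delta sqrft} = \beta_1 + \beta_3\, bdrms$
Example: House with 2 bedrooms
What is the effect of increasing the size of the house by 1 square foot?
$\dfrac{\Delta price}{\Delta sqrft} = \beta_1 + 2\beta_3$
Example: House with 3 bedrooms
What is the effect of increasing the size of the house by 1 square foot?
$\dfrac{\Delta price}{\Delta sqrft} = \beta_1 + 3\beta_3$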
AVERAGE PARTIAL EFFECTS
In many models size of the effect depends on values of
one or more explanatory variables
Average partial effect (APE) provides a summary measure
APE averages the partial effect over the sample; in the models here this equals the effect calculated at the average regressor values
A reporting issue so no right or wrong approach
Could also report partial effect at a particular value
Most importantly be clear in what you’re reporting
AVERAGE PARTIAL EFFECTS...
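Recall interaction example
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u$
Is the effect of $x_2$ on $y$ when $x_1 = 0$ (i.e. $\beta_2$) of interest?
Reparametrization may be useful in order to directly
estimate interesting partial effects
$y = \alpha_0 + \delta_1 x_1 + \delta_2 x_2 + \beta_3 (x_1 - \mu_1)(x_2 - \mu_2) + u$
$\Rightarrow \dfrac{\Delta y}{\Delta x_2} = \delta_2 + \beta_3 (x_1 - \mu_1)$
Now $\delta_2$ is the effect of $x_2$ on $y$ when $x_1 = \mu_1$ & we choose $\mu_1$
$\mu_1$ could be the sample mean of $x_1$ $\Rightarrow \delta_2$ is the APE for this example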
AVERAGE PARTIAL EFFECTS...
Advantages of reparametrization
Easy interpretation of parameters
Choice of $\mu_1$ at discretion of researcher – could be any
interesting value
Before & after reparametrization are not different
models
Simply isolating different parameter combinations to estimate
From previous example $\delta_2 = \beta_2 + \beta_3 \mu_1$
Easy to estimate using original model but calculation of $\mathrm{se}(\hat\delta_2)$ less immediate
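The equivalence of the two parametrizations can be checked on simulated data. This is a sketch with made-up data and a hypothetical `ols` helper (usual homoskedastic covariance matrix); the centring values are the sample means and are treated as fixed constants. It fits the original interaction model and the centred version, then compares the coefficient on $x_2$ and its standard error.

```python
import numpy as np

def ols(X, y):
    """OLS with an intercept; returns coefficients and their covariance matrix."""
    X = np.column_stack([np.ones(len(y)), X])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return b, sigma2 * XtX_inv

# Made-up data with an interaction effect (assumed parameter values).
rng = np.random.default_rng(42)
n = 500
x1 = rng.normal(5.0, 2.0, size=n)
x2 = rng.normal(3.0, 1.0, size=n)
y = 1.0 + 0.5 * x1 + 2.0 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)
m1, m2 = x1.mean(), x2.mean()   # centring values, treated as fixed constants

# Original model: y on x1, x2, x1*x2.
b, V = ols(np.column_stack([x1, x2, x1 * x2]), y)
# Reparametrized model: y on x1, x2, (x1 - m1)*(x2 - m2).
d, V_d = ols(np.column_stack([x1, x2, (x1 - m1) * (x2 - m2)]), y)

# Effect of x2 at x1 = m1, built from the original estimates.
a = np.array([0.0, 0.0, 1.0, m1])          # weights on (b0, b1, b2, b3)
effect_from_original = a @ b               # b2 + b3 * m1
se_from_original = np.sqrt(a @ V @ a)      # needs Var(b2), Var(b3) and Cov(b2, b3)

# The same quantities read directly off the reparametrized regression.
effect_direct = d[2]
se_direct = np.sqrt(V_d[2, 2])

print(effect_from_original, effect_direct)
print(se_from_original, se_direct)
```

The point of the reparametrization is visible in the last lines: the centred regression reports the standard error directly, while the original model needs the covariance between two estimates.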
OVERCONTROLLING AND BAD CONTROLS
One way of dealing with omitted variable bias is to
control for many observables.
But it is possible to control for too many variables
Example: Let our objective be to find what is the effect of
beer taxes on traffic fatalities
Two important details:
1. This question is about causality (not goodness of fit)
2. We hypothesize that the causal links are:
beer tax → beer consumption → traffic fatalities
OVERCONTROLLING AND BAD CONTROLS
Data on different cities in Australia (with different tax
rates on beer)
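Two econometric models:
$fatalities = \beta_0 + \beta_1\, beertax + u$
$fatalities = \alpha_0 + \alpha_1\, beertax + \alpha_2\, beercons + v$
Let us say that our hypothesis is true (and assumptions MLR.1 – MLR.4 hold). Then:
1. $\beta_1 < 0$ (the effect of the tax)
2. But $\alpha_1 = 0$ (and $\alpha_2 > 0$), because the policy has an
effect only through beer consumption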
OVERCONTROLLING AND BAD CONTROLS
We call beer consumption a bad control
Given that our objective is to find the effect of the policy
we shouldn't control for beer consumption (don't control for
mechanisms)
We can control for beer consumption if instead our objective is to
predict traffic fatalities
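The bad-control problem can be illustrated with a simulation in which the tax affects fatalities only through consumption. All numbers and names below are assumptions for illustration (the variables `tax`, `beer`, `fatal` and the helper `ols_coefs` are hypothetical, and the data are simulated, not real city data).

```python
import numpy as np

def ols_coefs(y, *regressors):
    """OLS coefficients (intercept first) from a least-squares fit."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(7)
n = 5000
tax = rng.uniform(0.0, 1.0, size=n)
beer = 10.0 - 4.0 * tax + rng.normal(size=n)    # the tax lowers beer consumption
fatal = 2.0 + 0.5 * beer + rng.normal(size=n)   # consumption raises fatalities; no direct tax effect

short = ols_coefs(fatal, tax)         # fatalities on tax only
long = ols_coefs(fatal, tax, beer)    # fatalities on tax and beer consumption

print(f"tax coefficient without the control: {short[1]:.2f}")
print(f"tax coefficient with the control:    {long[1]:.2f}")
print(f"beer consumption coefficient:        {long[2]:.2f}")
```

In this design the total causal effect of the tax is $-4 \times 0.5 = -2$. The regression without the mediator recovers it, while adding the mediator drives the tax coefficient towards zero.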
PREDICTION
Generating predictions from multiple regression models is
straightforward
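In general case where
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$
And want to predict $y$ for $x_1 = c_1, \dots, x_k = c_k$ then
$\theta_0 = E(y \mid x_1 = c_1, \dots, x_k = c_k) = \beta_0 + \beta_1 c_1 + \dots + \beta_k c_k$
$\hat\theta_0 = \hat\beta_0 + \hat\beta_1 c_1 + \dots + \hat\beta_k c_k$
What is less clear is $\mathrm{se}(\hat\theta_0)$
But our reparametrization approach helps again
$\theta_0$ is a linear function of parameters – see Week #4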
PREDICTION…
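Our reparameterization approach helps again. Define
$\theta_0 = \beta_0 + \beta_1 c_1 + \dots + \beta_k c_k$
Then $\beta_0 = \theta_0 - \beta_1 c_1 - \dots - \beta_k c_k$, and we could rewrite
the original model as
$y = \theta_0 + \beta_1 (x_1 - c_1) + \dots + \beta_k (x_k - c_k) + u$
Regression of $y$ on $\tilde{x}_1 = x_1 - c_1, \dots, \tilde{x}_k = x_k - c_k$ would
provide estimates of $\theta_0$ and $\beta_j$, $j = 1, \dots, k$
Most importantly would provide $\mathrm{se}(\hat\theta_0)$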
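The recentring trick for predictions can be sketched numerically. The data, the prediction point `c`, and the `ols` helper below are assumptions for illustration. The sketch computes the prediction and its standard error two ways: from the original regression using the full covariance matrix, and from the intercept of the regression on recentred regressors.

```python
import numpy as np

def ols(X, y):
    """OLS with an intercept; returns coefficients and their covariance matrix."""
    X = np.column_stack([np.ones(len(y)), X])
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return b, sigma2 * XtX_inv

# Made-up data with two regressors (assumed parameter values).
rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * x[:, 0] - 1.0 * x[:, 1] + rng.normal(size=n)
c = np.array([0.5, -1.0])   # hypothetical regressor values at which to predict

# Route 1: original regression, then combine estimates by hand.
b, V = ols(x, y)
a = np.concatenate([[1.0], c])          # weights (1, c1, c2)
theta0_hat = a @ b                      # b0 + b1*c1 + b2*c2
se_formula = np.sqrt(a @ V @ a)         # uses every variance and covariance

# Route 2: regress y on (x - c); the intercept is the prediction.
b_c, V_c = ols(x - c, y)
theta0_direct = b_c[0]
se_direct = np.sqrt(V_c[0, 0])

print(theta0_hat, theta0_direct)
print(se_formula, se_direct)
```

Both routes give the same numbers; the second simply lets standard regression output report the standard error of the prediction as the standard error of the intercept.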
FINAL THOUGHTS
Have dealt with asymptotic theory in a somewhat
superficial manner
Details are somewhat complicated
BUT important implications in practice that are clear
This is all the material that will be covered in the midterm