Applied Econometrics
ECON7810
Fall 2021
Lecture 9
Dr. Bei QIN
Regression with Panel Data
(SW Chapter 10)
Outline
1. Panel Data: What and Why
2. Panel Data with Two Time Periods
3. Fixed Effects Regression
4. Regression with Time Fixed Effects
5. Standard Errors for Fixed Effects Regression
6. Application to Drunk Driving and Traffic Safety
Panel Data: What and Why
(SW Section 10.1)
A panel dataset contains observations on multiple entities
(individuals, states, companies, …), where each entity is
observed at two or more points in time (not necessarily
consecutive).
Hypothetical examples:
· Data on 420 California school districts in 1999 and
again in 2000, for 840 observations total.
· Data on 48 U.S. states, each state is observed in 7 years,
for a total of 336 observations.
· Data on 1,000 individuals, in four different months, for
4,000 observations total.
Notation for panel data
A double subscript distinguishes entities (states) and time
periods (years)
$i$ = entity (state), $n$ = number of entities,
so $i = 1,\dots,n$
$t$ = time period (year), $T$ = number of time periods,
so $t = 1,\dots,T$
Data: Suppose we have a single regressor. The data are:
$(X_{it}, Y_{it})$, $i = 1,\dots,n$, $t = 1,\dots,T$
Panel data notation, ctd.
Panel data with k regressors:
$(X_{1it}, X_{2it}, \dots, X_{kit}, Y_{it})$, $i = 1,\dots,n$, $t = 1,\dots,T$
Some jargon…
· balanced panel: no missing observations, that is, all
variables are observed for all entities (states) and all time
periods (years)
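The definition of a balanced panel can be checked mechanically. The following is a minimal sketch, assuming the data are stored in "long" format as one record per entity-period; the function name `is_balanced` and the keys `state`, `year`, `y`, `x` are illustrative assumptions, not part of the lecture's data set.

```python
from itertools import product

def is_balanced(rows):
    """Return True if every (entity, period) pair appears exactly once
    and no variable is missing (None) in any record.

    `rows` is a list of dicts, one per entity-period observation.
    The key names "state" and "year" are assumptions for illustration.
    """
    entities = {r["state"] for r in rows}
    periods = {r["year"] for r in rows}
    # Entity-period pairs whose variables are all observed
    complete = {
        (r["state"], r["year"])
        for r in rows
        if all(v is not None for v in r.values())
    }
    return len(rows) == len(complete) and complete == set(product(entities, periods))

# Hypothetical toy panel: 2 states x 2 years
panel = [
    {"state": "CA", "year": 1982, "y": 1.0, "x": 0.5},
    {"state": "CA", "year": 1983, "y": 1.1, "x": 0.6},
    {"state": "TX", "year": 1982, "y": 2.0, "x": 0.2},
    {"state": "TX", "year": 1983, "y": 2.2, "x": 0.2},
]
balanced = is_balanced(panel)
unbalanced = is_balanced(panel[:-1])  # TX 1983 dropped, so the panel has a gap
```

A panel that fails this check is usually called unbalanced; the methods in this lecture are presented for the balanced case.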
Why are panel data useful?
With panel data we can control for factors that:
· Vary across entities but do not vary over time,
· Could cause omitted variable bias if they are omitted,
· Are unobserved or unmeasured – and therefore cannot be
included as regressors in a multiple regression
Here’s the key idea:
If an omitted variable does not change over time, then any
changes in Y over time cannot be caused by the omitted
variable.
Example of a panel data set:
Traffic deaths and alcohol taxes
Observational unit: a year in a U.S. state
· 48 U.S. states, so n = # of entities = 48
· 7 years (1982,…, 1988), so T = # of time periods = 7
· Balanced panel, so total # observations = 7 × 48 = 336
Variables:
· Traffic fatality rate (# traffic deaths in that state in that
year, per 10,000 state residents)
· Tax on a case of beer
· Other (legal driving age, drunk driving laws, etc.)
U.S. traffic death data for 1982:
Higher alcohol taxes, more traffic deaths?
U.S. traffic death data for 1988:
Higher alcohol taxes, more traffic deaths?
Why might there be more traffic deaths in states that have
higher alcohol taxes?
Other factors that determine traffic fatality rate:
· “Culture” around drinking and driving
o (i) arguably is a determinant of traffic deaths; and
o (ii) potentially is correlated with the beer tax.
· Density of cars on the road
o High traffic density means more traffic deaths
o (Western) states with lower traffic density have lower
alcohol taxes
Panel data lets us eliminate omitted variable bias when the
omitted variables are constant over time within a given state.
Panel Data with Two Time Periods
(SW Section 10.2)
Consider the panel data model,
$FatalityRate_{it} = \beta_0 + \beta_1 BeerTax_{it} + \beta_2 Z_i + u_{it}$
$Z_i$ is a factor that does not change over time (e.g., traffic density),
at least during the years for which we have data.
· Suppose $Z_i$ is not observed.
· The effect of $Z_i$ can be eliminated using data from $T = 2$
years.
The key idea:
Any change in the fatality rate from 1982 to 1988 cannot
be caused by $Z_i$, because $Z_i$ (by assumption) does not
change between 1982 and 1988.
The math:
$FatalityRate_{i,1988} = \beta_0 + \beta_1 BeerTax_{i,1988} + \beta_2 Z_i + u_{i,1988}$
$FatalityRate_{i,1982} = \beta_0 + \beta_1 BeerTax_{i,1982} + \beta_2 Z_i + u_{i,1982}$
Suppose $E(u_{it} \mid BeerTax_{it}, Z_i) = 0$.
Subtracting the 1982 equation from the 1988 equation eliminates the effect of $Z_i$…
$FatalityRate_{i,1988} - FatalityRate_{i,1982} = \beta_1 (BeerTax_{i,1988} - BeerTax_{i,1982}) + (u_{i,1988} - u_{i,1982})$
· The new error term, $(u_{i,1988} - u_{i,1982})$, is uncorrelated with
either $BeerTax_{i,1988}$ or $BeerTax_{i,1982}$.
· This differences regression doesn’t have an intercept – it
was eliminated by the subtraction step
· This difference equation can be estimated by OLS, even
though $Z_i$ isn’t observed.
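To see the differencing argument at work, here is a small simulation sketch. All numbers are made up for illustration (they are not the traffic data), and the helper names are hypothetical: an unobserved, time-invariant Z is correlated with the regressor, so a one-year cross-section regression is biased, while the differenced regression without an intercept comes close to the assumed slope.

```python
import random

def ols_slope_no_intercept(x, y):
    """OLS slope for a regression through the origin: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def ols_slope(x, y):
    """OLS slope for a regression that includes an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

random.seed(0)
n = 500
beta0, beta1, beta2 = 2.0, -1.0, 1.5  # made-up population coefficients

z = [random.gauss(0, 1) for _ in range(n)]          # unobserved, constant over time
x82 = [0.8 * zi + random.gauss(0, 1) for zi in z]   # regressor correlated with Z
x88 = [0.8 * zi + random.gauss(0, 1) for zi in z]
y82 = [beta0 + beta1 * x + beta2 * zi + random.gauss(0, 0.5) for x, zi in zip(x82, z)]
y88 = [beta0 + beta1 * x + beta2 * zi + random.gauss(0, 0.5) for x, zi in zip(x88, z)]

# A single-year cross-section regression omits Z and suffers omitted variable bias
cross_section = ols_slope(x88, y88)

# Differencing removes beta0 and beta2 * Z, so the regression needs no intercept
dx = [a - b for a, b in zip(x88, x82)]
dy = [a - b for a, b in zip(y88, y82)]
differenced = ols_slope_no_intercept(dx, dy)
```

In this design the cross-section slope is pushed toward zero by the positive correlation between X and Z, whereas `differenced` lands near the assumed value of `beta1`.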
Example: Traffic deaths and beer taxes
1982 data:
$\widehat{FatalityRate} = \underset{(0.15)}{2.01} + \underset{(0.13)}{0.15}\,BeerTax$ (n = 48)
1988 data:
$\widehat{FatalityRate} = \underset{(0.11)}{1.86} + \underset{(0.13)}{0.44}\,BeerTax$ (n = 48)
Difference regression (n = 48):
$\widehat{FR_{1988} - FR_{1982}} = \underset{(0.065)}{-0.072} - \underset{(0.36)}{1.04}\,(BeerTax_{1988} - BeerTax_{1982})$
Including an intercept in this differences regression allows
the mean change in FR to be nonzero – more on this later…
$\Delta FatalityRate$ v. $\Delta BeerTax$:
Fixed Effects Regression
(SW Section 10.3)
What if you have more than 2 time periods (T > 2)? We want
to use all the available information.
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}$, $i = 1,\dots,n$, $t = 1,\dots,T$
Suppose we have n = 3 states: California, Texas, and
Massachusetts.
Population regression for California (that is, $i = CA$):
$Y_{CA,t} = \beta_0 + \beta_1 X_{CA,t} + \beta_2 Z_{CA} + u_{CA,t}$
$= (\beta_0 + \beta_2 Z_{CA}) + \beta_1 X_{CA,t} + u_{CA,t}$
or
$Y_{CA,t} = \alpha_{CA} + \beta_1 X_{CA,t} + u_{CA,t}$
· $\alpha_{CA} = \beta_0 + \beta_2 Z_{CA}$ doesn’t change over time
· $\alpha_{CA}$ is the intercept for CA, and $\beta_1$ is the slope
· The intercept is unique to CA, but the slope is the same in
all the states: parallel regression lines.
Collecting the lines for all three states:
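$Y_{CA,t} = \alpha_{CA} + \beta_1 X_{CA,t} + u_{CA,t}$
$Y_{TX,t} = \alpha_{TX} + \beta_1 X_{TX,t} + u_{TX,t}$
$Y_{MA,t} = \alpha_{MA} + \beta_1 X_{MA,t} + u_{MA,t}$
or
$Y_{it} = \alpha_i + \beta_1 X_{it} + u_{it}$, $i$ = CA, TX, MA, $t = 1,\dots,T$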
Can we estimate this model?
The regression lines for each state in a picture
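[Figure: three parallel regression lines, $Y = \alpha_{CA} + \beta_1 X$, $Y = \alpha_{TX} + \beta_1 X$, and $Y = \alpha_{MA} + \beta_1 X$, one per state (CA, TX, MA), with intercepts $\alpha_{CA}$, $\alpha_{TX}$, $\alpha_{MA}$; horizontal axis X, vertical axis Y.]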
Recall that shifts in the intercept can be represented using
binary regressors…
In binary regressor form:
$Y_{it} = \beta_0 + \gamma_{CA} DCA_i + \gamma_{TX} DTX_i + \beta_1 X_{it} + u_{it}$
· $DCA_i$ = 1 if state is CA, = 0 otherwise
· $DTX_i$ = 1 if state is TX, = 0 otherwise
· leave out $DMA_i$ (why?)
· Write the $\alpha$’s in terms of $\beta_0$ and the $\gamma$’s.
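One way to work through the last two bullets, taking MA as the omitted (benchmark) state as on the slide: setting the indicators to their values for each state and matching intercepts gives

```latex
\begin{aligned}
\alpha_{MA} &= \beta_0 \\
\alpha_{CA} &= \beta_0 + \gamma_{CA} \\
\alpha_{TX} &= \beta_0 + \gamma_{TX}
\end{aligned}
```

The indicator for MA has to be left out because $DCA_i + DTX_i + DMA_i = 1$ for every observation, which duplicates the constant regressor and creates perfect multicollinearity. Which state serves as the benchmark is arbitrary and does not affect the estimate of $\beta_1$.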
Summary: Two ways to write the fixed effects model
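1. “n-1 binary regressor” form
$Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \dots + \gamma_n Dn_i + u_{it}$
where $D2_i = \begin{cases} 1 & \text{for } i = 2 \text{ (state \#2)} \\ 0 & \text{otherwise} \end{cases}$, etc.
2. “Fixed effects” form:
$Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it}$
· $\alpha_i$ is called a “state fixed effect” or “state effect”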
Fixed Effects Regression: Estimation
Three estimation methods:
1. “Changes” specification, without an intercept (only works
for T = 2)
2. “n-1 binary regressors” OLS regression
3. “Entity-demeaned” OLS regression (skip)
· These three methods produce identical estimates of the
regression coefficients, and identical standard errors.
· We already did the “changes” specification (1988 minus
1982) – but this only works for T = 2 years
· Methods #2 and #3 work for general T
· Inference (hypothesis tests, confidence intervals) is as usual
(using cluster-robust standard errors)
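The claim that methods #2 and #3 give the same slope can be checked numerically. Below is a minimal sketch on simulated data with n = 3 states; the state effects, slope, and helper functions (`solve`, `ols`, `demean`) are all made up for illustration, and only the coefficient is compared (standard errors are not computed here). Entity demeaning means subtracting each state's own time average from X and Y before running OLS.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def demean(values, groups):
    """Subtract each group's mean from its members."""
    means = {g: sum(v for v, h in zip(values, groups) if h == g) / groups.count(g)
             for g in set(groups)}
    return [v - means[g] for v, g in zip(values, groups)]

random.seed(1)
states, T = ["CA", "TX", "MA"], 7
alpha = {"CA": 1.0, "TX": 3.0, "MA": -2.0}  # made-up state effects
beta1 = -0.5                                # made-up slope

data = []  # (state, x, y) triples
for s in states:
    for _ in range(T):
        x = alpha[s] + random.gauss(0, 1)   # X is correlated with the state effect
        data.append((s, x, alpha[s] + beta1 * x + random.gauss(0, 0.3)))

# Method 2: intercept, X, and n - 1 = 2 state indicators (MA omitted)
X = [[1.0, x, float(s == "CA"), float(s == "TX")] for s, x, _ in data]
b_dummy = ols(X, [y for _, _, y in data])[1]

# Method 3: demean X and Y by state, then regress without an intercept
groups = [s for s, _, _ in data]
xd = demean([x for _, x, _ in data], groups)
yd = demean([y for _, _, y in data], groups)
b_demeaned = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)
```

The two slopes agree up to floating-point error. The demeaned version avoids building n - 1 indicator columns, which matters when n is large.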
Example: Traffic deaths and beer taxes in R
Note: the indicator for the first state (AL) is omitted to avoid perfect multicollinearity
Example, ctd. For n = 48, T = 7:
$\widehat{FatalityRate} = \underset{(0.31)}{-0.66}\,BeerTax$ + State fixed effects
· Compare slope, standard error to the estimate for the 1988
v. 1982 “changes” specification (T = 2, n = 48) (note that
this includes an intercept – return to this below):
$\widehat{FR_{1988} - FR_{1982}} = \underset{(0.065)}{-0.072} - \underset{(0.36)}{1.04}\,(BeerTax_{1988} - BeerTax_{1982})$
Regression with Time Fixed Effects
(SW Section 10.4)
An omitted variable might vary over time but not across
states:
· Safer cars (air bags, etc.); changes in national laws
· These produce intercepts that change over time
· Let $S_t$ denote the combined effect of variables that
change over time but not across states (“safer cars”).
· The resulting population regression model is:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + \beta_3 S_t + u_{it}$
Time fixed effects only
$Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_3 S_t + u_{it}$
This model can be recast as having an intercept that varies
from one year to the next:
$Y_{i,1982} = \beta_0 + \beta_1 X_{i,1982} + \beta_3 S_{1982} + u_{i,1982}$
$= (\beta_0 + \beta_3 S_{1982}) + \beta_1 X_{i,1982} + u_{i,1982}$
$= \lambda_{1982} + \beta_1 X_{i,1982} + u_{i,1982}$,
where $\lambda_{1982} = \beta_0 + \beta_3 S_{1982}$, and so on.
Two formulations of regression with time fixed effects
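1. “T-1 binary regressor” formulation:
$Y_{it} = \beta_0 + \beta_1 X_{it} + \delta_2 B2_t + \dots + \delta_T BT_t + u_{it}$
where $B2_t = \begin{cases} 1 & \text{when } t = 2 \text{ (year \#2)} \\ 0 & \text{otherwise} \end{cases}$, etc.
2. “Time effects” formulation:
$Y_{it} = \beta_1 X_{it} + \lambda_t + u_{it}$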
Time fixed effects: estimation methods
1. “T-1 binary regressor” OLS regression
$Y_{it} = \beta_0 + \beta_1 X_{it} + \delta_2 B2_t + \dots + \delta_T BT_t + u_{it}$
· Create binary variables $B2, \dots, BT$
· $B2$ = 1 if t = year #2, = 0 otherwise
· Regress Y on X, B2,…,BT using OLS
· Where’s B1?
2. “Year-demeaned” OLS regression (skip)
· Express $Y_{it}$, $X_{it}$ as deviations from year (not state) averages
· Estimate by OLS using “year-demeaned” data
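The year-demeaning recipe can be sketched in a few lines. The data below are simulated with made-up year effects that trend upward together with X (they are not the traffic data), so pooled OLS that ignores the year effects is biased while the year-demeaned regression is not.

```python
import random

random.seed(2)
n, years = 48, list(range(1982, 1989))
beta1 = -0.6                                  # made-up slope
lam = {t: 0.3 * (t - 1982) for t in years}    # made-up year effects

rows = []  # (year, x, y) triples, n entities per year
for t in years:
    for _ in range(n):
        x = lam[t] + random.gauss(0, 1)       # X trends with the year effect
        rows.append((t, x, lam[t] + beta1 * x + random.gauss(0, 0.5)))

def year_mean(idx):
    """Average of column `idx` within each year."""
    return {t: sum(r[idx] for r in rows if r[0] == t) / n for t in years}

# Deviate X and Y from their year averages, then run OLS without an intercept
xbar, ybar = year_mean(1), year_mean(2)
xd = [x - xbar[t] for t, x, _ in rows]
yd = [y - ybar[t] for t, _, y in rows]
b_time_fe = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# For comparison: pooled OLS (with an intercept) that ignores the year effects
mx = sum(r[1] for r in rows) / len(rows)
my = sum(r[2] for r in rows) / len(rows)
b_pooled = (sum((r[1] - mx) * (r[2] - my) for r in rows)
            / sum((r[1] - mx) ** 2 for r in rows))
```

Regressing Y on X and the T - 1 year indicators would return the same slope as `b_time_fe`; the demeaning route simply avoids constructing the indicators.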
Estimation with both entity and time fixed effects
$Y_{it} = \beta_1 X_{it} + \alpha_i + \lambda_t + u_{it}$
· When T = 2, computing the first difference and including
an intercept is equivalent to (gives exactly the same
regression as) including entity and time fixed effects.
· When T > 2, there are various equivalent ways to
incorporate both entity and time fixed effects:
o T – 1 time indicators & n – 1 entity indicators
o entity demeaning & T – 1 time indicators (R code, plm, skip)
o time demeaning & n – 1 entity indicators (R code, plm, skip)
o entity & time demeaning (R code, plm, skip)
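The T = 2 equivalence stated above can be verified numerically. The sketch below uses simulated data with made-up entity effects and a made-up year effect of 0.5; it compares (a) the first-difference regression with an intercept to (b) OLS on data demeaned by both entity and time, where for a balanced panel each value is transformed as value minus entity mean minus year mean plus grand mean.

```python
import random

random.seed(3)
n = 48
x1 = [random.gauss(0, 1) for _ in range(n)]             # period 1 regressor
x2 = [a + random.gauss(0.2, 1) for a in x1]             # period 2 regressor
a_i = [random.gauss(0, 1) for _ in range(n)]            # made-up entity effects
y1 = [a - 0.7 * x + random.gauss(0, 0.3) for a, x in zip(a_i, x1)]
y2 = [a + 0.5 - 0.7 * x + random.gauss(0, 0.3) for a, x in zip(a_i, x2)]  # 0.5 = year effect

def mean(v):
    return sum(v) / len(v)

# (a) First differences, regression with an intercept
dx = [b - a for a, b in zip(x1, x2)]
dy = [b - a for a, b in zip(y1, y2)]
mdx, mdy = mean(dx), mean(dy)
b_diff = (sum((a - mdx) * (b - mdy) for a, b in zip(dx, dy))
          / sum((a - mdx) ** 2 for a in dx))

# (b) Entity and time demeaning: v_it - entity mean - year mean + grand mean
def two_way_demean(v1, v2):
    m1, m2 = mean(v1), mean(v2)
    grand = (m1 + m2) / 2
    out = []
    for a, b in zip(v1, v2):
        mi = (a + b) / 2
        out += [a - mi - m1 + grand, b - mi - m2 + grand]
    return out

xd, yd = two_way_demean(x1, x2), two_way_demean(y1, y2)
b_twfe = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)
```

The two slopes coincide up to floating-point error. The intercept in the differenced regression plays the role of the time effect, which is why it was included in the 1988 v. 1982 example.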
Benchmark group: state AL in 1982
Are the time effects jointly statistically significant?
Application: Drunk Driving Laws and Traffic Deaths
(SW Section 10.6)
Some facts
· Approx. 40,000 traffic fatalities annually in the U.S.
· 1/3 of traffic fatalities involve a drunk driver
· 25% of drivers on the road between 1am and 3am have been
drinking (estimate)
· A drunk driver is 13 times as likely to cause a fatal crash as
a non-drinking driver (estimate)
Drunk driving laws and traffic deaths, ctd.
Public policy issues
· Drunk driving causes massive externalities (sober drivers
are killed, society bears medical costs, etc.) – there is
ample justification for governmental intervention
· Are there any effective ways to reduce drunk driving? If so,
what?
· What are the effects of specific laws:
o mandatory punishment
o minimum legal drinking age
o economic interventions (alcohol taxes) ←
The drunk driving panel data set
n = 48 U.S. states, T = 7 years (1982,…,1988) (balanced)
Variables
· Traffic fatality rate (deaths per 10,000 residents)
· Tax on a case of beer (Beertax)
· Minimum legal drinking age
· Minimum sentencing laws for first DWI violation:
o Mandatory Jail
o Mandatory Community Service
o otherwise, the sentence will just be a monetary fine
· Vehicle miles per driver (US DOT)
· State economic data (real per capita income, etc.)
Why might panel data help?
· Potential OV bias from variables that vary across states but
are constant over time:
o culture of drinking and driving
o quality of roads
o vintage of autos on the road
⇒ use state fixed effects
· Potential OV bias from variables that vary over time but are
constant across states:
o improvements in auto safety over time
o changing national attitudes towards drunk driving
⇒ use time fixed effects
Empirical Analysis: Main Results
· Sign of the beer tax coefficient changes when state fixed
effects are included
· Time effects are statistically significant but including them
doesn’t have a big impact on the estimated coefficients
· Estimated effect of beer tax drops when other laws are
included.
· The only policy variable that seems to have an impact is the
tax on beer – not minimum drinking age, not mandatory
sentencing, etc. – however the beer tax is not significant
even at the 10% level using clustered SEs in the
specifications which control for state economic conditions
(unemployment rate, personal income)
Empirical results, ctd.
· In particular, the minimum legal drinking age has a small
coefficient which is imprecisely estimated – reducing the
MLDA doesn’t seem to have much effect on overall driving
fatalities.
Summary: Regression with Panel Data
(SW Section 10.7)
Advantages and limitations of fixed effects regression
Advantages
· You can control for unobserved variables that:
o vary across states but not over time, and/or
o vary over time but not across states
· More observations give you more information
· Estimation involves relatively straightforward extensions
of multiple regression
· Fixed effects regression can be done three ways:
1. “Changes” method when T = 2
2. “n-1 binary regressors” method when n is small
3. “Entity-demeaned” regression
· Similar methods apply to regression with time fixed
effects and to both time and state fixed effects
· Statistical inference: like multiple regression.