Date: 2022-05-03
EFIMM0120
Applied Quantitative
Research in Accounting
and Finance
Lecture 5: Difference-in-Difference Design
and Quasi-Natural Experiments
Part 1: Endogeneity concerns in OLS regression
Dr. Zilu Shan
zilu.shan@bristol.ac.uk
Endogeneity Problem in Causal Inference
§ Causal inference is the leveraging of theory and deep knowledge
of institutional details to estimate the impact of events and choices
on a given outcome of interest.
§ In traditional OLS regression, there are three main sources of bias:
–Omitted variables
–Simultaneity
–Measurement error
Endogeneity Source (1)
Omitted variables
§ An explanatory variable modeled as exogenous may in fact be
endogenous because of an omitted variable
Assume that the structural equation is given by
$y = \beta_0 + \beta_1 x_1 + \beta_2 z_1 + \varepsilon$
§ The OLS method provides the Best Linear Unbiased Estimator
(BLUE) if all explanatory variables in this regression are exogenous
§ If $z_1$ is omitted, the bias in the OLS estimate of $\beta_1$ is equal to
$E[\hat{\beta}_1] - \beta_1 = \beta_2 \frac{Cov(x_1, z_1)}{Var(x_1)}$
We are only concerned with omitted variables that affect both the
explanatory and the explained variables
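The bias formula can be checked numerically. The sketch below is only an illustration on simulated data: the parameter values, the sample size and the helper `ols` are assumptions, not part of the lecture. It regresses $y$ on $x$ with and without the omitted variable $z$ and compares the gap between the two slope estimates with the sample analogue of $\beta_2 \, Cov(x, z) / Var(x)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
beta0, beta1, beta2 = 1.0, 2.0, 3.0   # illustrative "true" parameters (assumed)

z = rng.normal(size=n)                 # the variable that will be omitted
x = 0.5 * z + rng.normal(size=n)       # regressor correlated with z
y = beta0 + beta1 * x + beta2 * z + rng.normal(size=n)

def ols(y, *regressors):
    """OLS with an intercept; returns the coefficient vector."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_long = ols(y, x, z)    # correctly specified regression
b_short = ols(y, x)      # regression that omits z

# Sample analogue of the bias formula: beta2 * Cov(x, z) / Var(x)
implied_bias = b_long[2] * np.cov(x, z)[0, 1] / np.var(x, ddof=1)

print("short minus long slope:", b_short[1] - b_long[1])
print("implied by the formula:", implied_bias)
```

In-sample the two printed numbers coincide up to floating-point error, because the short-regression slope equals the long-regression slope plus the coefficient on $z$ times the slope from regressing $z$ on $x$.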
Endogeneity Source (1)
Omitted variables –example
§ Omitted-variable issues are particularly severe in corporate finance,
because the objects of study (firms or CEOs, for example) are
heterogeneous along many different dimensions
§ Some examples (topic studied: typically omitted determinant):
–Executive compensation: executives’ abilities
–Corporate financial and investment policies: financing frictions (i.e.
information asymmetry and incentive conflicts)
–Corporate decisions: both public and non-public information
Endogeneity Source (2)
Simultaneity
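Assume a two-equation structural model:
$y_1 = \beta_0 + \beta_1 x_1 + \beta_2 z_1 + \varepsilon_1$
$x_1 = \gamma_0 + \gamma_1 y_1 + \gamma_2 z_2 + \varepsilon_2$
Written in reduced form:
$x_1 = \frac{\gamma_0 + \gamma_1 \beta_0}{1 - \gamma_1 \beta_1} + \frac{\gamma_1 \beta_2}{1 - \gamma_1 \beta_1} z_1 + \frac{\gamma_2}{1 - \gamma_1 \beta_1} z_2 + \frac{\gamma_1}{1 - \gamma_1 \beta_1} \varepsilon_1 + \frac{1}{1 - \gamma_1 \beta_1} \varepsilon_2$
If $\gamma_1 \neq 0$, then there exists endogeneity due to simultaneity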
§ Simultaneity arises when one or more explanatory variables are jointly
determined with the explained variable in an equilibrium.
§ If simultaneity exists, the structural error term is correlated with the explanatory
variable
§ The causal relationship between an explained and an explanatory variable runs
both ways
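A small simulation makes the problem concrete. This is a sketch under stated assumptions: a two-equation system in which each equation has its own exogenous variable (`z1`, `z2`), unit-variance normal shocks, and arbitrary parameter values. Data are generated from the reduced form, and the first structural equation is then estimated by OLS.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta0, beta1, beta2 = 0.0, 0.5, 1.0      # first structural equation (assumed values)
gamma0, gamma1, gamma2 = 0.0, 0.5, 1.0   # second structural equation (assumed values)

z1, z2 = rng.normal(size=n), rng.normal(size=n)
e1, e2 = rng.normal(size=n), rng.normal(size=n)

# Solve the two structural equations jointly: reduced form for x, then y.
denom = 1.0 - gamma1 * beta1
x = (gamma0 + gamma1 * beta0 + gamma1 * beta2 * z1
     + gamma2 * z2 + gamma1 * e1 + e2) / denom
y = beta0 + beta1 * x + beta2 * z1 + e1

# OLS on the first structural equation, treating x as if it were exogenous.
X = np.column_stack([np.ones(n), x, z1])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# x contains gamma1 * e1 / denom, so it is correlated with the structural error e1.
corr_x_e1 = np.corrcoef(x, e1)[0, 1]

print("true beta1:", beta1, " OLS estimate:", b_ols[1])
print("corr(x, e1):", corr_x_e1)
```

With these assumed values the OLS slope settles well above the true 0.5, and the gap does not shrink as the sample grows, because the regressor and the error move together by construction.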
Endogeneity Source (2)
Simultaneity - example
§ In a regression of a value multiple (such as market-to-book) on an
index of anti-takeover provisions, the usual result is a negative
coefficient on the index.
§ However, it doesn’t mean that the presence of anti-takeover
provisions leads to a loss of firm value.
§ Alternative explanation: Managers of low-value firms adopt
anti-takeover provisions in order to entrench themselves.
Endogeneity Source (3)
Measurement errors
§ Measurement error comes from any discrepancy between the true
variables and the proxy for unobservable or difficult to quantify
variables.
§ It could happen for either dependent variables or independent
variables.
§ Some examples:
–Market value of debt: Most debt is privately held by banks and other
financial institutions, so there is no observable market value
–Executive compensation: Stock options often vest over time and are valued
using an approximation, such as Black-Scholes
–Corporate governance: it’s a nebulous concept with a variety of different
facets. Current proxies, such as an anti-takeover provision index or the presence of large
blockholders, are unlikely to be sufficient
Part 2: An introduction to DID method
Controlled vs. (Quasi-)Natural
Experiments (1)
Controlled experiments are often used in other areas of science
§ E.g. to check whether a certain drug reduces cholesterol, researchers could
randomly assign patients to a treatment group (the drug) and a
control group (placebo pills)
§ The difference in the average change in cholesterol between the
two groups of patients is the average treatment effect (ATE), or in
regression format, the difference-in-difference estimator.
§ In social science, researchers can very rarely run controlled
experiments, because of the difficulty of imposing treatment.
Controlled vs. (Quasi-)Natural
Experiments (2)
A natural experiment is a sharp change in one or more variables of
interest that occurs for exogenous reasons.
§ It can be brought about either
– by natural causes (e.g. natural disasters),
– or by some kind of human action, such as changes in regulation, economic
policy and political changes (generally referred to as “quasi-natural”
experiments)
§ Key assumption to allow causal inference
– the treatment assignment is random or at least “as good as random”
– In other words, any other variable that is important in determining the
outcome variable is uncorrelated with the treatment assignment
§ In the literature, the change is often called “the shock”
Causal inference issue
§ We ideally would like the average treatment effect (ATE):
$E[y^1] - E[y^0]$
where $y^1$ and $y^0$ are the outcomes with and without treatment
§ But we only observe the naïve estimator, where $D = 1$ marks treated units:
$E[y^1 | D = 1] - E[y^0 | D = 0]$
§ If we assume $E[y^0 | D = 1] = E[y^0 | D = 0]$, then we can get the ATT
(average treatment effect on the treated) from observed data:
$E[y^1 | D = 1] - E[y^0 | D = 0]$
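The gap between the naïve estimator and the ATT can be illustrated by simulation. In the sketch below both potential outcomes are generated for every unit, which is possible only in simulated data. The confounder `ability`, the effect size of 2 and the selection rule are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

ability = rng.normal(size=n)                # hypothetical confounder
y0 = 1.0 + ability + rng.normal(size=n)     # potential outcome without treatment
y1 = y0 + 2.0                               # potential outcome with treatment (effect = 2)

# Self-selection: units with high ability are more likely to be treated.
treated = (ability + rng.normal(size=n)) > 0

naive = y1[treated].mean() - y0[~treated].mean()
att = y1[treated].mean() - y0[treated].mean()
selection_bias = y0[treated].mean() - y0[~treated].mean()

# Random assignment: the selection term vanishes in expectation.
random_d = rng.random(n) < 0.5
naive_random = y1[random_d].mean() - y0[~random_d].mean()

print("naive:", naive, "= ATT", att, "+ selection bias", selection_bias)
print("naive under random assignment:", naive_random)
```

The naïve comparison equals the ATT plus the difference in untreated outcomes between the two groups. That second term is zero in expectation only when assignment is as good as random.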
Single difference in cross-section
Comparing the post-treatment values of the outcome variable between
treated and untreated firms:
$y_i = \beta_0 + \beta_2 D_i + \varepsilon_i$
§ $y_i$ is the outcome variable
§ $D_i$ is a dummy variable indicating whether firm $i$ is in the treatment group
§ If treatment is random, then $D_i$ is uncorrelated with the error term $\varepsilon_i$
§ $\hat{\beta}_2$ is therefore an unbiased estimate of the average treatment effect (ATE)
§ The approach is useful when the researcher does not have data on
the values of outcome $y$ prior to the treatment
§ The potential problem:
the average $y$ of treated and untreated firms may already have differed ex ante (i.e.
before the treatment)
Time-difference regressions
Comparing post-treatment values to pre-treatment values for all firms:
$y_{i,t} = \beta_0 + \beta_1 P_t + \varepsilon_{i,t}$
§ $y_{i,t}$ is the outcome variable
§ $P_t$ is a dummy variable which takes value 1 for the observations in
the post-shock period, and 0 for the observations in the pre-shock
period
§ Key assumption: no other event that affects the outcome $y_{i,t}$ occurred between the
pre-shock and post-shock periods.
§ In other words, there is no omitted variable, correlated with $P_t$, that
affects $y_{i,t}$
Difference-in-Difference
§ The difference-in-difference (DiD) model combines the cross-sectional and
the time-series difference models into a single model
§ Intuition: compute the difference between
– the change in outcome $y$ pre- versus post-treatment for the treated group
and
– the change in outcome $y$ pre- versus post-treatment for the control group
§ We would need a panel of treated and untreated firms, with
observations before the shock and after it
§ A typical DiD regression would be
$y_{i,t} = \beta_0 + \beta_1 P_t + \beta_2 D_i + \beta_3 P_t \times D_i + \varepsilon_{i,t}$
DiD coefficient
$y_{i,t} = \beta_0 + \beta_1 P_t + \beta_2 D_i + \beta_3 P_t \times D_i + \varepsilon_{i,t}$
§ $\hat{\beta}_1$ captures the average change in $y_{i,t}$ from the pre- to post-shock
periods for the untreated group ($D_i = 0$)
§ $\hat{\beta}_2$ captures the pre-shock difference in $y_{i,t}$ between treated and
untreated firms ($P_t = 0$)
§ Our main coefficient $\hat{\beta}_3$ (DiD coefficient) captures the effect of the
shock, that is, the average differential change in $y_{i,t}$ from the pre- to
post-treatment period for the treatment group relative to the control
group
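The reading of the three coefficients can be verified on a simulated two-period panel. The sketch below is illustrative only: the coefficient values, the number of firms and the helper `cell_mean` are assumptions. It estimates the DiD regression by OLS and rebuilds each coefficient from the four group-by-period means.

```python
import numpy as np

rng = np.random.default_rng(3)
n_firms = 2_000
b0, b1, b2, b3 = 1.0, 0.5, 0.8, -0.3   # assumed true coefficients

treat = np.repeat(rng.random(n_firms) < 0.5, 2).astype(float)  # treatment dummy, fixed per firm
post = np.tile([0.0, 1.0], n_firms)                            # post dummy: one pre, one post period
y = (b0 + b1 * post + b2 * treat + b3 * post * treat
     + rng.normal(scale=0.5, size=2 * n_firms))

X = np.column_stack([np.ones_like(y), post, treat, post * treat])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

def cell_mean(d, p):
    """Average outcome for treatment status d in period p."""
    return y[(treat == d) & (post == p)].mean()

control_change = cell_mean(0, 1) - cell_mean(0, 0)            # matches beta_hat[1]
pre_gap = cell_mean(1, 0) - cell_mean(0, 0)                   # matches beta_hat[2]
did_by_hand = (cell_mean(1, 1) - cell_mean(1, 0)) - control_change  # matches beta_hat[3]

print(beta_hat[1:], control_change, pre_gap, did_by_hand)
```

Because the regression is saturated in the two dummies, the OLS coefficients reproduce the differences in cell means exactly. That is why the DiD coefficient can be read as a "difference of differences".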
Further comments on Diff-in-Diff
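§ We can add individual-level covariates, including fixed effects, but they
mainly help to obtain more precise estimates.
§ Diff-in-diff logic with one treated group and one untreated (control) group, taking
expectations to eliminate the error term $\varepsilon$, which we assume is orthogonal to
treatment:
$E[y^1_{t+1} | D_{t+1} = 1] - E[y^0_t | D_t = 1] + E[y^0_t | D_t = 0] - E[y^0_{t+1} | D_{t+1} = 0]$
$= (\alpha + \delta + \lambda_{t+1}) - (\alpha + \lambda_t) + (\alpha + \lambda_t) - (\alpha + \lambda_{t+1})$
$= \delta$
where $\alpha$ is a constant, $\lambda_t$ is a time effect and $\delta$ is the treatment effect
§ If we assume $E[y^0 | D = 0] = E[y^0 | D = 1]$, then the ATT
(average treatment effect on the treated), $E[y^1 | D = 1] - E[y^0 | D = 1]$, can be obtained from observed data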
Graphic illustration – DiD without trends
§ Pre- and post-shock average $y$ of
treated and untreated observations
are constant (no trend)
§ $\hat{\beta}_1$ captures the difference between
average pre- and average post-shock
outcome for untreated observations
§ $\hat{\beta}_2$ captures the pre-shock difference
in outcome between treated and
untreated firms
§ $\hat{\beta}_3$ (DiD coefficient) captures the
difference between the observed
average post-shock $y$ and the
average unobserved counterfactual $y$
after the shock (i.e. the hypothetical
value of $y$ for treated observations
absent the shock)
Graphic illustration – DiD with trends
§ Pre- and post-shock average $y$ of
treated and untreated observations
increase along a constant trend
§ $\hat{\beta}_1$ captures the difference between
average pre- and average post-shock
outcome for untreated observations
§ $\hat{\beta}_2$ captures the pre-shock difference
in outcome between treated and
untreated firms
§ $\hat{\beta}_3$ (DiD coefficient) captures the
difference between the observed
average post-shock $y$ and the
average unobserved counterfactual $y$
after the shock (i.e. the hypothetical
value of $y$ for treated observations
absent the shock)
Graphic illustration – combined figure
Parallel trends assumption
§ In a truly random experiment (controlled experiment), the treatment
and control groups should be indistinguishable
§ Parallel trends assumption:
– In order to meet the statistical assumptions for causal inference with DiD,
even if treated and control firms have different averages of $y$ prior to the
shock (and thus are not indistinguishable), their trends should be parallel
– In the regression format, there is no correlation between $P_t \times D_i$ and the error
term
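One informal diagnostic is to track the treated-minus-control gap in average outcomes year by year and check that it does not drift before the shock. The sketch below uses simulated data. The sample window, the shock year and the effect size are hypothetical, and the slope of the pre-shock gap is only a rough summary, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(4)
years = np.arange(2005, 2011)     # hypothetical sample window
shock_year = 2008                 # hypothetical shock date
n_firms = 3_000

treated = rng.random(n_firms) < 0.5
year_grid = np.tile(years, n_firms)
treat_grid = np.repeat(treated, len(years))

# Common trend for both groups, a level gap of 1.0, and an effect of -0.5 after the shock.
y = (0.2 * (year_grid - years[0]) + 1.0 * treat_grid
     - 0.5 * (treat_grid & (year_grid >= shock_year))
     + rng.normal(scale=0.5, size=year_grid.size))

# Treated-minus-control gap in average outcome, year by year.
gap = np.array([y[(year_grid == t) & treat_grid].mean()
                - y[(year_grid == t) & ~treat_grid].mean() for t in years])

pre = years < shock_year
# Slope of the gap over the pre-shock years; close to zero under parallel trends.
pre_slope = np.polyfit(years[pre] - years[0], gap[pre], 1)[0]
did_estimate = gap[~pre].mean() - gap[pre].mean()

print("gap by year:", np.round(gap, 3))
print("pre-shock slope of the gap:", pre_slope, " DiD estimate:", did_estimate)
```

A level difference between the groups (here 1.0) is harmless. What matters is that the gap stays flat until the shock, so that its later change can be attributed to the treatment.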
Defining the pre- and post- period
§ Using too few periods (one or two) prior to the shock does not allow
checking the parallel trends assumption
§ Using too long a pre- or post-shock window exposes the analysis to
other events that also affect the outcome,
causing confounding
§ In some cases, firms may be expected to respond to the shock
gradually over time
§ If the data are not annual and the outcome presents a seasonal pattern,
one should use the same calendar months or quarters in the pre- and
post-period to avoid potential seasonality problems.
§ If the date of the shock cannot be precisely defined (e.g. the passage date and
effective date of a law usually differ), researchers should consider
excluding one or more periods to avoid wrongly assigning a period to
the pre- or post-shock window
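The last point can be written as a simple labelling rule. The function below is a hypothetical helper, not taken from the lecture. It assumes annual data and drops every year from the passage of a law through its effective date, which is only one of several defensible conventions.

```python
from typing import Optional

def assign_period(year: int, passage_year: int, effective_year: int) -> Optional[int]:
    """Return 0 for pre-shock years, 1 for post-shock years,
    and None for the ambiguous transition years that should be dropped."""
    if year < passage_year:
        return 0
    if year > effective_year:
        return 1
    return None  # between passage and effective date: exclude from the sample

# Example with a law passed in 2007 that takes effect in 2008 (made-up dates).
labels = {year: assign_period(year, 2007, 2008) for year in range(2005, 2011)}
print(labels)
```

Observations labelled `None` would be excluded before estimating the DiD regression, so that no partially treated period is counted as either pre- or post-shock.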
Other DiD method issues
§ Control variables
– Including other control variables that affect $y$ may be useful to increase the fit
of the regression and the precision of the estimates by reducing their standard errors
– Avoid including control variables that are themselves affected by the treatment
§ Anticipatory effects
– Trends could be affected by anticipation of the treatment
§ SUTVA (Stable Unit Treatment Value Assumption) not holding:
– treated units could influence the outcomes of the control group
§ Change in composition
– The control group may change over time, so be explicit about its composition
§ Standard errors:
– Should be clustered; otherwise, the analysis may underestimate them.
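To see why unclustered standard errors can be too small, the sketch below compares the classical OLS variance with a cluster-robust sandwich variance computed by hand. It is an illustration under assumptions: simulated data in which both the regressor and the error share a cluster-level component, and the plain CR0 form with no small-sample correction. Statistical packages typically apply additional degrees-of-freedom adjustments.

```python
import numpy as np

rng = np.random.default_rng(5)
n_clusters, n_per = 50, 20
cluster = np.repeat(np.arange(n_clusters), n_per)

# Regressor and error both share a cluster-level component (the case where clustering matters).
x = np.repeat(rng.normal(size=n_clusters), n_per)
u = np.repeat(rng.normal(size=n_clusters), n_per) + rng.normal(size=cluster.size)
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones_like(x), x])
n, k = X.shape
bread = np.linalg.inv(X.T @ X)
beta_hat = bread @ X.T @ y
resid = y - X @ beta_hat

# Classical variance: assumes independent, homoskedastic errors.
V_classical = bread * (resid @ resid) / (n - k)

# Cluster-robust sandwich variance (CR0): sum the scores within each cluster first.
meat = np.zeros((k, k))
for g in range(n_clusters):
    score_g = X[cluster == g].T @ resid[cluster == g]
    meat += np.outer(score_g, score_g)
V_cluster = bread @ meat @ bread

se_classical = np.sqrt(np.diag(V_classical))
se_cluster = np.sqrt(np.diag(V_cluster))
print("classical SE of slope:", se_classical[1], " clustered SE of slope:", se_cluster[1])
```

In a DiD setting the treatment usually varies at the group level and outcomes are correlated within a firm over time, which is the same structure as in this simulation.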
Part 3: DID – an empirical application
Bank risk and bailout probability
Research question:
Do governmental guarantees to banks affect their risk?
Theoretical framework (1)
The expected effect of governmental protection on bank risk is
ambiguous
§ Charter value theory: guarantees cause protected banks to take
less risk than unprotected banks, because this implicit protection
allows them to fund at abnormally low costs, which is a source of
value (charter value) that banks do not want to put at risk
§ Moral hazard hypothesis: the depositors of protected banks have
less incentive to monitor the risk of the bank, thereby leading to an
increase in risk-taking by the banks
The empirical evidence is also mixed.
Theoretical framework (2)
§ Systemically important banks benefit from a perception of
protection by depositors and investors in general
§ This perception became more pronounced during the financial crisis
§ Therefore, systemically important banks enjoy implicit guarantees
that smaller banks do not, even in countries whose financial
systems were not directly affected by the crisis.
The shock: the 2008 financial crisis
§ Timeline of the 2008 financial crisis
– The liquidity crisis started in the US in the second half of 2007
– In September 2008, the failure of Lehman Brothers signalled that the crisis had
spread internationally
– In the US, the government backed systemically important financial institutions, restoring
their deposits
– In October 2008, the G-7 action plan was launched, creating the perception of an increased
bailout probability for large banks that went beyond the borders of the G-7 countries
§ Use a sample of banks from countries whose financial systems were
not directly affected by the crisis
– The crisis changed the perception of governmental protection of large banks
(treatment group)
– The perceived bailout probability didn’t change much for less protected banks
(control group)
Identification strategy: DiD model
To test whether governmental protection affects bank risk, we estimate the
following DiD model:
$\ln(Z\_score_{i,t}) = \beta_0 + \beta_1 Crisis_t + \beta_2 Protected_{i,t} + \beta_3 Crisis_t \times Protected_{i,t} + \varepsilon_{i,t}$
§ $Z\_score_{i,t}$ is a measure of bank risk, defined by the bank’s distance to
default in terms of the standard deviation of ROA (a lower Z-score means higher risk)
§ Time period for main regression: 2005 - 2010
– Pre-crisis period: 2005 to 2007 (Crisis = 0)
– Post-crisis period: 2008 to 2010 (Crisis = 1)
§ Treatment: the probability of expected external support
§ Main coefficient of interest: $\hat{\beta}_3$
– If positive, the result favors the charter value hypothesis
– If negative, the result favors the moral hazard hypothesis
Endogenous shock concern
§ When choosing the treatment (shock), it’s important to consider
whether the shock is exogenous
§ One concern here is that the protected and unprotected banks
might be differently exposed to toxic assets
§ The different exposure could be treated as an omitted variable
– that explains bank risk (outcome variable)
– and is correlated with our main regressor (protected bank)
Solution:
§ We use a sample of banks located in OECD countries whose banks
were not directly affected by the crisis

Descriptive statistics
§ Protected and unprotected banks have roughly the same level of risk
(Ln(Z-score)) prior to the crisis on average
§ Both protected and unprotected banks increase their risk during the crisis period, but
the average risk of protected banks increases more
§ Prior to the crisis, unprotected banks had a slightly larger ratio of liquid assets to
short-term liabilities than protected banks on average, but the difference is
statistically insignificant (Liquidity)
§ Protected banks are significantly larger (Asset)
Parallel trends
§ The average Z_score of
protected (treatment) and
unprotected (control) banks
§ In pre-crisis years, the Z_score series
are roughly constant (parallel
trends assumption)
§ In the first year of the crisis, we
observe a dramatic decrease
in Z_score (an increase in
bank risk), which is more
pronounced for protected
banks than for unprotected
banks
DiD results – baseline regression
§ The estimated $\hat{\beta}_1$ indicates that the Z-Score decreased by 36.8% on average for the
control group (unprotected banks) from the pre-crisis period to the crisis period
§ The estimated $\hat{\beta}_2$ is not statistically significant, which indicates that the average
pre-crisis risk levels of the treated and control groups are not statistically different
§ The estimated $\hat{\beta}_3$ (main coefficient of interest) indicates that the average Z-Score of
the protected banks decreases by 31% more than that of their unprotected counterparts
DiD results – with additional controls
§ Column (2) adds bank-level controls: size
and liquidity
– Results show that bank size and liquidity
do not materially affect bank risk
– The interpretation of the coefficients doesn’t
change
§ Column (3) adds bank-level controls and
country-level controls
– To address the possible concern that
country features might be important in
determining bank risk
– Some observations are lost because of
missing data
– None of the country-level coefficients is
significant
– The interpretation of the coefficients doesn’t
change
DiD Result - with fixed effects
§ Column (4) replaces country-level
controls with country fixed
effects
– They might capture other time-invariant
country features that are not captured
by macroeconomic controls (e.g.
market microstructure, quality of
regulation, enforcement of the law)
– The sign and statistical significance
of the coefficients remain unchanged,
although the magnitude changes
§ Column (5) adds bank fixed effects
– They are perfectly collinear with the
protected dummy, thus $\beta_2$ is no
longer identified
§ Column (6) adds year fixed effects
Falsification and robustness test
§ Check for treatment reversals
– If the treatment is subsequently reversed, we should expect the opposite of the effect
observed with treatment
– In this case, we could check bank risk after 2011, when the financial crisis is
presumed to be over
§ Check the non-parametric version of DiD:
$\ln(Z\_score_{i,t}) = \beta_0 + \beta_1 Crisis_t + \beta_2 Protected_{i,t} + \sum_t \delta_t (Crisis \times Protected)_{i,t} + \varepsilon_{i,t}$
– $\delta_t$ captures the difference in outcome in each year
– It’s useful when the treatment is not a sharp event
§ Restrict the sample to banks that have observations both before and
after the crisis
§ Use a placebo timing of treatment
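A year-by-year version of the DiD regression can be sketched as follows. This is a common variant rather than necessarily the specification on the slide: it is assumed here that the single post-shock dummy is replaced by year dummies and by interactions of each year (except an omitted base year) with the protected dummy. The data are simulated and every number is made up.

```python
import numpy as np

rng = np.random.default_rng(6)
years = np.arange(2005, 2011)
base_year = 2007                  # last pre-crisis year, used as the omitted category
n_banks = 2_000

protected = rng.random(n_banks) < 0.3
year_grid = np.tile(years, n_banks)
prot_grid = np.repeat(protected, len(years)).astype(float)

# Assumed data-generating process: common crisis drop of 0.4 and an extra
# drop of 0.3 for protected banks from 2008 onwards; no effect before 2008.
true_effect = np.where(year_grid >= 2008, -0.3, 0.0)
y = (3.0 + 0.1 * prot_grid - 0.4 * (year_grid >= 2008)
     + true_effect * prot_grid
     + rng.normal(scale=0.5, size=year_grid.size))

# Year dummies (all years, so no separate intercept), the protected dummy,
# and year-by-protected interactions for every year except the base year.
year_dummies = np.column_stack([(year_grid == t).astype(float) for t in years])
other_years = [int(t) for t in years if t != base_year]
interactions = np.column_stack([(year_grid == t) * prot_grid for t in other_years])
X = np.column_stack([year_dummies, prot_grid, interactions])

coef = np.linalg.lstsq(X, y, rcond=None)[0]
delta = dict(zip(other_years, coef[len(years) + 1:]))

for t, d in delta.items():
    print(t, round(float(d), 3))
```

Coefficients for the years before the shock act as a placebo check and should be close to zero if trends were parallel. Coefficients for later years trace out how the effect evolves, which helps when the treatment is not a sharp event.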