ECON 6040
Problem Set 2
1. A researcher is concerned with estimating the effect of the level of unemployment
insurance benefits on the length of unemployment spells. She finds out that recently
US state Blue changed its unemployment insurance programme so that workers with
earnings above a certain threshold (group H) will receive higher benefits if they become
unemployed, whereas for workers below the earnings threshold (group L) unemployment
benefits remain unchanged. The researcher collects information on average unemployment
duration (in weeks) in State Blue and neighbouring State Red for both groups of
workers (H and L), for the year before the policy change and the year after. [20 points]
               State Blue          State Red
             Before   After     Before   After
Group H       15.8     16.9      15.2     15.4
Group L       17.1     17.6      16.8     17.1
(a) Using the data provided, construct two alternative difference-in-difference
estimates of the effect of unemployment benefits on unemployment duration. Discuss
the key assumptions underlying the validity of your estimates in each case. [10]
(b) Using the data provided, construct a difference-in-difference-in-difference estimate
of the effect of unemployment benefits on unemployment duration. Discuss the
assumption underlying the validity of this estimate. [10]
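As a minimal sketch of the arithmetic behind parts (a) and (b), the snippet below takes the after-minus-before change in each cell of the table and combines those changes. The dictionary layout and the names `change`, `did_within_blue`, `did_group_h` and `ddd` are illustrative assumptions, not part of the problem set. The code only does the subtraction; whether each number can be read as a causal effect depends on the assumptions the question asks you to discuss.

```python
# Average unemployment duration in weeks, copied from the table above.
# Keys are (state, group); values are (before, after).
duration = {
    ("Blue", "H"): (15.8, 16.9),
    ("Blue", "L"): (17.1, 17.6),
    ("Red", "H"): (15.2, 15.4),
    ("Red", "L"): (16.8, 17.1),
}


def change(state, group):
    """After-minus-before change in average duration for one cell."""
    before, after = duration[(state, group)]
    return after - before


# DiD 1: treated group H vs. untreated group L, both within State Blue.
did_within_blue = change("Blue", "H") - change("Blue", "L")

# DiD 2: group H in State Blue vs. group H in State Red.
did_group_h = change("Blue", "H") - change("Red", "H")

# DDD: the H-minus-L gap in changes in Blue, net of the same gap in Red.
ddd = did_within_blue - (change("Red", "H") - change("Red", "L"))

# Rounded to two decimals to hide floating-point noise.
print(round(did_within_blue, 2), round(did_group_h, 2), round(ddd, 2))
```

The two difference-in-difference numbers use different control groups (group L in State Blue, or group H in State Red), which is why they need not coincide.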
2. Use the data in “WAGEPAN” for this exercise. It is a panel dataset of 545
men who worked every year from 1980 to 1987. Consider the wage equation:
$\log(wage_{it}) = \beta_0 + \beta_1 educ_i + \beta_2 black_i + \beta_3 hisp_i + \beta_4 exper_{it} + \beta_5 exper \ldots$ (1)
The variables are described in the dataset. Notice that education does not change
over time. [30 points]
(a) Estimate equation (1) by pooled OLS. Are the usual OLS standard errors
reliable, even if $c_i$ is uncorrelated with all explanatory variables? Explain. Compute
appropriate standard errors. [10]
(b) Estimate equation (1) by Random Effects. Compare your estimates with the
pooled OLS estimates in part (a). [10]
(c) Estimate equation (1) by Fixed Effects. Compare your estimates with the RE
estimates in part (b). [10]
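The note that education does not change over time matters for part (c). The sketch below applies the within (fixed effects) transformation to a made-up toy panel; it is not the WAGEPAN data, and the tuple layout and the helper name `demean_within` are assumptions for illustration only. It shows that a variable that is constant within a person is demeaned to zero, and that with a single time-varying regressor the fixed effects slope is OLS on the demeaned data.

```python
from collections import defaultdict

# Made-up toy panel (NOT the WAGEPAN data): (person id, educ, exper, lwage).
panel = [
    (1, 12, 1, 1.0), (1, 12, 2, 1.2), (1, 12, 3, 1.4),
    (2, 16, 2, 1.5), (2, 16, 3, 1.6), (2, 16, 4, 1.7),
]


def demean_within(rows, column):
    """Subtract each person's own time average from one column."""
    by_person = defaultdict(list)
    for row in rows:
        by_person[row[0]].append(row[column])
    means = {pid: sum(vals) / len(vals) for pid, vals in by_person.items()}
    return [row[column] - means[row[0]] for row in rows]


educ_dm = demean_within(panel, 1)
exper_dm = demean_within(panel, 2)
lwage_dm = demean_within(panel, 3)

# educ is constant within person, so the within transformation wipes it out.
print(educ_dm)

# With a single time-varying regressor, the FE slope is OLS on demeaned data.
beta_exper = sum(x * y for x, y in zip(exper_dm, lwage_dm)) / sum(
    x * x for x in exper_dm
)
print(round(beta_exper, 3))
```

In a statistics package the same step is usually done by a built-in fixed effects routine; the hand-rolled version here is only meant to make the transformation visible.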
3. A researcher wants to estimate the returns to medical care in terms of the mortality
rates of newborns. Her identification strategy is to exploit a salient birth weight cutoff
below which newborns may receive increased consideration for additional treatments,
using a Regression Discontinuity Design (RDD). More specifically, she focuses on the
“very low birth weight” classification at 1,500 g, below which hospitals, either through
hospital protocols or as rules of thumb, assign newborns to receive a bundle of
mortality-reducing health treatments. She collects data on birth weight, one-year mortality and
various newborn characteristics (mother’s age and education, father’s age and education,
the newborn’s sex, gestational age, race) for a large sample of newborns. [20 points]
(a) Is this a sharp or fuzzy RDD? What is the difference between the two
types of regression discontinuity design? [7]
(b) Explain how the researcher could implement RDD and what her estimate would
measure. Illustrate your approach in a regression model. [8]
(c) Discuss what diagnostic and robustness tests could be implemented to test the
validity of the RDD procedure. [5]
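One common way to set up the regression in part (b) is a local linear specification around the cutoff, for example mortality_i = a + tau * below_i + b * dist_i + g * below_i * dist_i + e_i, estimated only on births close to 1,500 g. The sketch below builds those regressors for a single birth weight. The function name `rdd_regressors`, the dictionary keys and the 100 g bandwidth are assumptions for illustration, not a recommended choice. If treatment does not switch fully at the cutoff, `below` would act as an instrument for treatment received rather than as the treatment itself.

```python
CUTOFF_GRAMS = 1500  # "very low birth weight" threshold from the text


def rdd_regressors(birth_weight, bandwidth=100):
    """Return regressors for a local linear RDD, or None outside the window.

    bandwidth (in grams) is an illustrative assumption, not a recommendation.
    """
    dist = birth_weight - CUTOFF_GRAMS  # running variable centred at the cutoff
    if abs(dist) > bandwidth:
        return None
    below = 1 if birth_weight < CUTOFF_GRAMS else 0  # eligibility indicator
    return {"below": below, "dist": dist, "below_x_dist": below * dist}


for grams in (1350, 1480, 1500, 1530):
    print(grams, rdd_regressors(grams))
```

Re-running the construction with several bandwidths is one simple way to probe the robustness asked about in part (c).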
4. Consider the paper by Di Tella and Schargrodsky (2004), “Do police reduce
crime? Estimates using the allocation of police forces after a terrorist attack,” in the
reading list. [30 points]
(a) What is the research question that the authors analyse in this paper? What are
the challenges associated with addressing this question empirically? [6]
(b) Explain how the authors use a terrorist attack on the Jewish centre in Buenos
Aires in 1994 as a source of exogenous variation in the geographical allocation of
police forces. [4]
(c) Using the data in “POLICE”, calculate the average monthly number of car thefts
in blocks where there is a protected Jewish institution in the period before the
attack. Do the same for the period after. Now do the same (both before and
after) for blocks where there is no protected Jewish institution. What do you
learn about the impact of police presence on crime? [6]
(d) Estimate the following specification:
$CarTheft_{it} = \beta_0 SameBlockPolice_{it} + M_t + F_i + u_{it}$, (2)
where $CarTheft_{it}$ is the number of car thefts in block $i$ for month $t$;
$SameBlockPolice_{it}$ is a dummy variable (that you can create) that equals 1 for the
months after the terrorist attack (August, September, October, November, and
December) if there is a protected institution in the block, and 0 otherwise; $M_t$ is a
month fixed effect; $F_i$ is a block fixed effect. [7]
(e) Add to the specification in part (d) a dummy variable that takes the value 1 after
the terrorist attack (August, September, October, November, and December) if
the block is one block away from the nearest protected Jewish institution, and
0 otherwise (you can create this variable). Estimate the augmented specification
and comment on the estimates you obtain. What is the interpretation of $\beta_0$ in
this specification [hint: think about whether the control group changes]? [7]
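For parts (c) and (d), the data steps can be sketched as follows on made-up toy rows; these are not the POLICE data. The tuple layout, the names `same_block_police` and `mean_thefts`, and the choice of April to June as the pre-attack window are all assumptions for illustration. The post-attack months (August to December) are the ones listed in part (d).

```python
# Made-up toy rows (NOT the POLICE data): (block, institution, month, car_thefts).
# Field names and the choice of pre-attack months are assumptions.
rows = [
    (1, 1, 5, 2), (1, 1, 6, 2), (1, 1, 8, 1), (1, 1, 9, 1),
    (2, 0, 5, 2), (2, 0, 6, 4), (2, 0, 8, 3), (2, 0, 9, 3),
]
POST_MONTHS = {8, 9, 10, 11, 12}  # August to December, as in part (d)
PRE_MONTHS = {4, 5, 6}            # assumed pre-attack window


def same_block_police(institution, month):
    """Dummy from part (d): 1 for protected blocks in post-attack months."""
    return 1 if institution == 1 and month in POST_MONTHS else 0


def mean_thefts(institution, months):
    """Average monthly car thefts for one group of blocks in given months."""
    vals = [t for _, inst, m, t in rows if inst == institution and m in months]
    return sum(vals) / len(vals)


# 2x2 table of means asked for in part (c).
means = {
    (inst, label): mean_thefts(inst, months)
    for inst in (1, 0)
    for label, months in (("before", PRE_MONTHS), ("after", POST_MONTHS))
}
# Difference-in-differences of those four means.
did = (means[(1, "after")] - means[(1, "before")]) - (
    means[(0, "after")] - means[(0, "before")]
)
print(means, did)
```

The dummy for part (e) can be built the same way, replacing the protected-institution indicator with an indicator for blocks one block away from the nearest protected institution.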