ECON7310: Elements of Econometrics
Research Project 2
Ruby Nguyen
October 7, 2023
Instruction
Please answer all questions following a format similar to the answers to your tutorial questions.
When you use R to conduct empirical analysis, you should show your R script(s) and outputs
(e.g., screenshots of commands, tables, and figures). You will lose 2 points whenever you fail
to provide R commands and outputs. Please clearly label all your answers and keep your
response brief and concise. You should upload your research report (in PDF or Word format)
via the Turnitin submission link (in the “Research Project 2” folder under “Assessment”) by
11:59 AM on the due date, October 23, 2023. You are allowed to work on this assignment
in groups; however, you must answer all the questions in your own words and submit your
report separately. The marking system will check the similarity, and UQ’s student integrity
and misconduct policies on plagiarism will apply.
Panel Data and Differences-in-Differences
Di Tella and Schargrodsky (2004) investigated the impact of police presence on car theft.
Rational crime models suggest that a visible police force can deter crime, but measuring
this effect is challenging due to the simultaneous causality of crime and police presence. To
address this, the authors used the police response to a terrorist attack in July 1994 in Buenos
Aires, Argentina as a source of exogenous variation. Following the attack, the government provided
police protection to all Jewish and Muslim buildings across the country, which the authors
hypothesized would also deter other street crimes, such as car theft. They collected data on
car thefts in Buenos Aires neighborhoods from April to December 1994, covering 876 city
blocks. They believed this treatment was exogenous to auto theft, with the deterrence effect
strongest in blocks with Jewish institutions. Their sample included 37 blocks with Jewish
institutions (the treatment group) and 839 blocks without an institution (the control group).
For Questions 1 and 2 below, please use the DS2004.csv dataset and exclude observations
for July. Data description and variable definitions can be found in the document
DS2004 description.pdf.
1. (20 points) To examine the deterrence effect of police presence, we can use the
differences-in-differences (DID) approach and estimate the following regression model:
$$\text{thefts}_{it} = \beta_0 + \beta_1 X_{it} + \beta_2\,\text{sameblock}_i + \beta_3\,\text{post-attack}_t + u_{it} \qquad (1)$$
where the subscripts $i$ and $t$ label city blocks and months, respectively; $\text{post-attack}_t$
is a binary variable indicating months in the data after the terrorist attack (i.e.,
$\text{post-attack}_t = 1$ if month $\geq 8$, and 0 otherwise); and $X_{it} = \text{sameblock}_i \times \text{post-attack}_t$.
(a) (9 points) Estimate (1) using OLS and compute the cluster-robust standard error.
Report the DID estimator $\hat{\beta}_1^{DID}$. Is the coefficient statistically significant? Does it
have the expected sign?
(b) (5 points) Compute the change in the treatment group and the change in the
control group. Is $\hat{\beta}_1^{DID}$ primarily due to a change in the treatment group or the
control group? Is this what you expected?
(c) (6 points) Do the results change if you just use heteroskedasticity-robust standard
errors? Explain your answer by comparing the two sets of results. Is it better to
use the cluster-robust standard errors?
2. (20 points) We can also add entity (block) fixed effects and time (month) fixed effects
and estimate the following fixed effects regression model:
$$\text{thefts}_{it} = \beta_1 X_{it} + \alpha_i + \lambda_t + u_{it} \qquad (2)$$
(a) (8 points) Estimate $\beta_1$ in (2) and compute the cluster-robust standard error. Is
the coefficient statistically significant?
(b) (8 points) Suppose you allow for entity (block) fixed effects only; that is, you
estimate the following fixed effects model:
$$\text{thefts}_{it} = \beta_1 X_{it} + \alpha_i + u_{it} \qquad (3)$$
How do your answers to (a) change? What are the omitted variables that $\alpha_i$ is
likely to control for?
(c) (4 points) What are the omitted variables that $\lambda_t$ in (2) is likely to control for?
Are the time (month) fixed effects in (2) statistically significant?
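As an illustration of the DID logic in model (1), the sketch below computes the difference-in-differences as a difference of cell means. When the only regressors are the two dummies and their interaction, the OLS coefficient on the interaction term equals the change in the treatment group minus the change in the control group. The records are invented numbers, not values from DS2004.csv, and the names `records` and `cell_mean` are assumptions made for this example. The assignment itself asks for R, so this Python fragment only shows the arithmetic.

```python
from statistics import mean

# Hypothetical block-month records: (sameblock, post_attack, thefts).
# These numbers are invented for illustration; they are NOT from DS2004.csv.
records = [
    (1, 0, 0.10), (1, 0, 0.08), (1, 1, 0.02), (1, 1, 0.04),
    (0, 0, 0.09), (0, 0, 0.07), (0, 1, 0.10), (0, 1, 0.08),
]

def cell_mean(treated, post):
    """Average outcome for one (group, period) cell."""
    return mean(y for g, p, y in records if g == treated and p == post)

# Before-after change within each group.
change_treated = cell_mean(1, 1) - cell_mean(1, 0)
change_control = cell_mean(0, 1) - cell_mean(0, 0)

# DID: the treated group's change net of the control group's change.
did = change_treated - change_control

print(f"change in treated blocks: {change_treated:+.3f}")
print(f"change in control blocks: {change_control:+.3f}")
print(f"difference-in-differences: {did:+.3f}")
```

Splitting the estimate into the two group changes, as done here, is one way to see which group drives it.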
Binary Choice Models
3. (30 points) You want to analyze the purchase of private insurance using the HRS05.csv
dataset. This dataset was derived from wave 5 (2002) of the Health and Retirement
Study (HRS).
ins is a binary variable that equals 1 if the person purchased private insurance and
0 if they did not. Explanatory variables include age, hstatusg (=1 if health status
is good, very good, or excellent, =0 otherwise), hisp (=1 if Hispanic), married (=1
if married), retire (=1 if retired), educyear (years of education), and hhincome
(household income).
(a) (3 points) Estimate the probability of purchasing private insurance for retirees and
those who are not yet retired.
(b) (3 points) Regress ins on age, $\text{age}^2$, hstatusg, hisp, married, retire, educyear, and
hhincome using a linear probability model (LPM) and report the
regression results.
i. (3 points) Do you think that heteroskedasticity-robust standard errors should
be used? Explain your answer.
ii. (2 points) Test the hypothesis that the probability of purchasing private
insurance does not depend on the age of individuals.
iii. (6 points) David is a 65-year-old retired, married, non-Hispanic individual
with a college degree (educyear = 16), good health status, and an average
household income in the sample. Predict the probability that David purchased
private insurance. Calculate the change in the predicted probability if his
household income were one standard deviation below the sample average. Would
the change in probability be the same if his household income were one standard
deviation above average? Explain why.
(c) (13 points) Repeat (b) using a probit model.
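To show how predicted probabilities respond to equal-sized moves in a regressor under the two models, the sketch below compares an LPM and a probit using made-up numbers. The baseline index `z0`, the quantity `shift` (probit income coefficient times one standard deviation of income), and the LPM values `lpm_p0` and `lpm_step` are all hypothetical; none of them are estimates from HRS05.csv.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Hypothetical values, chosen only for illustration (not HRS05 estimates).
z0 = 0.5          # probit index at the baseline covariate values
shift = 0.4       # probit income coefficient times one SD of income
lpm_p0 = 0.69     # LPM predicted probability at the baseline
lpm_step = 0.12   # LPM income coefficient times one SD of income

# LPM: the prediction is linear in income, so a move of -1 SD and a move
# of +1 SD change the probability by equal and opposite amounts.
lpm_down = (lpm_p0 - lpm_step) - lpm_p0
lpm_up = (lpm_p0 + lpm_step) - lpm_p0

# Probit: the index passes through a nonlinear CDF, so the two changes
# generally differ in size unless the baseline index is exactly zero.
probit_down = std_normal_cdf(z0 - shift) - std_normal_cdf(z0)
probit_up = std_normal_cdf(z0 + shift) - std_normal_cdf(z0)

print(f"LPM:    -1 SD {lpm_down:+.4f}, +1 SD {lpm_up:+.4f}")
print(f"probit: -1 SD {probit_down:+.4f}, +1 SD {probit_up:+.4f}")
```

With real estimates, the baseline index would be built from the fitted coefficients and the chosen covariate values rather than set by hand.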
Instrumental Variables Regression
4. (30 points) Use the following regression model and the cigbwght.csv dataset to
estimate the effects of several variables, including cigarette smoking, on the weight of
newborns:
$$\text{logbwght} = \beta_0 + \beta_1\,\text{male} + \beta_2\,\text{parity} + \beta_3\,\text{logfaminc} + \beta_4\,\text{smoke} + u \qquad (4)$$
where logbwght is the logarithm of birth weight; male is a dummy variable that is
equal to 1 if the child is male; parity is the birth order of this child; logfaminc is the
logarithm of family income; and smoke is a dummy variable that is equal to 1 if the
mother smoked during pregnancy.
(a) (7 points) Estimate regression equation (4) using OLS and report regression results.
Interpret the estimated coefficient on smoke and report a 95% confidence interval
for the effect of smoking. Predict the change in birth weight when family income
falls by 10%.
(b) (5 points) What could be the problem in using OLS to estimate the effect of
smoking in model (4)? Explain your answer. Is this a threat to the internal or
external validity of your regression analysis?
(c) (10 points) Suppose you are concerned that smoke may be endogenous. You
have data on the average price of cigarettes in each woman’s state of residence
(cigprice). You now run the regression in (4) using cigprice as an instrumental
variable (IV) for smoke. Use this IV regression to answer the questions in (a).
(d) (8 points) Is cigprice a weak instrument? Is it possible to test the exogeneity of
this instrument? Comment on the validity of this instrument.
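The mechanics of IV estimation and of a first-stage strength check can be sketched on simulated data. The example below is a simplified setting with one continuous endogenous regressor, one instrument, and no other controls, so it is not model (4) itself. All variable names and parameter values are assumptions for illustration, and nothing here comes from cigbwght.csv. The first-stage F-statistic is the homoskedasticity-only version; a common rule of thumb treats a value below 10 as a sign of a weak instrument.

```python
import random

random.seed(0)
n = 5000

# Simulated data (hypothetical, not cigbwght.csv). The unobserved confounder c
# moves both x and y, so OLS of y on x is biased. The instrument z shifts x
# but affects y only through x.
true_beta = 2.0
z = [random.gauss(0, 1) for _ in range(n)]
c = [random.gauss(0, 1) for _ in range(n)]
x = [zi + ci + random.gauss(0, 1) for zi, ci in zip(z, c)]
y = [true_beta * xi - 3.0 * ci + random.gauss(0, 1) for xi, ci in zip(x, c)]

def avg(a):
    """Sample mean."""
    return sum(a) / len(a)

def cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    a_bar, b_bar = avg(a), avg(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)

beta_ols = cov(x, y) / cov(x, x)  # biased here because x is correlated with c
beta_iv = cov(z, y) / cov(z, x)   # IV estimator with a single instrument

# First stage: regress x on z. With one instrument, the F-statistic is the
# squared t-statistic on z (homoskedasticity-only standard error).
pi_hat = cov(z, x) / cov(z, z)
x_bar, z_bar = avg(x), avg(z)
resid = [(xi - x_bar) - pi_hat * (zi - z_bar) for xi, zi in zip(x, z)]
s2 = sum(r * r for r in resid) / (n - 2)
se_pi = (s2 / (cov(z, z) * (n - 1))) ** 0.5
first_stage_f = (pi_hat / se_pi) ** 2

print(f"OLS estimate: {beta_ols:.3f}")
print(f"IV estimate:  {beta_iv:.3f} (true value {true_beta})")
print(f"first-stage F: {first_stage_f:.1f}")
```

In this simulation the instrument is strong and valid by construction; with real data, relevance can be checked through the first stage, while exogeneity has to be argued.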