Date: 2022-11-14
ECMT6002/6702 Econometric Applications
This test is worth 50 marks in total. Please answer all seven (7) questions.
Note: When performing statistical tests, always state the null and alternative hypotheses,
the test statistic and its distribution under the null hypothesis, the rejection rule, level of
significance and the conclusion of the test.
Question 1. [5 marks]
(i) What are the distinctive features of a pure time series process compared to other data
structures? What properties of a pure time series require special attention when analysed
using regression methods? Briefly explain your answer. [1 mark]
(ii) What conditions must a time series process satisfy for it to be weakly dependent? What are
the consequences of estimating a time series model using OLS with data that are not weakly
dependent? [1 mark]
(iii) A real estate investor was conducting an analysis of the effect of house characteristics on the
sale price, based on the regression model:
log (price) = β0 + β1 log (lotsize) + β2 log (sqrmtr) + β3 bdrms + β4 bthrms + u (1)
where price is the sale price (measured in $1000), lotsize is land area (square metres), sqrmtr
is the floor area of the house (also measured in square metres), bdrms is the number of
bedrooms and bthrms is the number of bathrooms. The analyst obtained a dataset that included
the variable assess. This is the market value of the house as assessed by an experienced real
estate valuer prior to sale. Adding assess to the regression model led to an increase in the
adjusted R² (R̄²) from 0.430 to 0.916. Should the analyst include the variable assess in the
regression model for studying the effect of house characteristics on market prices? Explain
your reasoning. [1 mark]
(iv) An alternative specification of the regression model for house prices is:
price = β0 + β1 lotsize + β2 sqrmtr + β3 bdrms + β4 bthrms + u (2)
where price, lotsize, and sqrmtr are in level form. Outline a procedure for testing whether
model (1) or model (2) is a better specification. What are the limitations of the test? [2 marks]
Question 2. [6 marks]
The NSW Department of Health has hired you to determine if there is a relationship between a
person’s blood pressure and the number of cigarettes they smoke. You specify the following linear
model for blood pressure (bp):
bp = β0 + β1 age + β2 weight + β3 male + β4 cigs + u (3)
where age is the person’s age (in years), weight is the person’s weight (in kilograms), male is an
indicator variable equal to one if the person identifies as male and zero otherwise, and cigs is the
number of cigarettes smoked each year.
Instead of having the actual number of cigarettes a person smokes each year (cigs), you are given
a measure of cigarette consumption based on a survey which asks people to recall the number
of cigarettes they smoked in the previous year (c̃igs). You are concerned this measure contains
measurement error such that:
c̃igs = cigs + v
where v is measurement error. Given this, you estimate the following model:
bp = β0 + β1 age + β2 weight + β3 male + β4 c̃igs + e (4)
(i) Solve for the new error term e in model (4). [2 marks]
(ii) Suppose you make the classical errors-in-variables assumption (i.e. the observed variable is
correlated with the measurement error). What does this imply about the relationship between
the measurement error and cigarette consumption? [1 mark]
(iii) Do you think it is possible that this classical errors-in-variables assumption will hold in this
case? Why or why not? [1 mark]
(iv) What implications does the measurement error in the number of cigarettes smoked by people
have for your estimates of β2? What are the implications for your estimate of β1? [2 marks]
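The effect of classical measurement error on a slope estimate can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: a single regressor instead of the four in model (4), and hypothetical parameter values (true slope 2, unit variances) that do not come from the question. The helper name `ols_slope` is likewise illustrative.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

rng = random.Random(0)          # fixed seed so the run is reproducible
n = 20000
beta = 2.0                      # hypothetical true slope (not from the exam)

x_true = [rng.gauss(0, 1) for _ in range(n)]   # true regressor, variance 1
v = [rng.gauss(0, 1) for _ in range(n)]        # measurement error, variance 1
u = [rng.gauss(0, 1) for _ in range(n)]        # structural error

x_obs = [a + b for a, b in zip(x_true, v)]     # mismeasured regressor: x + v
y = [1.0 + beta * a + c for a, c in zip(x_true, u)]

slope_true = ols_slope(x_true, y)   # uses the correctly measured regressor
slope_obs = ols_slope(x_obs, y)     # uses the mismeasured regressor

# With Var(x) = Var(v) = 1 the attenuation factor is 1 / (1 + 1) = 0.5,
# so the second slope should be close to beta * 0.5.
print(f"slope with true regressor:     {slope_true:.3f}")
print(f"slope with observed regressor: {slope_obs:.3f}")
```

In this toy setting the slope on the mismeasured regressor is pulled towards zero. What happens to the coefficients on the other regressors in model (4) depends on how they are correlated with cigs, which the simulation does not cover.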
Question 3. [10 marks]
There is a large literature in economics devoted to estimating the causal effect of education on
earnings. The standard model used to study the determinants of log (wage) is:
log (wage) = β0 + β1 educ + β2 exper + β3 exper² + u (5)
where educ is years of education and exper is years of labour market experience.
(i) An alternative to OLS is the Instrumental Variable (IV) estimator. A popular instrument for
educ has been compulsory schooling laws (CSL) which are laws setting the minimum age
at which someone may leave school. What conditions must the instrumental variable CSL
satisfy in order for the IV estimator to be unbiased? Explain whether these conditions can be
tested. [2 marks]
(ii) Outline the procedure which involves the application of OLS in two steps that would recover
the IV estimates for model (5). What are the limitations of this two-step procedure? [2 marks]
(iii) Based on a sample of 971 observations from an Australian survey, the following estimates were
obtained:
Table 1: Estimation Results
Dependent variable: log (wage)
Variable       OLS         IV
educ           0.1367      0.2791
              (0.0102)    (0.1255)
exper          0.1111      0.2387
              (0.0305)    (0.1168)
exper²        -0.0029     -0.0053
              (0.0014)    (0.0026)
constant      -0.3437     -3.4030
              (0.2765)    (2.7024)
R²             0.348       0.0496
From the IV estimates construct the 95% confidence interval for β1. Does the IV confidence
interval include the OLS estimate for β1? [2 marks]
(iv) It would be useful to test whether educ is endogenous in model (5). Outline a test for the
endogeneity of educ in model (5). Briefly explain the main idea underlying this test. [2 marks]
(v) Which model is superior, the one based on the OLS estimator or the one based on the IV
estimator? Explain your reasoning. [2 marks]
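The interval arithmetic asked for in part (iii) can be sketched as follows. The sketch assumes a standard normal critical value, which seems reasonable with 971 observations but is an assumption rather than something the question states. The function name `conf_interval` is illustrative.

```python
from statistics import NormalDist

def conf_interval(estimate, std_err, level=0.95):
    """Two-sided confidence interval using a standard normal critical value."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for level = 0.95
    return estimate - z * std_err, estimate + z * std_err

# IV estimate and standard error for educ, taken from Table 1.
lower, upper = conf_interval(0.2791, 0.1255)

# OLS point estimate for educ, also from Table 1.
ols_educ = 0.1367
contains_ols = lower <= ols_educ <= upper

print(f"95% CI for beta1 (IV): [{lower:.4f}, {upper:.4f}]")
print(f"Interval contains the OLS estimate: {contains_ols}")
```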
Question 4. [6 marks]
You are a consultant hired by a private consulting firm to investigate the effect of the NSW
government building a new hospital on house prices in the suburb of Lemongrove. Discussions that
a new hospital may be built in Lemongrove began after 2006, and the new hospital was built and
began operating in 2008.
You have been given access to data on the prices of houses sold in Lemongrove in 2006 (the ‘before’
period) and another sample on the prices of houses sold in 2010 (the ‘after’ period). The hypothesis
you want to test is that the price of houses located near the site of the new hospital would rise above
the price of more distant houses.
The data for each year includes the dummy variable Near which is equal to one if the house is
located within 2 kilometres of the new hospital and zero otherwise. House prices for both years
of data were measured in 2010 prices. The variable price denotes the real house price (scaled by
$100,000). The following simple regression model was estimated using only the year 2010 sample
of data:
p̂rice = 10.131 + 2.688 Near (6)
        (0.309)  (0.788)
n = 96, R² = 0.199
While the following was estimated using only the 2006 sample of data:
p̂rice = 9.257 + 1.412 Near (7)
        (0.265) (0.671)
n = 105, R² = 0.106
(i) What is the interpretation of the coefficient on Near in model (6)? Is the coefficient
statistically significant at the 5% significance level against the two-sided alternative
hypothesis that it is non-zero? [2 marks]
(ii) Explain why we cannot infer from the estimates in (6) that the location of the hospital caused
the price of houses located nearby to increase. What evidence from model (7) supports this
conclusion? [2 marks]
(iii) Using the information from models (6) and (7), calculate the difference-in-difference estimate
of the impact of the new hospital on the price of nearby houses. [2 marks]
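The calculations behind parts (i) and (iii) can be sketched with the reported estimates. The standard error of the difference computed here assumes that the 2006 and 2010 samples are independent cross-sections. That assumption is made for illustration and is not stated in the question. The variable names are illustrative.

```python
import math

# Coefficients on Near and their standard errors from models (6) and (7).
near_2010, se_2010 = 2.688, 0.788
near_2006, se_2006 = 1.412, 0.671

# t statistic for H0: coefficient on Near = 0 in the 2010 regression.
t_2010 = near_2010 / se_2010

# Difference-in-differences estimate: the change in the near/far price gap.
did = near_2010 - near_2006

# ASSUMPTION: independent samples, so the variances of the two estimates add.
se_did = math.sqrt(se_2010 ** 2 + se_2006 ** 2)
t_did = did / se_did

print(f"t statistic on Near (2010): {t_2010:.3f}")
print(f"DiD estimate: {did:.3f} (price is scaled by $100,000)")
print(f"approximate s.e. of DiD: {se_did:.3f}, t = {t_did:.3f}")
```

Because price is scaled by $100,000, the estimate is read in units of $100,000. The same point estimate would come from the coefficient on an interaction between Near and a 2010 year dummy in a regression pooling both years.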
Question 5. [7 marks]
The following panel data model was specified to study the determinants of the crime rate across
cities in Australia:
log (crmrte) = β0 + β1 log (income) + β2 log (unemploy) + β3 log (youth) + ai + uit (8)
where log (crmrte) is the natural log of crimes committed per person, log (income) is the natural
log of average family income, log (unemploy) is the natural log of the unemployment rate and
log (youth) is the natural log of the fraction of the city population aged 15–21 years. The dataset
is based on a panel of i = 12 cities followed over t = 30 years. The variable ai is an unobserved
city-specific effect.
(i) Give two (2) examples of the kinds of variables captured by the term ai in model (8). [1 mark]
(ii) What are the advantages of using the Fixed Effects (FE) estimator for this model over the
Random Effects (RE) estimator? Are there any disadvantages to using the FE estimator rather
than the RE estimator? Explain your answer. [2 marks]
(iii) Outline the key idea of the Fixed Effects (FE) transformation underlying the FE estimator. [2
marks]
(iv) An alternative to the Fixed Effects (FE) and Random Effects (RE) estimators for model (8) is
the First-Difference (FD) estimator. Explain the conditions under which the FD estimator is
superior to the FE estimator. Are these conditions likely to be the case with this particular
application? [2 marks]
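The within (time-demeaning) transformation referred to in part (iii) can be illustrated on a toy panel. All numbers below are hypothetical and noise-free: two cities, three periods, a true slope of 2 and city effects of 5 and -3. The function names are illustrative.

```python
def within_transform(values, unit_ids):
    """Subtract each unit's own time average from its observations."""
    totals, counts = {}, {}
    for value, unit in zip(values, unit_ids):
        totals[unit] = totals.get(unit, 0.0) + value
        counts[unit] = counts.get(unit, 0) + 1
    means = {unit: totals[unit] / counts[unit] for unit in totals}
    return [value - means[unit] for value, unit in zip(values, unit_ids)]

def slope(x, y):
    """Simple OLS slope of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

# Hypothetical panel: y_it = a_i + 2 * x_it, with a_A = 5 and a_B = -3.
city = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 9.0]
effect = {"A": 5.0, "B": -3.0}
y = [effect[c] + 2.0 * xi for c, xi in zip(city, x)]

pooled_slope = slope(x, y)   # ignores a_i, which is correlated with x here

x_dm = within_transform(x, city)
y_dm = within_transform(y, city)
fe_slope = slope(x_dm, y_dm)  # a_i has been differenced out

print(f"pooled OLS slope:    {pooled_slope:.4f}")
print(f"within (FE) slope:   {fe_slope:.4f}")
```

Demeaning removes anything that is constant within a city, so the unobserved effect drops out. Any time-invariant regressor would be removed in the same way.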
Question 6. [9 marks]
In an influential study, Mroz (1987) examined the determinants of married women’s decision to
participate in the labour market. The dependent variable in this analysis was InLF, which is an
indicator variable equal to 1 if the woman was in the labour force (and 0 otherwise), and the
explanatory variables were educ (years of education), exper (years of labour market experience) and
exper², age (in years), kids05 (number of kids aged 5 years and younger), kids6+ (number of kids
aged 6–18 years) and otherinc (non-labour market income measured in $1000). The table below
presents coefficient estimates (and standard errors) for several Probit and Logit models.
Table 2: Estimation Results for Labour Force Participation Models
Variables                     Probit (1)   Probit (2)   Logit (3)
educ                           0.088        0.072        0.038
                              (0.017)      (0.016)      (0.007)
exper                          0.082        0.082        0.035
                              (0.013)      (0.012)      (0.006)
exper²                         0.0013      -0.0012       0.0006
                              (0.001)      (0.0004)     (0.0004)
age                           -0.035       -0.019       -0.015
                              (0.006)      (0.004)      (0.003)
kids05                        -0.577        –           -0.249
                              (0.082)       –           (0.0354)
kids6+                         0.024        –            0.0104
                              (0.030)       –           (0.013)
otherinc                      -0.084       -0.008       -0.0363
                              (0.003)      (0.003)      (0.001)
constant                       0.170        0.426        0.073
                              (0.344)      (0.278)      (0.149)
Observations (n)               753          753          753
Log-Likelihood Value (LLF)    -397.52      -412.11      -398.10
Note: The numbers in parentheses (·) represent standard errors.
(i) What are the limitations of using the Linear Probability Model (LPM) to analyse binary
dependent variables such as InLF? [1 mark]
(ii) The Probit and Logit models were estimated using the Maximum Likelihood Estimator (MLE).
What are the advantages of the Probit model MLE over the LPM? Outline the important
properties of the MLE for the Probit Model. [2 marks]
(iii) Based on the estimation results for Probit model (1), what is the approximate partial effect of
an increase in experience on the probability of a woman being in the labour force? [2 marks]
(iv) Test the null hypothesis that the number of kids has no effect on married women’s labour
force participation using the Likelihood Ratio Test and a 1% significance level. The Probit (2)
model presents the estimation results for the restricted model, when the kids05 and kids6+
variables are excluded. [2 marks]
(v) The results in Table 2 also contain the coefficient estimates for the Logit model. Are the
magnitudes of the coefficient estimates from the Probit and Logit models directly comparable? If
not, what rule of thumb can be used for comparing the magnitude of the two sets of coefficient
estimates across models? [2 marks]
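The arithmetic of the Likelihood Ratio Test in part (iv) can be sketched as below. Two variables (kids05 and kids6+) are excluded, so the statistic is compared with a chi-square distribution with 2 degrees of freedom. For exactly 2 degrees of freedom the critical value has the closed form -2 ln(α), which is what the hypothetical helper `chi2_df2_critical` uses; it is not valid for other degrees of freedom.

```python
import math

def lr_statistic(llf_unrestricted, llf_restricted):
    """Likelihood ratio statistic: twice the gap in log-likelihoods."""
    return 2.0 * (llf_unrestricted - llf_restricted)

def chi2_df2_critical(alpha):
    """Upper-alpha critical value of a chi-square with exactly 2 df.

    For 2 df the survival function is exp(-x / 2), so x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)

# Log-likelihood values from Table 2: Probit (1) unrestricted, Probit (2) restricted.
lr = lr_statistic(-397.52, -412.11)
critical_1pct = chi2_df2_critical(0.01)
reject_null = lr > critical_1pct

print(f"LR statistic: {lr:.2f}")
print(f"1% critical value, chi-square(2): {critical_1pct:.2f}")
print(f"Reject H0 (kids05 and kids6+ jointly irrelevant): {reject_null}")
```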
Question 7. [7 marks]
An econometric study examined an individual’s choice of occupation, where occupations were
divided into 3 types: blue collar, white collar and professional. The determinants of occupational
choice examined were educ (years of education) and an indicator variable male (=1 if the individual
is male and 0 otherwise). The econometrician used the Multinomial Logit (MNL) model
specification, with occupation j defined as 1 (if blue collar), 2 (if white collar) and 3 (if
professional), and the following functional form:
P (occupation_i = j) = exp (βj xi) / [1 + ∑_{r=1}^{3} exp (βr xi)], j = 1, 2, 3 (9)
Using the MNL model to analyse the dependent variable occupationi, the estimates produced by
STATA are presented in the table below:
Table 3: Multinomial Logit Estimates
                              White collar             Professional
Variable                      β̂2        Std. Err.     β̂3         Std. Err.
educ                           0.4288    (0.0837)       0.8149    (0.0621)
male                           4.9961    (1.0610)       5.2086    (2.3989)
intercept                     -6.6539    (0.7392)     -15.0779    (1.3676)
R̃²                            0.0781
Observations (n)               1344
Log-Likelihood Value (LLF)    -770.28
Note: The numbers in parentheses (·) represent standard errors.
(i) What are the limitations of the Multinomial Logit (MNL) model for analysing occupational
choice? Briefly explain your answer. [1 mark]
(ii) What is the interpretation of the coefficient on educ in the estimated equation for the white
collar occupations? That is, what does the value 0.4288 represent? What, if anything, do we
learn about the marginal effect of education on the probability of an individual choosing a
white collar occupation? Briefly explain your answer. [2 marks]
(iii) To assess the importance of education in determining occupational choices, the MNL model
was re-estimated with education excluded. The Log-Likelihood Value (LLF) for this restricted
model was −774.78. Carry out a Likelihood Ratio Test of the null hypothesis that education
has no effect on occupational choices using a 5% significance level. [2 marks]
(iv) Social scientists have ranked jobs by their ‘occupational status’ – with professional occupations
ranked higher than white collar occupations, which in turn are ranked higher than blue collar
occupations (i.e. blue collar < white collar < professional). Therefore an alternative to the
Multinomial Logit model for analysing occupational choice is the Ordered Logit model. What,
if any, are the advantages of the Ordered Logit specification over the MNL in this situation?
[2 marks]
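How the Table 3 coefficients map into relative odds and predicted probabilities can be sketched as follows. The sketch assumes the usual normalisation in which blue collar (j = 1) is the base category with β1 = 0. The layout of Table 3 suggests this, but equation (9) does not state it. The individual evaluated (12 or 13 years of education, male) is hypothetical, and the names `COEFS` and `mnl_probabilities` are illustrative.

```python
import math

# Estimates from Table 3. ASSUMPTION: blue collar is the base category
# with all coefficients normalised to zero.
COEFS = {
    "white_collar": {"intercept": -6.6539, "educ": 0.4288, "male": 4.9961},
    "professional": {"intercept": -15.0779, "educ": 0.8149, "male": 5.2086},
}

def mnl_probabilities(educ, male):
    """Predicted choice probabilities under the base-category normalisation."""
    indices = {"blue_collar": 0.0}   # base category index is zero
    for occupation, b in COEFS.items():
        indices[occupation] = b["intercept"] + b["educ"] * educ + b["male"] * male
    denominator = sum(math.exp(value) for value in indices.values())
    return {occ: math.exp(value) / denominator for occ, value in indices.items()}

# exp(coefficient) is the multiplicative change in the odds of white collar
# relative to blue collar for one extra year of education.
rrr_educ_white = math.exp(COEFS["white_collar"]["educ"])

# Hypothetical individual: male, 12 versus 13 years of education.
probs_12 = mnl_probabilities(educ=12, male=1)
probs_13 = mnl_probabilities(educ=13, male=1)
change_white = probs_13["white_collar"] - probs_12["white_collar"]

print(f"relative odds factor for educ (white vs blue collar): {rrr_educ_white:.4f}")
print(f"P(white collar) at educ=12: {probs_12['white_collar']:.4f}")
print(f"change in P(white collar) from educ 12 to 13: {change_white:.4f}")
```

The change in a predicted probability depends on the coefficients of every category and on where it is evaluated, so it need not share the sign or size of the coefficient 0.4288.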