Name: ...................................................... Net ID: ...................
Statistics - ECON-UA18 Instructor: Alberto Bisin
New York University, Fall 2018
FINAL
Thursday, December 13
Reminders
• Write your name and your NYU Net ID (e.g. abc123 ) on top of this page.
• For the open questions, partial credit will be awarded for incomplete
answers, if you outline your reasoning and/or show your work.
• Write clearly and be concise. Don’t spend too much time on any given
question. Allocate your time wisely and in accordance with the points awarded
for each question.
1 Multiple Choice Questions
Circle the correct answer.
1. Consider a multiple regression where the dependent variable is the birth
weight of newborn babies (in ounces). We are interested in predicting birth
weight by a variety of observable characteristics, including the length of the
pregnancy in days (“gestation”), the number of times the mother has given
birth before (“parity”), the mother’s age in years (“age”), the mother’s height
in inches (“height”), her weight in pounds (“weight”), and an indicator
variable for whether she smokes (“smoke”=1) or not (“smoke”=0). Based on a
sample of N = 1236 observations, the summary table below shows the results of
a regression model where “birth weight” is the dependent variable:
A) How many degrees of freedom does this regression have?
B) All else equal, is a baby of a smoking mother predicted to have a
significantly lower birth weight than a baby of a non-smoking mother?
C) All else equal, do older mothers tend to give birth to babies with lower
birth weight?
D) What is the explanatory variable with the largest estimated slope
coefficient (in absolute value)? Does this imply that this coefficient will
also have the lowest associated p-value?
E) Is “Parity” a relevant predictor of birth weight?
2. A researcher wants to conduct a multiple regression analysis, where Y is the
dependent variable and {X1, X2, X3, X4} are the explanatory variables. First,
he estimates the following model: Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε,
and finds that R² = 0.85 and R²adj = 0.82.
Then, he decides to drop X4 from the model. After estimating this new
regression Y = β0 + β1X1 + β2X2 + β3X3 + ε, he finds that R² = 0.42 and
R²adj = 0.40. You can assume that all the usual conditions for linear regression
are satisfied. Which of the following statements is false?
A) It must be the case that b4, the estimate of the unknown slope parameter
β4 in the first regression, is significantly smaller than 0.
B) Dropping the variable X4 from the model seems like a bad idea, since X4
clearly helps predict the dependent variable Y .
C) About 42% of the variance of Y can be explained by the variance in X1,
X2 and X3.
D) About 15% of the variance of Y cannot be explained by the variance in
X1, X2, X3 and X4.
E) Since the adjusted R-squared decreases a lot after dropping X4, the first
model seems preferable to the second one.
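The relation between R² and adjusted R² used in this question can be sketched in Python as follows. This is a minimal illustration, assuming the standard definition R²adj = 1 - (1 - R²)(n - 1)/(n - k - 1), with k slope coefficients plus an intercept. The function name and the sample size n = 100 are hypothetical, since the question does not state n, so the printed values will not match the 0.82 and 0.40 reported above.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k slope coefficients (plus an intercept)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated coefficients")
    # Penalize R^2 for the degrees of freedom used up by the k regressors.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical sample size: the question does not give n.
n = 100
full_model = adjusted_r_squared(0.85, n, k=4)     # model with X1..X4
reduced_model = adjusted_r_squared(0.42, n, k=3)  # model without X4

print(round(full_model, 4), round(reduced_model, 4))
```

Under this assumption the adjusted value is always slightly below the raw R², and the gap between the two models remains large after the adjustment.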
3. Consider a linear regression,
Y = β0 + β1X + ε.
Suppose you want to test β1 = 1. How do you proceed? That is, answer the
following:
(a) what is the equation for your estimate b1;
(b) what is the statistic you use in the test and how is it distributed (call the
standard error SEb1, no equation is needed);
(c) under which condition do you not reject β1 = 1 (call the cutoff value t95)?
(d) where in the test are you implicitly using the assumption var(ε|x) = constant?
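The mechanics of such a test can be sketched in Python. This is a minimal illustration on simulated data, not a model answer: the function name, the simulated sample, and the approximate cutoff 1.97 (two-sided 5% level for a t distribution with 198 degrees of freedom) are all assumptions made here. The standard error formula used is the usual one that relies on var(ε|x) being constant.

```python
import math
import random


def ols_slope_test(x, y, beta1_null=1.0):
    """Return (b1, se_b1, t_stat) for H0: beta1 = beta1_null in a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # OLS slope estimate
    b0 = y_bar - b1 * x_bar             # OLS intercept estimate
    residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)  # residual variance, n - 2 df
    se_b1 = math.sqrt(s2 / sxx)         # assumes constant error variance
    t_stat = (b1 - beta1_null) / se_b1  # t-distributed with n - 2 df under H0
    return b1, se_b1, t_stat


# Simulated data with a true slope of 1 (hypothetical, for illustration only).
random.seed(0)
x = [random.uniform(0, 10) for _ in range(200)]
y = [2.0 + 1.0 * xi + random.gauss(0, 1) for xi in x]

b1, se_b1, t_stat = ols_slope_test(x, y)
t95 = 1.97  # approximate cutoff, assumed here
do_not_reject = abs(t_stat) <= t95
print(b1, se_b1, t_stat, do_not_reject)
```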
4. Consider the same regression
Y = β0 + β1X + ε.
(a) Is it good or bad for the variance of the observed explanatory variable
(the X on the right-hand side) to be large?
(b) Suppose you want to predict the value of Y given some value X1 of X. Is
your prediction more precise if X1 is far from or close to the sample mean X̄?
Explain the argument carefully, both formally and intuitively.
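The dependence of prediction precision on the distance from the mean can be illustrated with a short Python sketch. It assumes the textbook standard error for predicting a new observation at x0, s·sqrt(1 + 1/n + (x0 - x̄)²/Sxx), where Sxx is the sum of squared deviations of X from its mean. The function name and the four data points are hypothetical.

```python
import math


def prediction_se(x, y, x0):
    """Standard error of the prediction of a new Y at X = x0 in a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    s2 = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    # The last term grows with the distance of x0 from the sample mean
    # and shrinks when the spread of X (sxx) is large.
    return math.sqrt(s2 * (1.0 + 1.0 / n + (x0 - x_bar) ** 2 / sxx))


# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 4.0, 5.0]

se_near = prediction_se(x, y, 1.5)  # x0 equal to the sample mean of x
se_far = prediction_se(x, y, 6.0)   # x0 far from the sample mean
print(se_near, se_far)
```

In this sketch the standard error at the sample mean is the smallest attainable for the given data, and it increases as x0 moves away from the mean.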
5. Consider a linear regression
Y = β0 + β1X1 + β2X2 + ε.
Is it a problem if X1 is positively correlated with X2? And if it is negatively
correlated with X2?
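One way to quantify the effect of correlation between the two regressors is the variance inflation factor. The Python sketch below assumes the standard two-regressor result that the sampling variance of b1 is scaled by 1/(1 - r²), where r is the sample correlation between X1 and X2. The function name and the correlation values are hypothetical.

```python
def variance_inflation_factor(r: float) -> float:
    """Factor by which Var(b1) is inflated when corr(X1, X2) = r (two-regressor model)."""
    if not -1.0 < r < 1.0:
        raise ValueError("perfect collinearity: the coefficients are not identified")
    # Only r squared enters, so the sign of the correlation does not matter.
    return 1.0 / (1.0 - r * r)


vif_pos = variance_inflation_factor(0.9)    # strong positive correlation
vif_neg = variance_inflation_factor(-0.9)   # strong negative correlation
vif_none = variance_inflation_factor(0.0)   # uncorrelated regressors
print(vif_pos, vif_neg, vif_none)
```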