A13073W1

DEGREE OF MASTER OF SCIENCE IN FINANCIAL ECONOMICS

FINANCIAL ECONOMETRICS

TRINITY TERM 2020

Tuesday, 21 April 2020

Time allowed is FOUR HOURS

You MUST upload your submission within 4 hours of accessing the paper

Candidates should answer ALL questions in part A.

Candidates should answer THREE of the questions in part B.

Candidates should answer ALL questions in part C.

Examiners will place weight 2% on each question in Part A (40% total),

10% on each question in part B and 30% on Part C.

Please use the solutions template provided if possible.

Materials: Candidates may use any calculator or any other software when preparing their answers.

Do not turn over until told that you may do so.

PART A: MULTIPLE CHOICE

Answer ALL questions in this section.

The section contributes 40% towards the final mark. Each question is worth 2% of the exam mark.

Single Answer Questions

Select a single answer for each question. Correct answers are awarded 2%, and incorrect answers reduce the score by 0.5% so that random guessing does not improve your expected mark.

1. If you flip three fair coins, what is the probability that all three show the same side?

(a) 1/2

(b) 1/8

(c) 1/16

(d) 1/4

2. When evaluating a series of forecasts using the Mincer-Zarnowitz regression $y_{t+h} - \hat{y}_{t+h|t} = \alpha + \beta \hat{y}_{t+h|t} + \eta_{t+h}$, what values of α and β should occur when the forecasting model is correctly specified?

(a) β = 0, no restriction on α

(b) α = 0,β = 0

(c) α = 0,β = 1

(d) β = 1, no restriction on α

3. Daily returns on a portfolio are i.i.d. with a $N(0.05\%, (1\%)^2)$ distribution. What is the 1-week 5% Value-at-Risk of a portfolio with £1,000,000,000 under management (to the nearest 10,000)?

(a) 2,500,000

(b) 34,280,000

(c) 41,330,000

(d) 4,130,000

4. How are Diebold-Mariano tests used to compare the forecasts from two VaR models?

(a) The two series of HITs from the models are used as losses.

(b) HITs are regressed on lagged HITs and the two VaR forecasts.

(c) Diebold-Mariano tests cannot be used to compare two VaR models.

(d) The two series of HITs are transformed using the tick-loss function, and then the difference is tested.

5. How are simulated values used in Historical Simulation VaR?

(a) They are not. The quantile depends only on α and the sample size n.

(b) They rely on simulated models produced in the past.

(c) Multi-step VaR uses values simulated from historical observations.

(d) Returns are first filtered by an ARCH-type model and then simulated forward to estimate

the VaR.

6. What restrictions are required on $\Phi_1$ for a VAR(1)

$$y_t = \Phi_1 y_{t-1} + \varepsilon_t$$

to be stationary?

(a) All eigenvalues must be positive.

(b) All eigenvalues must be less than 1 in modulus.

(c) VAR(1) models are always covariance stationary. Only VAR(P) models can be non-stationary.

(d) All values in Φ1 must be less than 1 in absolute value.

7. The Local in Local Average Treatment Effects refers to:

(a) The estimate of the ATE is weighted to reflect the probability of participation in a randomized controlled trial (RCT).

(b) The reality that all experiments only use subjects in one locality.

(c) That the maximum likelihood estimator of the ATE may achieve a local rather than a

global maximum.

(d) The effect is homogeneous across the treatment group.

8. Consider the model $y_t = y_{t-1} + \varepsilon_t$, where $\varepsilon_t \overset{i.i.d.}{\sim} N(0, \sigma_\varepsilon^2)$. Which statement below is false?

(a) $E[y_t]$ grows over time.

(b) A regression of $y_t$ on $x_t$ with $x_t = 0.5x_{t-1} + v_t$ where $v_t \sim N(0, \sigma_v^2)$ and where $\varepsilon_t$ and $v_t$ are independent can result in spurious correlation.

(c) There is no mean reversion in long-run forecasts of $y_t$ so that $E_t[y_{t+h}] = y_t$ for any $h \geq 1$.

(d) In a regression $y_t = \beta y_{t-1} + \varepsilon_t$, our OLS estimate $\hat{\beta}$ would be inconsistent.

9. How does the GJR-GARCH model improve on the GARCH model?

(a) It models the log-variance instead of the variance, and so is always positive.

(b) It allows for more lags of the squared return and variance.

(c) It adds an asymmetry term that depends on the sign of the past return.

(d) It models the standard deviation instead of the variance.

10. In a hypothesis test, the power of the test is

(a) the probability the null is rejected given the alternative is true.

(b) the probability of a Type I error.

(c) the probability of a Type II error.

(d) the probability that the alternative is true.

Multiple Answer Questions

Select all correct answers. Each question has between 0 and n correct answers where n is the number of options in the question. Each of the n answers is treated as a true-false question so that a correct subpart answer is awarded 2%/n, and an incorrect answer reduces the mark by 2%/n. For example, if the correct answers in a 5-part question are A, D, and E, then an answer with A is awarded 0.4%, and an answer without A is reduced by 0.4%. An answer with B is reduced by 0.4%, and an answer without B is awarded 0.4%. Random guessing does not improve your expected mark.
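
The marking rule above can be illustrated with a minimal Python sketch. The function name `score_multiple_answer` and its arguments are hypothetical and are not part of the exam; the sketch simply treats each option as a true-false item worth 2%/n, and then averages the mark over every possible selection to show why random guessing has an expected mark of zero.

```python
from itertools import combinations


def score_multiple_answer(correct, selected, options, weight=2.0):
    """Mark one multiple-answer question, in percent of the exam mark.

    Each option is a true-false item: it is right when the candidate's
    choice (selected or not) matches the key, and wrong otherwise.
    """
    n = len(options)
    n_right = sum((opt in selected) == (opt in correct) for opt in options)
    n_wrong = n - n_right
    return (n_right - n_wrong) * weight / n


# The 5-part example from the instructions: the key is A, D and E.
options = ["A", "B", "C", "D", "E"]
correct = {"A", "D", "E"}

full_marks = score_multiple_answer(correct, {"A", "D", "E"}, options)
only_a = score_multiple_answer(correct, {"A"}, options)

# Average mark over all 2^n possible selections (uniform random guessing).
all_selections = [
    set(combo)
    for k in range(len(options) + 1)
    for combo in combinations(options, k)
]
expected_random = sum(
    score_multiple_answer(correct, s, options) for s in all_selections
) / len(all_selections)

print(full_marks, only_a, expected_random)
```

Selecting only A gets three sub-answers right (A, B, C) and two wrong (D, E), so the net mark is one sub-answer's worth.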

11. If $E[Y|X] = 0$, where X is uniformly distributed on [5,10], then which of the following statements are true:

(a) $E[g(Y)|X] = 0$ for any well defined function $g(\cdot)$

(b) $E[YX] = 0$

(c) $E[Y/X] = 0$

(d) $\mathrm{Cov}[X,Y] = 0$

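12. What restrictions are required in an APARCH model to get an ARCH(1)?

$$\sigma_t^{\delta} = \omega + \alpha\left(|\varepsilon_{t-1}| + \gamma\varepsilon_{t-1}\right)^{\delta} + \beta\sigma_{t-1}^{\delta}$$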

(a) γ=0

(b) β = 0

(c) δ = 0

(d) β = 1−α

(e) δ = 2

13. Why is the companion form of a VAR(P) useful?

(a) It allows any VAR(P) to be expressed as a VAR(1)

(b) It simplifies computing the autocovariance function of a VAR.

(c) It allows an AR(P) to be written as a VAR(1)

(d) It transforms a VAR to ensure that it is covariance stationary.

14. Which of the following are true about the expectations operator, E [·]:

(a) $E[XY] = \mathrm{Cov}[X,Y]$ only when the random variables X and Y are independent

(b) If $g(\cdot)$ is a concave function, then $E[g(X)] \geq g(E[X])$

(c) $V[a+bX] = bV[X]$ where a and b are constants and X is a random variable

(d) $E[E[X|Y]] = E[X]$ only if X and Y are independent

15. Which are true of a vector white noise process $\{\varepsilon_t\}$?

(a) The elements in the sequence $\{\varepsilon_t\}$ must have no contemporaneous correlation.

(b) The elements in the sequence $\{\varepsilon_t\}$ must have no correlation across time.

(c) The vector must be independent across time.

(d) $E_t[\varepsilon_t | \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots] = 0$

(e) The vector must be conditionally homoskedastic.

16. Assuming yt is covariance stationary and is generated by a VAR(P) with white noise residuals,

which of the VAR order selection methods lead to consistent lag length selection?

(a) BIC (Bayesian)

(b) AIC (Akaike)

(c) Likelihood-ratio

(d) HQIC (Hannan-Quinn)

17. The central limit theorem ...

(a) holds with finite and asymptotic sample sizes.

(b) forms a basis for inference on the OLS regression parameters.

(c) is a distributional statement for the sample mean.

(d) states that the sample mean converges to the population mean for i.i.d. data with finite variance.

18. Which statements are true about outliers?

(a) A finite number of outliers result in biased OLS estimates of linear regression coefficients.

(b) A finite number of outliers result in inconsistent OLS estimates of linear regression coefficients.

(c) Winsorization and trimming identify outlying observations by the most extreme realizations of the dependent variable.

(d) Trimming removes observations identified as outliers.

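19. Which regression specifications can be estimated with linear regression assuming $x_i$ is observable, $E[\varepsilon_i | X] = 0$ and $E[\varepsilon_i | Y_{i-1}] = 0$?

(a) $y_i = \beta_1 x_i^{\beta_2} \varepsilon_i$, $\varepsilon_i > 0$

(b) $y_i = \beta_1 x_i^{\beta_2} + \varepsilon_i$, $\varepsilon_i > 0$

(c) $y_i = \sqrt{\sigma_i^2}\,\varepsilon_i$, with $\sigma_i^2 = \omega + \alpha y_{i-1}^2 + \beta\sigma_{i-1}^2$

(d) $y_i = \beta_1 \sin x_i + \beta_2 \ln x_i + \varepsilon_i$, $x_i > 0$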

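20. Which of the following terms are sources of non-stationarity in the time-series model $y_t = \phi + \delta t + y_{t-1} + \beta_1 x_{1t} + \beta_2 x_{2t} + \varepsilon_t$, where $\varepsilon_t \overset{i.i.d.}{\sim} N(0, \sigma_\varepsilon^2)$ and $x_t \overset{i.i.d.}{\sim} N(0, \Sigma)$ is bivariate normally distributed?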

(a) The intercept φ .

(b) The correlated regressors x1t and x2t .

(c) The deterministic trend (δ t).

(d) The lag (yt−1).

PART B: LONG ANSWER

Answer THREE of the seven questions in this section.

Each question is worth 10% of the exam mark (i.e., 1/3 of 30%). Within each question points sum

to 100% and so will be scaled by 10% when combined in the final exam mark. Answers must be as

precise as possible, i.e., should use mathematical notation and formulae where relevant.

1. Suppose

$$y_t = \begin{bmatrix} 0.2 & 0.4 \\ 0.0 & 0.6 \end{bmatrix} y_{t-1} + \begin{bmatrix} 0.0 & 0.4 \\ 0.1 & 0.3 \end{bmatrix} y_{t-2} + \varepsilon_t$$

where ε t is a vector white noise process with covariance Σ. Answer the following questions

about the VAR above:

(a) [30%] In a bivariate VAR(2), what restrictions on the model's parameters are implied if $y_2$ does not Granger cause $y_1$?

(b) [30%] Write the model in error correction form.

(c) [20%] Using the coefficient matrix on $y_{t-1}$ in the ECM, determine if the model is (a) a cointegrated VAR, (b) 2 random walks, or (c) covariance stationary. Note that in a 2 by 2 matrix A, the eigenvalues are the solution to $\lambda_1\lambda_2 = a_{11}a_{22} - a_{12}a_{21}$ and $\lambda_1 + \lambda_2 = a_{11} + a_{22}$.

(d) [20%] What are the 1-step and 2-step ahead forecasts from the ECM, $E_t[y_{t+h}]$ for $h = 1, 2$?
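
The eigenvalue hint in part (c) says that the product of the eigenvalues of a 2 by 2 matrix equals its determinant and their sum equals its trace. As a rough sketch, the two conditions are equivalent to the quadratic $\lambda^2 - (a_{11}+a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}) = 0$, which can be solved directly. The helper name `eigenvalues_2x2` and the example matrix below are illustrative assumptions and are not taken from the exam.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant.

    Solves lambda^2 - trace * lambda + det = 0. cmath.sqrt is used so
    that complex-conjugate pairs are returned when the discriminant
    is negative.
    """
    (a11, a12), (a21, a22) = a
    trace = a11 + a22                # lambda_1 + lambda_2
    det = a11 * a22 - a12 * a21      # lambda_1 * lambda_2
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


# Hypothetical example matrix (not the matrix from the question).
example = [[2.0, 0.0], [0.0, 3.0]]
lam1, lam2 = eigenvalues_2x2(example)
print(lam1, lam2)
```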

2. Suppose two assets, X and Y, have returns that are bivariate normally distributed with $\mu_X = 8\%$, $\mu_Y = 5\%$, $\sigma_X^2 = 0.25^2$, $\sigma_Y^2 = 0.15^2$ and $\rho_{XY} = -0.3$.

(a) [20%] What is the expected return to a portfolio $Z = wX + (1-w)Y$?

(b) [20%] What is the variance of Z as a function of w?

(c) [40%] What value of w minimizes the variance of the portfolio?

(d) [20%] What value of w maximizes the Sharpe ratio of the portfolio, $E[Z]/\sqrt{V[Z]}$?

3. Suppose you observe a sequence of n i.i.d. observations from a Poisson(λ) distribution where each observation has pmf

$$f(x;\lambda) = \frac{\lambda^{x}\exp(-\lambda)}{x!}.$$

(a) [30%] What is the MLE, $\hat{\lambda}$, of λ?

(b) [20%] What is the asymptotic distribution of the MLE?

Use the sample

{4,5,4,6,3,5,5,6,3,3}

to answer the questions (c) and (d).

(c) [25%] Using the data above, test the null H0 : λ = 3.3 using a t-test and a 5% test size.

The lower-tail quantiles from a normal distribution are in the table below.

(d) [25%] Repeat the test in (c) using a Likelihood ratio test.

Quantile Value

1% -2.32

2.5% -1.95

5% -1.64

10% -1.28
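
The Poisson pmf in question 3 and the log-likelihood it implies for an i.i.d. sample can be evaluated numerically. The following is a minimal sketch under the i.i.d. assumption stated in the question; the function names and the small sample are hypothetical and chosen only for illustration.

```python
import math


def poisson_pmf(x, lam):
    """Poisson pmf: lam^x * exp(-lam) / x! for a non-negative integer x."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


def poisson_log_likelihood(sample, lam):
    """Log-likelihood of an i.i.d. sample: the sum of the log pmf values.

    math.lgamma(x + 1) equals log(x!), which avoids large factorials.
    """
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in sample)


# Hypothetical sample and parameter value, for illustration only.
toy_sample = [1, 2, 3]
toy_lambda = 1.5
log_lik = poisson_log_likelihood(toy_sample, toy_lambda)
print(log_lik)
```

Evaluating `poisson_log_likelihood` over a grid of λ values is one simple way to check an analytical maximiser numerically.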

4. Answer the following two questions about causal inference:

(a) [50%] Describe 2 methods to identify a causal effect in observational (non-experimental)

data. Compare the two methods and discuss their advantages and limitations.

(b) [50%] How might an RCT, which is often referred to as the gold standard for causal effect

estimation, produce a misleading estimate?

5. Suppose $y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_{12} y_{t-12} + \theta\varepsilon_{t-1} + \varepsilon_t$ and $\varepsilon_t \overset{i.i.d.}{\sim} WN(0, \sigma_\varepsilon^2)$. For parts (a) - (c) assume that the parameters are consistent with $y_t$ being a covariance stationary process. Answer the following questions:

(a) [20%] What is the value of $E[y_{t+2}]$?

(b) [20%] What is the value of $E_t[y_{t+2}]$?

(c) [20%] What is the value of $\lim_{h\to\infty} E_t[y_{t+h}]$?

(d) [20%] Now we do not assume $y_t$ to be covariance stationary. Let $\phi_0 = 8$, $\phi_1 = 0.8$, $\phi_2 = -0.15$ and $\theta = 12$. Is $y_t$ stable for the given parameters?

(e) [10%] Rewrite the model using only differenced data $\Delta y_t$, $\Delta y_{t-1}$, $\Delta y_{t-2}, \ldots$ and $y_{t-1}$.

(f) [10%] How would you describe the process $\{y_t\}$ if the coefficient on $y_{t-1}$ is 0 in this form?

6. If $\ln RV_t$ is modeled as a HAR

$$\ln RV_t = 0.1 + 0.4\ln RV_{t-1} + 0.3\ln RV_{t-1:5} + 0.22\ln RV_{t-1:22} + \varepsilon_t$$

where $\varepsilon_t \sim N(0, \sigma^2)$ and where $\ln RV_{t-1:h} = h^{-1}\sum_{i=1}^{h} \ln RV_{t-i}$ is the average of h lags of $\ln RV$.

(a) [20%] What is $E_t[\ln RV_{t+1}]$?

(b) [20%] What is $E_t[\ln RV_{t+2}]$?

(c) [20%] What is $\lim_{h\to\infty} E_t[\ln RV_{t+h}]$?

(d) [20%] What is the conditional distribution of the 2-step forecast error, $\ln RV_{t+2} - E_t[\ln RV_{t+2}]$?

(e) [10%] What is $E_t[RV_{t+1}]$?

(f) [10%] What is $E_t[RV_{t+2}]$?
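
The averaged-lag terms in the HAR model of question 6 are plain moving averages of past values of ln RV. The sketch below shows how they could be computed and combined into the conditional mean of the next observation, using the coefficients given in the question. The function names, the zero-based indexing convention, and the constant toy series are assumptions made for illustration.

```python
def lag_average(series, t, h):
    """Average of the h values series[t-1], ..., series[t-h]."""
    if t < h:
        raise ValueError("need at least h observations before index t")
    window = series[t - h:t]
    return sum(window) / h


def har_conditional_mean(series, t, const=0.1, b_day=0.4,
                         b_week=0.3, b_month=0.22):
    """Conditional mean of series[t] given series[0], ..., series[t-1].

    The defaults are the coefficients stated in the question; the
    shock term has mean zero and so drops out of the expectation.
    """
    return (const
            + b_day * series[t - 1]
            + b_week * lag_average(series, t, 5)
            + b_month * lag_average(series, t, 22))


# Hypothetical history of ln RV: 30 observations, all equal to 1.0.
toy_history = [1.0] * 30
next_mean = har_conditional_mean(toy_history, len(toy_history))
print(next_mean)
```

With a constant history every averaged lag equals that constant, so the result is the intercept plus the sum of the three slope coefficients times the constant.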

7. Suppose the correct model is

$$y_i = x_{1,i}\beta_1 + x_{2,i}\beta_2 + \varepsilon_i, \qquad (1)$$

where $i = 1, \ldots, n$ and the researcher estimates

$$y_i = x_{1,i}\beta_1 + v_i.$$

(a) [20%] What is the effect on $\hat{\beta}_1$ of omitting the variable $x_{2,i}$ in the regression equation?

(b) [20%] Is there always a cost of missing $x_{2,i}$ in the regression equation? If not, give two examples.

(c) [20%] Suppose the researcher did not include $x_{2,i}$ because she cannot access this variable. Explain how she can use an instrumental variable $z_i$ to fix the problematic estimate $\hat{\beta}_1$.

Now suppose the correct model is

$$y_i = x_{1,i}\beta_1 + \varepsilon_i,$$

and the researcher estimates the larger model

$$y_i = x_{1,i}\beta_1 + x_{2,i}\beta_2 + \varepsilon_i.$$

(d) [20%] What is the cost of adding an unnecessary variable $x_{2,i}$?

(e) [20%] Explain how the researcher can use cross-validation to select which variables to

include in cross-sectional regression models.

PART C: EXTENDED ANSWER

Answer ALL questions in this section.

The section contributes 30% towards the final mark.

1. Your colleague has built two new models for Value-at-Risk, both using magical machine learning methods (Models 2 & 3). Your colleague needs you to validate that her machine learning approach is a good alternative to Filtered Historical Simulation (Model 1). While she did not leave you the code or the raw data, she has produced some basic statistics and visualizations (Tables 1 and 2 and Figure 1). All models are fit to the same return data, and all results are out-of-sample.

(a) [67%] Use the available measures to construct the best story you can about whether you think your firm should move to one of the machine-learning-based Value-at-Risk models or remain with Filtered Historical Simulation. The best answers will use mathematical notation where relevant and compute statistics using the data in the tables when these can be transformed into measures of absolute or relative performance of the models.

(b) [33%] Why do we use the tick-loss function when forecasting Value-at-Risk? Explain how the tick-loss function is like the Mean Square Error (MSE) loss function that is used when forecasting the conditional mean or the Quasi-likelihood-loss (QLIK) function that is used when forecasting the conditional variance.

Statistics Computed using the HITs

Summary Statistics

                                        Model 1    Model 2    Model 3
$\hat{\mu}$                              0.0111    -0.0126     0.0164
$\hat{\sigma}$                           0.3144     0.2824     0.3209
$\hat{\sigma}_{NW}$                      0.2964     0.3831     0.3475
$T$                                         756        756        756
$\sum_{t=1}^{T-1} I[r_t < -VaR_t^j]\, I[r_{t+1} < -VaR_{t+1}^j]$
                                              7         13         28
$\sum_{t=1}^{T-1} (1 - I[r_t < -VaR_t^j])(1 - I[r_{t+1} < -VaR_{t+1}^j])$
                                            594        636        607
$\mathrm{Corr}[HIT_t, HIT_{t+1}]$      -0.03142     0.1200     0.2282

Covariance ($\hat{\Sigma}$)

             Model 1    Model 2    Model 3
Model 1       0.0987     0.0577     0.0743
Model 2       0.0577     0.0796     0.0506
Model 3       0.0743     0.0506     0.1028

Long-run Covariance ($\hat{\Sigma}_{NW}$)

             Model 1    Model 2    Model 3
Model 1       0.0878     0.0744     0.0813
Model 2       0.0744     0.1467     0.0793
Model 3       0.0813     0.0793     0.1207

Table 1: This table contains statistics based on the sequence of HITs defined as $I[r_{t+1} < -VaR_{t+1}^j] - \alpha$ for models j = 1,2,3 where α = 10%. The top panel contains the mean of the HITs ($\hat{\mu}$), the standard deviation of the HITs ($\hat{\sigma}$), the long-run standard deviation of the HITs computed as the square root of a Newey-West variance using 12 lags ($\hat{\sigma}_{NW}$), the number of out-of-sample observations ($T$), the number of periods where a VaR violation (an exceedance) was followed by a VaR violation ($\sum_{t=1}^{T-1} I[r_t < -VaR_t^j]\, I[r_{t+1} < -VaR_{t+1}^j]$), the number of periods where no VaR violation was followed by no VaR violation ($\sum_{t=1}^{T-1} (1 - I[r_t < -VaR_t^j])(1 - I[r_{t+1} < -VaR_{t+1}^j])$), and the correlation of the HITs across two consecutive periods ($\mathrm{Corr}[HIT_t, HIT_{t+1}]$). The middle panel contains the covariance of the HITs across the three methods ($\hat{\Sigma}$). The final panel contains the long-run covariance of the HITs measured using a Newey-West covariance estimator with 12 lags ($\hat{\Sigma}_{NW}$).

Statistics Computed using the Tick Losses

Mean

             Model 1    Model 2    Model 3
$\bar{L}$     0.1273     0.1712     0.1294

Covariance ($\hat{\Sigma}$)

             Model 1    Model 2    Model 3
Model 1       0.0426     0.0433     0.0381
Model 2       0.0433     0.0752     0.0364
Model 3       0.0381     0.0364     0.0353

Long-run Covariance ($\hat{\Sigma}_{NW}$)

             Model 1    Model 2    Model 3
Model 1       0.1591     0.1783     0.1543
Model 2       0.1783     0.2332     0.1702
Model 3       0.1543     0.1702     0.1513

Table 2: The top panel contains the mean tick-loss for each of the models computed using α = 10%. The middle panel contains the covariance ($\hat{\Sigma}$) of the tick-losses estimated using the standard covariance estimator. The bottom panel contains an estimate of the long-run covariance ($\hat{\Sigma}_{NW}$) of the tick-losses estimated using a Newey-West covariance estimator and 12 lags.
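
The HITs in Table 1 and the tick losses in Table 2 are both simple transformations of a return and its VaR forecast. The sketch below computes them for a single observation. The HIT follows the definition given in the Table 1 caption. The tick-loss formula is not stated in the paper, so the standard quantile (check) loss $(r - q)(\alpha - I[r < q])$ with $q = -VaR$ is assumed here. The function names and the two toy observations are hypothetical.

```python
def hit(r, var, alpha):
    """HIT as defined in Table 1: I[r < -VaR] - alpha.

    VaR is quoted as a positive number, so a violation is r < -var.
    """
    violation = 1.0 if r < -var else 0.0
    return violation - alpha


def tick_loss(r, var, alpha):
    """Assumed standard tick (check) loss at the quantile q = -VaR.

    The loss is non-negative: exceedances are weighted by (1 - alpha)
    and non-exceedances by alpha.
    """
    q = -var
    violation = 1.0 if r < q else 0.0
    return (r - q) * (alpha - violation)


# Two hypothetical observations with a VaR forecast of 2% and alpha = 10%.
alpha = 0.10
returns = [-0.03, 0.01]
var_forecasts = [0.02, 0.02]

hits = [hit(r, v, alpha) for r, v in zip(returns, var_forecasts)]
losses = [tick_loss(r, v, alpha) for r, v in zip(returns, var_forecasts)]
print(hits, losses)
```

Averaging `hits` over the out-of-sample period gives a quantity comparable to the first row of Table 1, and averaging `losses` gives a quantity comparable to the top panel of Table 2.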

Figure 1: Plots of the VaR violations for each of the three models (H) along with the fitted volatility, which is the same in all three panels since the underlying asset is identical.
