STAT0010 In-Course Assessment 3 for 2020/2021 Page 1
STAT0010
In-Course Assessment 3 for 2020/2021
Answer ALL questions.
The relative weights attached to each question are as follows: Question 1 (4 marks), Question 2
(19 marks), Question 3 (7 marks), Question 4 (14 marks), Question 5 (16 marks), Question 6 (40
marks). The numbers in square brackets indicate the relative weight attached to each part question.
Marks will not only be given for the final (numerical) answer but also for the accuracy and clarity
of the answer. So make sure to write down workings, e.g. formulas, calculations, reasoning.
Show your full working for all questions. Do not write formulas alone without any comment about
what you are calculating.
This assessment counts for 30% towards your final STAT0010 mark.
Administrative details
This is an open-book exam. You may use your course materials to answer questions. Some
questions may ask you to solve, or not to solve, a problem in a particular way; please take note
of this. Failure to do so may result in marks being deducted.
You may not contact the course lecturer with any questions, even if you want to clarify
something or report an error on the paper. If you have any doubts about a question, make a note
in your answer explaining the assumptions that you are making in answering it.
You will receive a provisional grade during one of the weeks following the examination.
Grades are provisional until confirmed by the Statistics Examiners’ Meeting in June 2021. You
will also receive feedback on your exam and should study the comments written on your marked work.
Formatting your solutions for submission
For all questions, you may choose to type or hand-write your answers.
You should submit ONE document that contains your solutions for all questions/part-questions.
Please follow UCL’s guidance on combining text and photographed/scanned work.
Make sure that your handwritten solutions are clear and are readable in the document you
submit. You are encouraged to write out solutions neatly once you are happy with them.
Plagiarism and collusion
You must work alone. In particular, any discussion of the paper with anyone else is not acceptable.
You are encouraged to read the Department of Statistical Science’s advice on collusion
and plagiarism, which you can find here.
If there is any doubt as to whether the solutions you submit are entirely your own work you
may be required to participate in an investigatory viva to establish authorship.
• Unless otherwise indicated, in all questions $\{\varepsilon_t\}$ denotes a sequence of uncorrelated zero-mean
random variables with constant variance $\sigma^2$, i.e. $\{\varepsilon_t\} \sim \mathrm{WN}(0, \sigma^2)$, where $\mathrm{WN}(0, \sigma^2)$ denotes
white noise.
• The 97.5th percentile point of the standard Normal distribution is 1.96.
• The transpose of a vector or matrix $A$ is denoted by $A^T$.
• For a process $\{Y_t\}$, we define $\nabla Y_t = Y_t - Y_{t-1}$ and $\nabla_s Y_t = Y_t - Y_{t-s}$.
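As an illustration of the differencing notation, here is a minimal Python sketch. The helper name `difference` and the toy series are assumptions made for this example only; they are not part of the paper.

```python
def difference(y, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag] for t = lag, ..., len(y) - 1."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


# Toy series (hypothetical data, chosen only to make the arithmetic easy to follow).
y = [1, 4, 9, 16, 25, 36]

d1 = difference(y)                             # nabla Y_t = Y_t - Y_{t-1}
d2 = difference(y, lag=2)                      # nabla_2 Y_t = Y_t - Y_{t-2}
d12 = difference(difference(y, lag=2), lag=1)  # nabla nabla_2 Y_t

print(d1)
print(d2)
print(d12)
```

Each application of a lag-s difference shortens the series by s observations, so the combined operator $\nabla\nabla_s$ loses s + 1 observations in total.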
QUESTION 1 [4]
Generate 450 observations from the ARMA(p, q) model given by
$$Y_t = 0.8\,Y_{t-1} + \varepsilon_t - 0.8\,\varepsilon_{t-1}.$$
Plot the data, compute the sample ACF and fit an ARMA(1,1) model to the data. How do you explain
the results? Justify your answer.
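One possible way to produce the data and the sample ACF is sketched below in Python, using only the standard library. Several things here are assumptions rather than requirements of the question: Gaussian noise with unit variance, the burn-in length, the seed, and the helper names `simulate_arma11` and `sample_acf`. The sketch stops at the sample ACF; it does not fit the model or interpret the result.

```python
import random


def simulate_arma11(n, phi, theta, sigma=1.0, burn_in=200, seed=0):
    """Simulate Y_t = phi * Y_{t-1} + e_t + theta * e_{t-1} with Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    y_prev, e_prev = 0.0, 0.0
    out = []
    for t in range(n + burn_in):
        e = rng.gauss(0.0, sigma)
        y = phi * y_prev + e + theta * e_prev
        if t >= burn_in:  # discard the burn-in so the start-up values do not matter
            out.append(y)
        y_prev, e_prev = y, e
    return out


def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0, ..., max_lag (divisor n at every lag)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) * (v - mean) for v in x) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


# The paper writes "- 0.8 e_{t-1}", so theta = -0.8 in this sign convention.
y = simulate_arma11(450, phi=0.8, theta=-0.8)
acf = sample_acf(y, max_lag=20)
band = 1.96 / len(y) ** 0.5  # approximate 95% band for the ACF of white noise

for lag, r in enumerate(acf):
    flag = "*" if lag > 0 and abs(r) > band else ""
    print(f"lag {lag:2d}: {r: .3f} {flag}")
```

Comparing the printed autocorrelations with the ±1.96/√n band is a common first check before fitting any model.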
QUESTION 2 [19]
Fix s ≥ 2. Consider the seasonal state-space model
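$$Y_t = \mu_t + \gamma_t + \varepsilon_t$$
$$\mu_t = \mu_{t-1} + h_t$$
$$\gamma_t = \gamma_{t-s} + z_t,$$
where $\{\varepsilon_t\}$, $\{h_t\}$ and $\{z_t\}$ are independent white noise sequences with variances $\sigma^2$, $\sigma_h^2$ and $\sigma_z^2$,
respectively.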
(a) Which time series model does $\nabla\nabla_s Y_t$ correspond to? [3]
(b) Compute the autocorrelation function of $\nabla\nabla_s Y_t$. [3]
(c) Write $Y_t$ in the form
$$Y_t = B^T S_t + \varepsilon_t$$
$$S_t = C S_{t-1} + H_t,$$
where $S_t$ is an appropriately defined state vector, $B$ is a vector, $C$ is a matrix and $H_t$ is a vector-valued
white noise process with mean 0 and variance-covariance matrix $V$. Give the values of
$B$, $C$ and $V$. [6]
(d) Assume now that $\sigma^2 = \sigma_h^2 = \sigma_z^2 = 1$.
(i) Use the initial conditions:
$$\hat{S}_{0|0} = 0, \qquad P_{0,0} = V,$$
to predict the first observation and prior state error variance. [4]
(ii) At time $t = 1$, we observe that $Y_t = 1/5$. Compute the Kalman gain, posterior state
update, and posterior state error variance at $t = 1$. [3]
QUESTION 3 [7]
Find a state-space model for $\{Y_t\}$ when $\{\nabla\nabla_{12} Y_t\}$ is a stationary ARMA(2, 2) process.
QUESTION 4 [14]
For Question 4, pick only one of the two options A and B below.
QUESTION 4A
The file Data1.csv contains UK airline passenger data from 1963 to 1970.
(a) Analyse this Time Series and choose a $\mathrm{SARIMA}(p, d, q) \times (P, D, Q)_s$ model that fits this data.
You need to fit at least 3 candidate models and choose only one model from the candidate models.
Justify your choice. Estimate the parameters of your chosen model. Carry out appropriate
diagnostic checks for the model. [8]
(b) With your chosen model from (a), generate forecasts for 5 steps in the future. [3]
(c) Show on the same plot your chosen time series together with the above forecasts and
upper/lower bands highlighting the 90% confidence interval for all 5 forecasts. Comment on
the width of the confidence intervals. [3]
Please include the R code with your answers.
QUESTION 4B
Fix s ≥ 4. Assume that we have an airline passenger model, with data modelled by the invertible
time series:
$$(1 - B)^2 Y_t = (1 - \theta_1 B - \theta_2 B^s - \theta_3 B^{s+1})\,\varepsilon_t,$$
for suitable parameters $\theta_1$, $\theta_2$ and $\theta_3$.
(a) Assume the data is given up to time $T$. Calculate the $h$-step ahead optimal forecast $\hat{Y}_{T+h}$ for
all $h = 1, 2, \ldots, s$. What happens for $h > s$? [6]
(b) Compute the h-step ahead forecast error. Compute the variance of the h-step ahead forecast
error for h = 1, 2, . . . , s. [5]
(c) Compute $\mathrm{cov}(Y_t, \varepsilon_t)$. [3]
QUESTION 5 [16]
For Question 5, pick only one of the two options A and B below.
QUESTION 5A
The file Varves.txt contains glacial yearly varve thickness data, which were collected at a location
in Massachusetts for 634 years, beginning 11834 years ago.
[According to Wikipedia, a varve is an annual layer of sediment or sedimentary rock.]
(a) Analyse this Time Series and choose a $\mathrm{SARIMA}(p, d, q) \times (P, D, Q)_s$ model that fits this data.
You need to fit at least 2 candidate models and choose only one model from the candidate models.
Justify your choice. Estimate the parameters of your chosen model. Carry out appropriate
diagnostic checks for the model. [3]
(b) Use now the logarithms of the original data, and choose a $\mathrm{SARIMA}(p, d, q) \times (P, D, Q)_s$ model
that fits the data. You need to fit at least 2 candidate models and choose only one model from
the candidate models. Justify your choice. Estimate the parameters of your chosen model.
Carry out appropriate diagnostic checks for the model. [3]
(c) Based on your analysis in (a) and (b), decide whether to use as your chosen time series the
original data or the logarithms of the original data. Justify your choice. [2]
(d) With your chosen model between (a) and (b), generate forecasts for 10 steps in the future. [2]
(e) Show on the same plot your chosen time series together with the above forecasts and
upper/lower bands highlighting the 95% confidence interval for all 10 forecasts. Comment on
the width of the confidence intervals. [2]
(f) Perform now an out-of-sample validation on the log data. Comment on the results compared to
the earlier results from (d) and (e). [4]
Please include the R code with your answers.
QUESTION 5B
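(a) Assume that $\{X_t\}$ satisfies the equations
$$X_t = \phi X_{t-1} + \varepsilon_t, \qquad t = 0, \pm 1, \pm 2, \ldots,$$
where $|\phi| > 1$ and $\varepsilon_t \sim \mathrm{WN}(0, \sigma^2)$.
(i) Using the formula
$$X_t = -\frac{1}{\phi}\,\varepsilon_{t+1} + \frac{1}{\phi}\,X_{t+1},$$
write $X_t$ as an infinite series involving $\varepsilon_{t+j}$ terms, where $j \geq 1$. Justify your answer. [5]
(ii) Define the new sequence
$$W_t = X_t - \frac{1}{\phi}\,X_{t-1}.$$
Using the infinite series representation of $X_t$ from (i), show that $\{W_t\} \sim \mathrm{WN}(0, \sigma_W^2)$,
and express $\sigma_W^2$ in terms of $\sigma^2$ and $\phi$. [4]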
(b) Let $Z_t = a + bt + Y_t$, where $\{Y_t, t = 0, \pm 1, \ldots\}$ is an independent and identically distributed
sequence of random variables with mean 0 and variance $\sigma^2$, and $a$ and $b$ are constants. Define
for a fixed $q \in \mathbb{N}$ the time series
$$F_t = (2q + 1)^{-1} \sum_{j=-q}^{q} Z_{t+j}.$$
Compute the mean and autocovariance function of $\{F_t\}$. Is $F_t$ stationary? [7]
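To make the definition of the filter concrete, the following Python sketch applies the symmetric (2q + 1)-point moving average to a noise-free linear trend. The helper name `centred_moving_average` and the values of a, b and q are assumptions chosen for illustration; the sketch is a numerical sanity check on the deterministic part only and is not a substitute for the requested derivation.

```python
def centred_moving_average(z, q):
    """F_t = (2q + 1)^{-1} * sum_{j=-q}^{q} z[t + j], computed for interior t only."""
    width = 2 * q + 1
    return [sum(z[t - q:t + q + 1]) / width for t in range(q, len(z) - q)]


# Hypothetical constants; the series below is a + b*t with the noise term omitted.
a, b, q = 2.0, 0.5, 3
trend = [a + b * t for t in range(20)]
smoothed = centred_moving_average(trend, q)

# smoothed[i] corresponds to time t = q + i.
for i, f in enumerate(smoothed):
    print(q + i, trend[q + i], f)
```

Because the window is symmetric about t, the filter needs q observations on each side, so the output is shorter than the input by 2q values.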
QUESTION 6 [40]
Write a short report of approximately 4-6 pages on a more advanced time series topic that was not
covered in the module, and illustrate the topic via examples and/or coding/simulation. As guidelines,
the report should be approximately 1000 words. Mathematical formulas are not counted in the word
count. Any coding, which is optional, is not counted in the word count. All the sources and literature
used in the report should be clearly cited at the appropriate places, and should be contained in a
bibliography. The bibliography is also not counted in the word count. The report can be typed or
handwritten.
The following are some suggestions of potential topics, though you are free to choose your own
topic and directions. If you choose a topic that is not listed here, you are welcome to have a chat with
me to make sure it is suitable. It is of course possible that more than one person chooses the same or
similar topic. It is important that each person works on their own; in particular, two persons working
on the same or similar topic should not be helping each other on the report. Please feel free to discuss
with me for more details and directions regarding the following suggestions.
(a) Introduce and explain the estimation of the parameters for ARMA(p, q) models via maximum
likelihood.
(b) Introduce and explain Burg’s algorithm and the Innovations algorithm.
(c) Give a brief introduction to ARCH/GARCH models, and their applications. Alternatively, you
can explain in more detail one particular volatility model of your choice.
(d) Give a short introduction to spectrum analysis. Potential subtopics here are spectral densities
or the periodogram.
(e) Introduce and explain the Holt and the Holt-Winters forecasting procedures.
(f) Explain the estimation of missing values for state-space models.
(g) Give a short introduction to Bayesian forecasting.
(h) Give a brief introduction to generalized state-space models.
(i) Write a short report on a research paper in time series, working through some of the theory and
illustrating relevant examples via coding/simulation.
Below you can find references of some books, available online in the UCL library, where some of
the topics suggested above can be found. You are of course most welcome to find your own literature
to use for your chosen topic.
References
[1] Peter Bloomfield, Fourier Analysis of Time Series: An Introduction, 2nd edition, Wiley, New York, 2000.
[2] Christopher Chatfield and Haipeng Xing, The Analysis of Time Series: An Introduction with R,
7th edition, CRC Press, Boca Raton, FL, 2019.
[3] Raquel Prado and Mike West, Time Series, 1st edition, Chapman and Hall, 2010.
[4] Christian Weiss, An Introduction to Discrete-Valued Time Series, 1st edition, Wiley, Hoboken,
New Jersey, 2018.
END OF PAPER