BEEM012 – Empirical Assignment Brief
Julian Dyer
Assignment Overview
The goal of this assignment is to use the tools you have learned so far in your R
assignments and apply them to an independent project on time series data of your
choice. I will provide a few sample datasets that are easy to use; choose the
one that relates to a research question you find interesting.
Note: Remember that you can always subtract one time series from another
if you are interested in the difference between two outcomes. For example, we
considered the term spread, the difference between long-term and short-term
interest rates, in some of our R assignments as a predictor of GDP growth. You
can also use a difference like this as your Yt or Xt: for example, the difference
between profits in two different sectors, or the difference in outcomes for men
and women.
A Second Note: You are welcome to seek out your own data and explore an
independent research project if you wish to go above and beyond the assignment.
You will, however, need to complete the same analysis tasks listed in the assign-
ment. The grading scheme will be consistent for those using data I provide and
for those who find their own.
A Third Note: If you want to use this empirical work as the basis for your
dissertation that would be an excellent use of your effort. You should be aware,
however, that you cannot submit the exact same report for your dissertation as you
submit for this module, and your dissertation would need to contain substantively
additional content.
The first task is choosing an outcome variable that will be your Yt for your
analysis, and a primary Xt that will be the main explanatory variable you explore.
Once you have chosen some data of interest, the first part of this assignment
will involve using the tools we learned in the first part of the module (up to our
work with Dynamic Causal Effects) in order to explore what we can learn about
your outcome Yt as an Autoregressive process. You will complete the analytical
tasks outlined below by adapting the code provided in R tutorials and write up
an explanation of the task and the results. You will also use the tools of Volatility
Analysis we will cover later in the course to test whether the volatility or variance
of a time series is serially correlated.
The next step is to consider an additional explanatory variable, and estimate
the Dynamic Causal Effects of this explanatory variable on your outcome of inter-
est. You will complete the analytical tasks outlined below by adapting the code
provided in R tutorials and write up an explanation of the task and the results.
Where we have learned a manual tool to complete a task, you should
use this in your assignment. You are, however, free to use the automatic tools
to check your work.
You will then test two variables for Cointegration, a formal test of whether
they move together and receive the same shocks. These can be the same
variables you have used previously, but you can also choose different variables.
Finally, you will estimate a model testing for Volatility Clustering in your time
series Yt using the Autoregressive Conditional Heteroskedasticity (ARCH) and
Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models.
Grading Criteria
Your assignment will be assigned a grade based on three equally-weighted categories:
• Interpretation and Understanding of Econometric Tools Part of your
grade will be based on whether you correctly use and interpret the tools of
Time Series Econometrics that we learned. This means that you use the
appropriate models for the given task, that you interpret results correctly,
using the proper critical values for inference as well as interpreting null hy-
potheses correctly. This also depends on whether you explain why you use
different tools, and the problems these are selected to deal with.
• Programming and R Code Part of your grade will depend on correctly
using R to implement the tasks you are assigned and whether your R code
correctly implements the work that you describe in the write-up of your
assignment. Marks will be given for R code that is correct and that includes
comments showing you understand the tools you are using.
• Economic Analysis and Discussion This part of your grade will depend
on the economic analysis of your results and the depth of your discussion.
Marks will be given for the economic content of your analysis and your
interpretation of the economic reasoning of your results.
Assignment Outputs to Submit
• A write-up of the results of your analysis, including graphs and tables. See
the outline of the analysis tasks to complete below for details on exactly
what tables & graphs you need to complete.
Word Count: Maximum 2,500 words.
• Your R script for the assignment
1 Analysis Tasks to Complete
1.1 Descriptive Analysis
Before running regressions, we will first examine our data and use some simple
tools to look at the time series.
1.1.1 Data Description
First, write a very brief (just a few sentences) description of the outcome variable
you are interested in analysing. Next write a brief description of your primary
explanatory variable, and the rough research question.
1.1.2 Time Series Plots
Next, plot your Yt time series, and give a few sentences of description. Does it
appear to have a trend? Does it appear to be highly autocorrelated?
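
As a minimal sketch of this step, the snippet below plots a series and computes its sample autocorrelations using base R only. The series is simulated (a random walk with drift) purely as a hypothetical stand-in for your own Yt, and the start date and quarterly frequency are assumptions for illustration.

```r
set.seed(42)

# Hypothetical quarterly series standing in for your own Y_t (simulated data)
y <- ts(cumsum(rnorm(120, mean = 0.2)), start = c(1990, 1), frequency = 4)

# Time series plot: look for a trend and for smooth, slow-moving behaviour
plot(y, main = "Y_t over time", xlab = "Year", ylab = "Y_t")

# Sample autocorrelations up to 8 lags (element 1 of $acf is lag 0)
acf_y <- acf(y, lag.max = 8, plot = FALSE)
rho1 <- acf_y$acf[2]   # first-order sample autocorrelation
print(round(acf_y$acf[2:9], 3))
```

A first-order autocorrelation close to one, decaying slowly over further lags, is the pattern to look for when judging whether the series is highly persistent.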
1.2 Autoregression Analysis of a Time Series
1.2.1 Estimate an Autoregression Model
• First, run an AR(1) regression of your outcome variable. Then use the Bayes
Information Criterion to select the appropriate lag length for your model,
setting a maximum of four lags. Write down the four values of BIC(p)
you calculate, and explain which lag length you select. Now,
estimate this model.
• Next, test for violations of our key Time Series Assumptions:
– Use the appropriate model to test for a unit root process. Does economic
theory suggest that your time series should exhibit a roughly linear time
trend? Justify your answer briefly, and explain what this means for the
model you use for this test and the hypotheses you test. Write a brief
explanation of the result, and what this means for your time series. If
you conclude your time series has a unit root, perform the necessary
transformation and add this model to your table.
– Use the appropriate test for a break in your time series where you don’t
know the exact date of the break. Write a brief explanation of the
result, and what this means for your time series. If you conclude your
time series has a break and you identify the likely break date, make the
necessary adjustment to your model and add this model to your table.
• Report estimated coefficients from both the AR(1) and AR(p) models in a
table, along with the coefficients from your modified model in the case that
your time series either has a break or a unit root.
• Is the coefficient on Yt−1 in your AR(1) significant? Write a brief explanation
of whether it is statistically significant, and an additional brief interpretation
of the economics of this result. How about the coefficient on Yt−1 in your
AR(p) model: is it similar? Discuss the implications of these results for the
persistence of shocks. If you correct for a trend or a break, discuss how your
analysis of the non-transformed time series might be misleading.
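
The manual versions of the three tools in this section can be sketched in base R as follows. This is an illustration under stated assumptions, not the tutorial code: the data are simulated, the BIC uses the form ln(SSR(p)/T) + (p + 1) ln(T)/T on a common sample, the unit root regression includes an intercept only and reuses the BIC lag length, and the break test uses homoskedasticity-only F-statistics with 15% trimming on an AR(1). Adapt each choice to what the module's tutorials use.

```r
set.seed(1)
y <- as.numeric(arima.sim(list(ar = 0.6), n = 200))  # simulated stand-in for Y_t
n <- length(y)

# --- BIC(p) for p = 1..4, all estimated on the same sample ---
max_p <- 4
E <- embed(y, max_p + 1)          # column 1 is y_t, column j + 1 is y_{t-j}
T_eff <- nrow(E)
bic <- sapply(1:max_p, function(p) {
  fit <- lm(E[, 1] ~ E[, 2:(p + 1), drop = FALSE])
  log(sum(resid(fit)^2) / T_eff) + (p + 1) * log(T_eff) / T_eff
})
p_star <- which.min(bic)          # lag length with the smallest BIC

# --- Augmented Dickey-Fuller t-statistic (intercept, no trend) ---
# Assumption: p_star lagged differences are included.
dy <- diff(y)
D <- embed(dy, p_star + 1)        # column 1 is dy_t, the rest are lagged dy
y_lag <- y[(p_star + 1):(n - 1)]  # level y_{t-1} aligned with dy_t
adf_fit <- lm(D[, 1] ~ y_lag + D[, -1, drop = FALSE])
adf_t <- summary(adf_fit)$coefficients["y_lag", "t value"]
# Compare adf_t with Dickey-Fuller critical values, not the usual normal ones.

# --- Break at an unknown date: largest F-statistic over candidate dates ---
y_t <- y[-1]
y_l1 <- y[-n]
m <- length(y_t)
candidates <- floor(0.15 * m):ceiling(0.85 * m)   # 15% trimming at each end
restricted <- lm(y_t ~ y_l1)
f_stats <- sapply(candidates, function(tau) {
  d <- as.numeric(seq_len(m) > tau)               # 1 after the candidate break
  unrestricted <- lm(y_t ~ y_l1 + d + d:y_l1)     # intercept and slope may shift
  anova(restricted, unrestricted)$F[2]
})
qlr <- max(f_stats)
break_obs <- candidates[which.max(f_stats)]       # most likely break position

print(round(bic, 4)); print(p_star); print(adf_t); print(c(qlr, break_obs))
```

The largest F-statistic must be compared with the critical values tabulated for this test rather than the usual F distribution, because the break date was chosen to maximise the statistic.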
1.2.2 Estimate an Autoregressive Distributed Lag Model
• Now we are going to introduce a second variable Xt. First, estimate an
ADL(1,1) model.
• Repeat the exercise you conducted above using the BIC to select the length
of lag, but now you will select a lag p to use for your ADL(p,p) model. For
simplicity, consider again up to p = 4 and use the same lag length for Yt and Xt.
• Use a Granger causality test to test whether the lags of your explanatory
variable Xt are jointly significant predictors of Yt. Report the test statistic
in the text (no need to add it to a table).
• Produce a table with your coefficient estimates from the ADL(1,1) model as
well as the ADL(p,p) model.
• Interpret the results from the above. Are the lags of Xt jointly predictive
of Yt in a model where we also include lags of Yt? Discuss the economic
significance of this result.
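
One way to sketch the ADL(p,p) regression and the Granger causality test in base R is shown below. The data are simulated, p = 2 is an arbitrary placeholder for the lag length you select by BIC, and the F-statistic is the homoskedasticity-only version from anova(); if the module uses robust standard errors, this is an assumption you would need to change.

```r
set.seed(2)
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))   # simulated X_t
y <- numeric(n)
for (t in 2:n) y[t] <- 0.4 * y[t - 1] + 0.5 * x[t - 1] + rnorm(1)

p <- 2   # placeholder; in the assignment this comes from your BIC comparison

Ylags <- embed(y, p + 1)             # column 1 is y_t, the rest are lags of y
Xlags <- embed(x, p + 1)
y_t <- Ylags[, 1]
y_l <- Ylags[, -1, drop = FALSE]     # y_{t-1}, ..., y_{t-p}
x_l <- Xlags[, -1, drop = FALSE]     # x_{t-1}, ..., x_{t-p}

adl <- lm(y_t ~ y_l + x_l)           # ADL(p,p)
ar_only <- lm(y_t ~ y_l)             # restricted model: lags of X excluded

# Granger causality: joint F-test that all coefficients on lags of X are zero
granger <- anova(ar_only, adl)
f_stat <- granger$F[2]
p_val <- granger$`Pr(>F)`[2]
print(c(F = f_stat, p = p_val))
```

Rejecting the null means the lags of Xt help predict Yt beyond what the lags of Yt already explain; it is a statement about predictive content, not about causality in the economic sense.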
1.2.3 Check Out-Of-Sample Forecast Performance
• Use the Pseudo Out-Of-Sample forecasting method with your ADL(1,1)
model, holding out the final 25% of your sample. Compare the
within-sample SER (from the regression including none of your
excluded observations) with the out-of-sample fit, measured by your
estimate of the Root Mean Squared Forecast Error (RMSFE).
• Compare the size of the SER to the size of your RMSFE. Which is larger?
Does this suggest your forecast errors are larger, smaller, or the same as your
within-sample errors? Is your model capable of predicting out-of-sample?
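
The comparison above can be sketched as follows, assuming the recursive version of the pseudo out-of-sample method in which the model is re-estimated before each one-step-ahead forecast. The data are simulated and the variable names are hypothetical.

```r
set.seed(4)
n <- 200
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))   # simulated X_t
y <- numeric(n)
for (t in 2:n) y[t] <- 0.4 * y[t - 1] + 0.5 * x[t - 1] + rnorm(1)

# ADL(1,1) data: y_t with one lag of y and one lag of x
d <- data.frame(y = y[-1], y_l1 = y[-n], x_l1 = x[-n])
m <- nrow(d)
split <- floor(0.75 * m)             # last in-sample observation

# Within-sample fit: SER from the first 75% of the sample only
in_fit <- lm(y ~ y_l1 + x_l1, data = d[1:split, ])
ser <- summary(in_fit)$sigma

# Pseudo out-of-sample: re-estimate using data up to s, then forecast s + 1
errors <- sapply(split:(m - 1), function(s) {
  fit_s <- lm(y ~ y_l1 + x_l1, data = d[1:s, ])
  d$y[s + 1] - predict(fit_s, newdata = d[s + 1, ])
})
rmsfe <- sqrt(mean(errors^2))

print(c(SER = ser, RMSFE = rmsfe))
```

An RMSFE of similar size to the SER suggests the model forecasts about as well out of sample as it fits in sample; a much larger RMSFE points to instability or overfitting.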
1.3 Dynamic Causal Effects
• Use GLS to estimate the dynamic multipliers for a distributed lag model
regressing Yt on Xt and its lags. For simplicity, use a Distributed Lag model
where r = 3, which means you will include Xt as well as the lags Xt−1 and
Xt−2, and an AR(1) error term, meaning that you model the error term just
as in lecture using φ1. Now, estimate these dynamic multipliers using the
Cochrane-Orcutt method (not the Iterated Cochrane-Orcutt!).
• Discuss the results above, beginning with a short discussion of whether it is
reasonable to assume strict exogeneity or exogeneity and give an example of
something that would mean we can only assume exogeneity but not strict
exogeneity. For example, in lecture we considered crop prices as our outcome
Yt and climate shocks as our Xt. If people potentially stockpile crops today
based on anticipated climate shocks tomorrow then this would violate strict
exogeneity. Give an example of the issues with assuming strict exogeneity
in your setting. The important thing here is showing you understand the
conditions, so you can use a slightly unrealistic example here, as long as
you show you understand how to think of the exogeneity conditions in your
context. Next, discuss the implications of the dynamic multipliers you esti-
mate. Which dynamic multiplier of Xt is strongest? Does the effect increase,
decrease, or stay the same over time?
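
A minimal sketch of the (non-iterated) Cochrane-Orcutt procedure in base R is given below, under the assumption that it consists of three steps: OLS on the distributed lag model, an AR(1) regression on the OLS residuals to estimate φ1, and OLS on the quasi-differenced data. The data and the true multipliers (0.8, 0.4, 0.2) are simulated for illustration only.

```r
set.seed(3)
n <- 300
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))   # simulated X_t
u <- as.numeric(arima.sim(list(ar = 0.5), n = n))   # AR(1) error term

# Distributed lag data: X_t, X_{t-1}, X_{t-2} (rows correspond to t = 3..n)
X <- embed(x, 3)
colnames(X) <- c("x_t", "x_l1", "x_l2")
y <- as.numeric(1 + X %*% c(0.8, 0.4, 0.2) + u[3:n])
m <- nrow(X)

# Step 1: OLS on the distributed lag model, keep the residuals
ols <- lm(y ~ X)
uhat <- resid(ols)

# Step 2: estimate phi_1 by regressing the residual on its own first lag
phi <- unname(coef(lm(uhat[-1] ~ 0 + uhat[-m])))

# Step 3: quasi-difference Y and every regressor, then run OLS once
y_qd <- y[-1] - phi * y[-m]
X_qd <- X[-1, ] - phi * X[-m, ]
gls <- lm(y_qd ~ X_qd)

# Coefficients on the quasi-differenced regressors are the dynamic multipliers
multipliers <- coef(gls)[-1]
print(phi); print(round(multipliers, 3))
```

The iterated version would repeat steps 2 and 3 using residuals from the latest estimates until φ1 stops changing; stopping after one pass is what distinguishes the method asked for here.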
1.4 Cointegration
• Use the two-stage test for cointegration to test if your Yt and Xt are cointe-
grated. You can use the same Xt as above, or you can choose a different Xt
if you think they are more likely to be cointegrated and, therefore, a more
interesting exercise. (Note: even if it isn’t sensible to test for cointegration
here, conduct the test anyway, interpret the result accordingly, and
explain why it isn’t appropriate to test for cointegration of these time series.)
• Discuss the results from the above analysis. First of all, discuss whether it
makes sense in this case to test for cointegration.
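
The two-stage test can be sketched in base R as below. This assumes the first stage is an OLS regression of Yt on Xt and the second stage is a Dickey-Fuller regression (with an intercept and no lagged differences, both simplifying assumptions) on the first-stage residuals. The series are simulated so that they are cointegrated by construction.

```r
set.seed(6)
n <- 250
x <- cumsum(rnorm(n))                                  # simulated random walk X_t
noise <- as.numeric(arima.sim(list(ar = 0.3), n = n))  # stationary deviation
y <- 2 + 0.5 * x + noise                               # Y_t shares X_t's trend

# Stage 1: estimate the cointegrating relationship by OLS, keep residuals
step1 <- lm(y ~ x)
z <- as.numeric(resid(step1))

# Stage 2: Dickey-Fuller regression on the residuals
dz <- diff(z)
z_lag <- z[-n]
step2 <- lm(dz ~ z_lag)
eg_t <- summary(step2)$coefficients["z_lag", "t value"]

# Compare eg_t with the critical values tabulated for residual-based
# cointegration tests, not the standard Dickey-Fuller table, because the
# residuals come from an estimated regression.
print(eg_t)
```

A test statistic more negative than the relevant critical value rejects the null of a unit root in the residuals, which is evidence that the two series are cointegrated.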
1.5 Volatility Clustering Analysis
• Next, analyse whether the volatility of your time series is clustered, that is,
whether your time series exhibits greater variance at some times than others.
Estimate a GARCH(1,1) model on your data. Based on your earlier analysis,
decide whether it is appropriate to include an autoregressive component by
including an arma(1,0) term and explain your decision.
• Report estimated coefficients from this model in a table. Are any of the
coefficients significant? What does this mean about whether your time series
displays conditional heteroskedasticity? Give a very brief interpretation of the
economics of this result: will a period of high volatility tend to be long-lived,
or will it be brief?
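
The sketch below illustrates this step on simulated data. The first part uses base R only: it simulates a GARCH(1,1) series and runs a simple check for serial correlation in the squared series. The second part assumes that the GARCH model is estimated with garchFit() from the fGarch package, which is a guess based on the arma(1,0) notation above; it runs only if that package is installed.

```r
set.seed(5)
n <- 1000
omega <- 0.1; alpha <- 0.15; beta <- 0.8   # hypothetical true parameters

# Simulate a GARCH(1,1) series: h is the conditional variance, e the series
e <- numeric(n); h <- numeric(n)
h[1] <- omega / (1 - alpha - beta)         # unconditional variance
e[1] <- sqrt(h[1]) * rnorm(1)
for (t in 2:n) {
  h[t] <- omega + alpha * e[t - 1]^2 + beta * h[t - 1]
  e[t] <- sqrt(h[t]) * rnorm(1)
}

# Simple check for ARCH effects: regress e_t^2 on e_{t-1}^2.
# Under the null of no ARCH, (sample size) * R^2 is approximately chi-squared(1).
e2 <- e^2
arch_fit <- lm(e2[-1] ~ e2[-n])
lm_stat <- (n - 1) * summary(arch_fit)$r.squared
lm_pval <- pchisq(lm_stat, df = 1, lower.tail = FALSE)
print(c(LM = lm_stat, p = lm_pval))

# GARCH(1,1) with an AR(1) mean equation, assuming the fGarch package
if (requireNamespace("fGarch", quietly = TRUE)) {
  fit <- fGarch::garchFit(~ arma(1, 0) + garch(1, 1), data = e, trace = FALSE)
  print(round(fit@fit$matcoef, 3))         # estimates, std. errors, t, p-values
}
```

In a GARCH(1,1) the sum of the ARCH and GARCH coefficients (alpha1 + beta1 in this output) indicates persistence: the closer it is to one, the longer a period of high volatility tends to last.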
1.6 Conclusion
Finally, write a brief paragraph summarizing your findings. Again, this should just
be a brief summary of any of the results that give additional economic insights
relating to your outcome variable Yt or the relationship between Yt and Xt.