Date: 2021-01-15

2020/2021 MSCI 570 Forecasting Coursework part 2

Coursework Information & Submission

This is the second of two individual assignments. The first assignment, weighted 40%, required you to explore a dataset consisting of a single time series, individually assigned to you, in order to understand the data in preparation for the second assignment. This second individual assignment is weighted 60%, and you may find it useful to apply insights from the first assignment to it. In this second assignment, you will forecast the same time series you analysed in assignment 1 using multiple forecasting algorithms and models.

The coursework deadline is January 18, 2021, 10:00 am.

Standard departmental penalties will apply for late work unless the course administrator has granted you an extension for exceptional reasons. All submissions will be checked by plagiarism software. Coursework must be submitted online on Moodle. Submit your report PLUS all R scripts in the appendix on Moodle.

Assignment: Forecast Model Building for a real-world time series

Your task is to forecast a real-world time series and to critically discuss the fit of your models based on its pattern(s). Document your findings comprehensively in a technical report, making adequate use of (readable and correctly labelled) graphs, which you also critically discuss to support your arguments. Base your justification on evidence and document your iterative model building process, possibly transforming the time series and analysing the resulting patterns throughout. Your objective is to:

a) Develop the most accurate statistical forecasting models, demonstrating your modelling skills!

• Build multiple potentially suitable forecasting models, including a suitable Exponential Smoothing, ARIMA, and Time Series Regression model, to predict the next 14 values (2 weeks ahead)

• Choose what you assume to be the "best" of these models for a final submission, by assessing errors and comparing them against a Naive and a Seasonal Naive benchmark model.

b) Document your forecasting process in a technical research report

• Write a technical report to document your model building skills & to justify your choice of model

• Critically discuss some findings and choices
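
The Naive and Seasonal Naive benchmarks named in objective a) can be sketched as follows. This is a minimal illustration in Python rather than R. It assumes daily data with a weekly season (a period of 7, inferred from the 14-value, 2-week horizon); the function names and the sample series are hypothetical and not part of the coursework data.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every step of the horizon."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, period=7):
    """Repeat the last full season. period=7 is an assumption (weekly season)."""
    if len(history) < period:
        raise ValueError("history must cover at least one full season")
    last_season = history[-period:]
    return [last_season[step % period] for step in range(horizon)]


# Hypothetical three weeks of daily observations (illustration only).
series = [20, 22, 21, 23, 30, 35, 18,
          21, 23, 22, 24, 31, 36, 19,
          22, 24, 23, 25, 32, 37, 20]

naive_fc = naive_forecast(series, horizon=14)
snaive_fc = seasonal_naive_forecast(series, horizon=14, period=7)
print("Naive:         ", naive_fc)
print("Seasonal naive:", snaive_fc)
```

Any candidate model that cannot beat both of these forecasts on the chosen error metrics is hard to justify as the "best" model.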

Marking Scheme & Hints

60% of points – Forecast Model Building

• Build multiple potential contender models (each one suitable for the identified data properties), leading to (at least)

o 1 good manually built Exponential Smoothing Model - in comparison to 1 automatically built

o 1 good manually built ARIMA model - in comparison to 1 automatically built

o 1 good manually built Time Series Regression model - in comparison to 1 automatically built

• Document the specification of each model

o Document the specification process of model building, i.e. the iterative steps to build this final model, including the analysis of residuals of intermediate models that have led to better ones

o Document the final model form and parameters to allow a complete replication of your experiments (i.e. specify which model forms or lags you selected, which transformations you applied in which order, and which parameters were used, so that others could replicate exactly what you did from your document alone).

• NOTE: For each contender model, one or more different model forms may be feasible to produce forecasts (e.g. for a seasonal time series with slight trend in Exponential Smoothing you may use multiplicative or additive seasonality, with or without trend, giving 4 candidates that could perform well; for ARIMA you may consider models with seasonal differences, first differences, or both; for regression models you may consider capturing seasonality with dummy variables or with a seasonal autoregressive lag, etc.). Where applicable, you should always consider multiple plausible candidate models, and must justify your choice of candidates in comparison to other potential models in each class of models (feel free to explicitly rule out implausible ones). To get high marks, it will not be sufficient to build a single Exponential Smoothing model using the automatic specification ZZZ, or a single automatic ARIMA model, and a single Regression model with stepwise selection; instead, you are required to develop a subset of potentially useful models, which you compare and accept or reject. Base your justification on evidence and document your ITERATIVE modelling process throughout.
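
As a sketch of the differencing options the note mentions for ARIMA (first differences, seasonal differences, or both), the following Python snippet computes each variant. It assumes a weekly season of 7 observations; the helper name and the sample series are hypothetical. It only illustrates the transformations and does not fit a model.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag]; the result is `lag` values shorter."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical daily series: a weekly pattern that rises by 1 each week.
series = [20, 22, 21, 23, 30, 35, 18,
          21, 23, 22, 24, 31, 36, 19,
          22, 24, 23, 25, 32, 37, 20]

first_diff = difference(series, lag=1)       # removes the level, keeps the weekly pattern
seasonal_diff = difference(series, lag=7)    # removes the weekly pattern (assumed period 7)
both = difference(seasonal_diff, lag=1)      # seasonal difference, then first difference

print("First differences:   ", first_diff)
print("Seasonal differences:", seasonal_diff)
print("Both:                ", both)
```

In this artificial example the seasonal difference is constant, so the additional first difference leaves only zeros. On real data, the ACF and PACF of each differenced series would need to be inspected before choosing among the candidates.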

20% of points – Discussion of expected Accuracy (Errors)

• Determine suitable error metrics and compute the expected in-sample and out-of-sample errors of your recommended models and compare them. Comment on the suitability of the chosen metrics for this task.

• Select one "best" model to be used for forecasting the time series across all algorithm families, and provide your final out-of-sample forecasts.

• NOTE:

o To assess the future forecast accuracy of your models before the final future values become available, consider creating a hold-out dataset of a size equal to or, better, larger than the forecasting horizon, and assess both in-sample and (quasi) out-of-sample errors

o Different error metrics are feasible. You should use at least two suitable error metrics for the assessment, and justify the use of each error metric you use

o To show the improvement in accuracy of your methods over an objective benchmark, you should compare all of them to the Naive Level and a Naive Seasonal model. Use tables to provide a suitable overview of the different methods’ accuracy and their uplift in accuracy over the 2 Naive models.

o Comment on the accuracy and suitability of each of the methods for each of the time series. Which method would you recommend using for each of the time series? Comment on why you think some methods perform better than others. Comment on the available data and the number of origins for forecast evaluations on the withheld test set, given your fixed forecasting horizon.
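
The evaluation steps in the note above (hold-out split, two error metrics, uplift over a benchmark, and the number of forecast origins) can be sketched as follows. This is a minimal Python illustration under stated assumptions: MAE and RMSE are chosen only as examples of two metrics, the season is assumed to be 7 days, and all function names and data are hypothetical.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def uplift_pct(model_error, benchmark_error):
    """Percentage error reduction relative to a benchmark (positive means better)."""
    if benchmark_error == 0:
        raise ValueError("benchmark error is zero; uplift is undefined")
    return 100.0 * (benchmark_error - model_error) / benchmark_error


def rolling_origin_count(test_size, horizon):
    """Number of origins that allow a full-horizon forecast inside the test set."""
    return max(test_size - horizon + 1, 0)


# Hypothetical four weeks of daily data; the weekly pattern rises by 1 each week.
base = [20, 22, 21, 23, 30, 35, 18]
series = [value + week for week in range(4) for value in base]

horizon = 14
train, test = series[:-horizon], series[-horizon:]   # hold-out equal to the horizon

naive_fc = [train[-1]] * horizon                                 # Naive benchmark
snaive_fc = [train[-7:][step % 7] for step in range(horizon)]    # Seasonal Naive, period 7

mae_naive, mae_snaive = mae(test, naive_fc), mae(test, snaive_fc)
print("MAE naive:", round(mae_naive, 3), "| MAE seasonal naive:", round(mae_snaive, 3))
print("RMSE seasonal naive:", round(rmse(test, snaive_fc), 3))
print("Uplift of seasonal naive over naive (MAE, %):",
      round(uplift_pct(mae_snaive, mae_naive), 1))
print("Origins with a 14-value test set:", rolling_origin_count(len(test), horizon))
```

With a hold-out set exactly as long as the horizon, only one forecast origin is available, which is one reason the note suggests a larger hold-out set where the data allow it.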

10% of points – Conclusions

Conclude by recommending one (or multiple) suitable algorithm(s) and forecasting model form(s), and critically discuss your choice, weighing the different options. As time series patterns are not always clear, there are often multiple suitable forecasting models for a time series. Please recommend all that are suitable.

10% of points - General report writing skills

General report writing skills include a critical discussion of findings, thoroughness of documentation, clarity of arguments, structure of the report, and readability of the report (e.g. lack of spelling and grammatical mistakes); these are considered in marking each section. Please see the next page for some more technical considerations on report writing.

SUM 100%

We highly recommend using R, but you are free to use any other software, provided you report which software you used.

Please also consider the general recommendations on writing a technical report on the next page!

General suggestions on writing a report

The coursework requires you to document your analysis and critically discuss your chosen experimental design, modelling approaches and results in a technical report. This technical report should be written as if tailored to an Analytics specialist (e.g. someone who has an MSc from Lancaster University, has taken the MSCI 570 course, and wants to evaluate your results AND your decision-making process to determine your modelling skills and whether you have missed anything). This means that you are not required to write general descriptions (e.g. "a statistical test is ...", "the ACF function is ...", "Exponential Smoothing is ..."), as an Analytics expert would be aware of these! Consequently, the report should document the modelling process and allow the reader to understand your choices and replicate your experiments.

The report should contain an introduction and a summary with conclusions on your findings, numbered headings, a list of figures and tables, and an executive summary (tailored to senior management) indicating the most relevant findings. The report should display a logical and concise structure, be generally “readable”, and support your argument using plots of time series, forecasts and/or accuracy. Make adequate use of graphs to show time series, model fit / predictions and residuals to support your arguments (for this, graphs must be completely readable and labelled), as well as tables to compare results.

The page limit for the report is 10 pages (note that this is a maximum to make your life easier; you can produce shorter reports! Pages count only for the main text incl. graphs and tables, but not for the cover sheet, executive summary, contents sheet or appendices). Reports of excessive length will be penalised by deducting 10 marks (i.e. 10% of 100), but only if they include unnecessary material. For formatting, use single spacing, set normal text in Times New Roman, font size 12, set text in tables and figure and table headings in font size 10, and leave margins of 2cm left and right.

Excessive evidence (e.g. the complete output of statistical tests) may be placed in the appendix, but must be referenced directly at the corresponding place in the main text; otherwise it is not taken into consideration. Include any technical details and hardcopies that support your arguments in a set of appendices (e.g. the printouts from ADF tests in the appendix, with only the conclusion of significance / insignificance at a given probability in the main text), which will not count towards the page limit. You must ensure the main text is readable and that your argument is coherent without needing to consult the appendices. All parts of the text supported by an appendix must cross-reference directly to the relevant part.

Non-disclosure clause: these datasets and the coursework task are subject to copyright © by Sven Crone, all rights reserved. By downloading the documents and submitting the assignment for assessment, the copyright agreement is deemed accepted. Any publication of the dataset, the coursework task, or its solution (e.g. on a coursework website or a social network site), or a part thereof, will be considered a violation of copyright. The person breaking the copyright may be held liable for damages by international lawsuit. Furthermore, the publication will cause the assignment to be counted as plagiarism - even in retrospect after receiving the MSc degree - leading to a mark of zero, with the usual right to appeal to the university court in an official hearing.

Contact details:

Questions regarding the coursework

Sven F. Crone

Room A53a

s.crone@lancaster.ac.uk

Questions regarding R and workshops:

Anna-Lena Sachs

Room 55

a.sachs@lancaster.ac.uk

If you have any questions, please don’t hesitate to contact us! Also bear in mind that I cannot always respond within a few hours, so don't leave questions to the last minute … start early!

Best of luck!

Anna-Lena & Sven
