MAT022 Foundations of Statistics and Data Science

Resit Coursework 2020/21

Summative re-assessment for the module is by means of a single report on your statistical analysis of a data set related to the National Basketball Association (NBA), a professional basketball league in the USA. Please read this document carefully.

Important. The data set is NOT the same as that provided for the original coursework.

This form of assessment has been chosen because, as professional statisticians and data scientists, you will often be asked to investigate a data set and report on whether it contains anything useful or interesting. The assessment will also help you to prepare for writing your MSc dissertation in the summer.

Your report will be assessed according to how well you are able to

• analyse the data set (40%),

• interpret the results of your analysis (30%), and

• present the results of your analysis and your interpretation of the data set (30%).

Your analysis should be performed using the R statistical software package, and your report prepared using the R Markdown typesetting system and the template provided. Two marks will be deducted for reports prepared using alternative systems such as Microsoft Word or LaTeX.

Please submit your report in PDF format via Learning Central before 12.00 on Thursday 12 August 2021.

1 The data

The data set NBA_sample.csv is a partial record of shots taken by players in the NBA between October 2014 and March 2015, and consists of 50,000 observations on 20 variables as described in Table 1. A summary of the changes made to the data set provided for the original coursework can be found at the end of this document.



2 The report

The ability to write clearly and concisely is an important professional competence. To encourage writing that is brief and to the point, your reports are limited to a maximum of 10 pages. It is often far more difficult to express yourself in 100 words than in 1000 words, especially when you have a lot to say, so be careful not to underestimate the challenge posed by this restriction. The modest page limit will also encourage you to be selective in the results you choose to present. A suggested structure for your report is shown in Table 3. Note that the title page, abstract, table of contents, list of references and appendix do not contribute towards the page count.


• The title page should contain the title of your report, your name and student number, and the date on which your report was completed.

• The abstract should contain a short summary of the report and its main conclusions.

 • The table of contents should list the number and title of each section against the number of the page on which the section begins.

 • The introduction should consist of a few short paragraphs, describing the purpose of the report and providing a brief outline of its contents. 

• The background section should include a brief review of any relevant literature, and provide a context for the work presented in the report. 

• The report should contain a relatively short section on a descriptive analysis of the data set, with a title chosen to reflect what the section contains. 

• The main part of the report should consist of two or three sections on different inferential analyses of the data set. Here you should formulate hypotheses, conduct statistical tests, then present and discuss the results of these tests. The titles of these sections should reflect what the sections contain. 

• The conclusion should consist of a few short paragraphs, providing a summary of the report and a brief outline of some ideas for future work.

 • The report may contain a single appendix for large figures and tables, limited to a maximum of two pages.
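As a minimal sketch of the kind of inferential analysis described in the list above, the following R code tests whether two groups of shots have the same success rate. The counts are made up purely for illustration and are not taken from NBA_sample.csv, and the choice of a two-sample test of proportions is an assumption rather than a required method.

```r
# Hypothetical counts for illustration only (not derived from NBA_sample.csv).
made      <- c(45, 30)    # successful shots in group A and group B
attempted <- c(100, 100)  # shots attempted in each group

# H0: the two groups share the same success probability.
# H1: the success probabilities differ (two-sided alternative).
result <- prop.test(x = made, n = attempted)

result$p.value   # p-value of the test
result$conf.int  # 95% confidence interval for the difference in proportions
```

In a report, the hypotheses would be stated in words before the test, and the p-value and confidence interval would then be interpreted using qualified statements.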

3 Assessment criteria

Detailed assessment criteria are shown in Table 4.


4 Guidelines for writing reports

The golden rule when writing is to always think of the reader. For scientific reports, readers will typically want to read something interesting and learn something in the process.


Audience. The target audience for your report is this year's cohort of students on the Foundations of Statistics and Data Science module, so you can assume that your readers are familiar with the methods and terminology established within the lectures and notebooks. If you choose to use methods that have not been covered in lectures, you must ensure that any new terms are properly defined and that references to the relevant literature are included.

Analysis. The reader should be satisfied that you have performed your analysis correctly, and in particular that you have verified the conditions that are necessary to apply the various methods. Your methods should be introduced with a brief summary of their main features, but technical details should not be discussed at length. You might instead consider providing the interested reader with references to the relevant literature.
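One way to show that the conditions for a method have been verified is to check them in code before reporting the result. The sketch below assumes a chi-squared test of independence and uses a common rule of thumb (every expected count at least 5); the table of counts is hypothetical and is not derived from NBA_sample.csv.

```r
# Hypothetical 2 x 2 table of counts, for illustration only.
counts <- matrix(c(45, 55, 30, 70), nrow = 2,
                 dimnames = list(result = c("made", "missed"),
                                 group  = c("A", "B")))

# Expected cell counts under the hypothesis of independence.
expected <- chisq.test(counts)$expected

# Rule of thumb: every expected count should be at least 5.
all(expected >= 5)
```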

Navigation. Do not assume that the reader will read the report from start to finish, as one might read a novel. Reports should be made easy to navigate using numbered sections and subsections together with cross-referencing. Once you have written a first draft, it will need careful editing before it becomes a coherent and polished report. This stage always takes longer than you think! 

Scientific writing. For scientific reports we aim for a style of writing that is clear and concise. Make sure that sentences are unambiguous and that a good standard of writing is maintained throughout the report.

• Sections should not start abruptly with the subject matter, but rather with an introductory sentence or short paragraph. Sections should also end with a concluding sentence or short paragraph.

• All figures and tables must be numbered and have captions. Figures or tables that are not mentioned at least once in the text should not be included. 

• A qualified statement is one that expresses some level of uncertainty about its own accuracy. Qualified statements should always be used when drawing conclusions from the results of a statistical analysis, and especially when speculating about possible causal factors. Common phrases that indicate qualified statements include “This suggests that ...”, “It appears that ...”, “We might conclude that ...”, “There is some evidence to indicate ...” and so on.

5 Summary of Changes

Here is a summary of how the data set for the resit coursework (NBA_sample.csv) was created from the original NBA data set.

Variables 

1. FINAL_MARGIN, SHOT_RESULT and PTS were removed. 

2. GAME_CLOCK (character type) was replaced by SEC_REMAIN (integer type).

 3. W was renamed WIN_LOSE. 

4. FGM was renamed SUCCESS.
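The replacement of GAME_CLOCK by SEC_REMAIN can be illustrated with a short R sketch. It assumes that GAME_CLOCK held strings of the form "M:SS" (minutes and seconds remaining), which this document does not confirm; the helper name clock_to_seconds and the example values are hypothetical.

```r
# Convert clock strings such as "1:09" to whole seconds remaining.
# Assumption: the input has the form "M:SS" or "MM:SS".
clock_to_seconds <- function(clock) {
  parts <- strsplit(clock, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.integer(p[1]) * 60L + as.integer(p[2]),
         integer(1))
}

# Made-up clock values, for illustration only.
sec_remain <- clock_to_seconds(c("1:09", "0:14", "11:47"))
sec_remain  # 69 14 707
```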

Cleaning 

1. Misspelled names in PLAYER_NAME were corrected. 

2. Missing data in SHOT_CLOCK were corrected or removed.

 3. Negative values in TOUCH_TIME were removed. 

4. Outliers in PTS_TYPE were removed: 

• 2-pointers with SHOT_DIST larger than 23.75 feet; 

• 3-pointers with SHOT_DIST smaller than 22 feet.
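The outlier rule for PTS_TYPE can be expressed as a logical filter. The following is a minimal sketch on a small made-up data frame, not the code that was actually used to prepare the data set; only the column names PTS_TYPE and SHOT_DIST and the two thresholds come from the rule above.

```r
# Made-up shots, for illustration only.
shots <- data.frame(
  PTS_TYPE  = c(2, 2, 3, 3, 2),
  SHOT_DIST = c(5.1, 24.3, 25.0, 18.2, 23.75)
)

# Flag 2-pointers taken from beyond 23.75 feet
# and 3-pointers taken from inside 22 feet.
is_outlier <- (shots$PTS_TYPE == 2 & shots$SHOT_DIST > 23.75) |
              (shots$PTS_TYPE == 3 & shots$SHOT_DIST < 22)

# Keep only the rows that are not flagged.
cleaned <- shots[!is_outlier, ]
nrow(cleaned)  # 3
```

Note that a 2-pointer at exactly 23.75 feet is kept, because the rule removes only distances larger than 23.75 feet.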

Sampling 

1. Entries corresponding to shots taken during overtime periods were removed. 

2. Entries corresponding to the game PHI vs GSW (Feb 9, 2015) were removed.

 The modified data set contained 123,257 entries, compared with 128,069 for the original data set. A random sample of 50,000 entries was then chosen uniformly and without replacement from the modified data set, to create the data set (NBA_sample.csv) for this assignment.
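The sampling step can be sketched in R as follows. The seed is arbitrary and included only to make the sketch reproducible; the seed and code used to create NBA_sample.csv are not given in this document.

```r
set.seed(2021)  # arbitrary seed, an assumption made for reproducibility

n_modified <- 123257  # entries in the modified data set
n_sample   <- 50000   # entries in NBA_sample.csv

# Draw row indices uniformly at random, without replacement.
rows <- sample(n_modified, size = n_sample, replace = FALSE)

length(rows)         # 50000
anyDuplicated(rows)  # 0, since no row is selected twice
```

The selected indices could then be used to subset a data frame holding the modified data set, for example modified[rows, ], where modified is a hypothetical name.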


