R assignment help: ETF2020 Assignment 1
Date: 2021-09-01
ETF2020 Statistical Foundations of Business Analytics
Assignment 1 — Global Mean Sea Level
Important notes:
1. This is an individual assignment. This assignment is worth 25% of this unit’s total mark.
Marks will be deducted for late submission on the following basis: 20% for each day late,
up to a maximum of 3 days. Assignments more than 3 days late will not be marked.
2. The submission deadline for this coursework is 1pm on 2nd September. Please submit a
soft copy and your R script through Moodle. Name the soft copy as follows: student ID Name.pdf
(or .doc). A PDF file is preferred, but a Word file is also acceptable. Name the R script in
the same fashion. Also, please make sure your student ID and name are given correctly on
the title page.
3. Please comment your code sufficiently so that the markers can read it easily.
Marks may be deducted otherwise. Notation used in the assignment must be typed
correctly and properly. Incorrect notation will be treated as a wrong answer.
In this case study, we investigate the global mean sea level. A description of the dataset
to be studied can be found HERE. The following figure is copied from the website of NOAA.
We aim to describe the global mean sea level in more detail using what we have covered so
far. Specifically, we consider the dataset “GMSL cleaned.csv”, which is available on Moodle.
Questions:
1. How many years are covered in the dataset? Each row of the dataset represents one
observation. Count the Absolute Frequency associated with each year. Report these numbers
and the corresponding Relative Frequencies in a table. Report the Relative Frequencies to
THREE decimal places. Plot the relative frequency of each year in a bar chart. [5 points]
2. What are the minimum and maximum values for each year? Report them in a table. What
is the range of the entire dataset? [5 points]
3. Focus on the maximum values from the second question above. Consider five probabilities
(0, 0.25, 0.5, 0.75, 1), and generate the corresponding quantiles. Report these quantiles in a
table, and then plot the empirical distribution function using these quantiles. [5 points]
4. Now, we have three students. Based on the data of “GMSL cleaned.csv”, each of them
wants to randomly pick the data from one year to look into. The order in which they pick
does not matter, and more than one student may study the data of the same year. How many
possible outcomes can we have? Write down the result, and verify the calculation with the
corresponding R code. [5 points]
5. Again, we have three students. Based on the data of “GMSL cleaned.csv”, each of them
wants to randomly pick the data from one year to look into. The order in which they pick
does not matter, but they have to study three different years. How many possible outcomes
can we have? Write down the result, and verify the calculation with the corresponding R code. [5 points]
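As a rough illustration of the counting in Questions 4 and 5, the sketch below compares the closed-form counts with a brute-force enumeration. The value n_years = 5 is a made-up placeholder, not the number of years in “GMSL cleaned.csv”, and the variable names are hypothetical. It assumes that "order does not matter" means outcomes are unordered selections of years.

```r
# Hypothetical number of distinct years; replace with the count from the data.
n_years <- 5
k <- 3  # three students

# Unordered selections with repetition allowed: choose(n + k - 1, k).
with_repeats <- choose(n_years + k - 1, k)

# Unordered selections of k different years: choose(n, k).
without_repeats <- choose(n_years, k)

# Brute-force check: list every ordered pick, sort within each pick so that
# the order no longer matters, then keep the distinct rows.
picks <- expand.grid(s1 = 1:n_years, s2 = 1:n_years, s3 = 1:n_years)
sorted_picks <- unique(t(apply(picks, 1, sort)))
enum_with_repeats <- nrow(sorted_picks)

# Among those, keep only the rows whose three years are all different.
all_distinct <- apply(sorted_picks, 1, function(r) length(unique(r)) == k)
enum_without_repeats <- sum(all_distinct)

print(c(formula = with_repeats, enumerated = enum_with_repeats))
print(c(formula = without_repeats, enumerated = enum_without_repeats))
```

The enumeration is only practical for a small number of years; for the real dataset the choose() formulas are the quantities to report, and ncol(combn(n_years, k)) is another way to check the second count.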
Hints:
1. Note that there are missing values in the csv file.
2. “round()” might be useful when formatting decimal places.
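The two hints, and the summaries asked for in Questions 1 to 3, can be sketched on a small made-up data frame. The column names year and gmsl and all values below are assumptions for illustration only; the real file may be laid out differently, and dropping incomplete rows with na.omit() is just one possible way to handle the missing values.

```r
# Toy data standing in for the real file; the column names and values are
# invented for this sketch and are not taken from "GMSL cleaned.csv".
toy <- data.frame(
  year = c(1993, 1993, 1993, 1994, 1994, 1995, 1995, 1995, 1995, NA),
  gmsl = c(-38.6, -37.1, NA, -34.2, -30.9, -29.5, -28.0, -27.4, -25.1, -24.0)
)

# Drop rows with a missing value in either column before summarising.
clean <- na.omit(toy)

# Absolute and relative frequency of each year, rounded to three decimals.
abs_freq <- table(clean$year)
rel_freq <- round(prop.table(abs_freq), 3)
freq_table <- data.frame(
  year = names(abs_freq),
  absolute = as.vector(abs_freq),
  relative = as.vector(rel_freq)
)

# Bar chart of the relative frequencies.
barplot(rel_freq, xlab = "Year", ylab = "Relative frequency")

# Minimum and maximum per year, then the range (max minus min) overall.
yearly_min <- tapply(clean$gmsl, clean$year, min)
yearly_max <- tapply(clean$gmsl, clean$year, max)
overall_range <- diff(range(clean$gmsl))

# Quantiles of the yearly maxima at the five probabilities.
probs <- c(0, 0.25, 0.5, 0.75, 1)
q_max <- quantile(yearly_max, probs = probs)

# Step plot through the (quantile, probability) pairs.
plot(q_max, probs, type = "s",
     xlab = "Yearly maximum", ylab = "Cumulative probability")
```

Note that quantile() uses its default interpolation rule (type = 7) here; if the unit defines quantiles differently, the type argument would need to change accordingly.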