Date: 2022-03-17
Coursework: Statistics CIVE50008
Deadline: 21/03/2022 – online submission of a single Matlab script
The objective of this coursework is to provide a statistical analysis of
measurements of temperature and rainfall. Understanding of both variables is
important for hydrological applications (in particular drought management and
flood estimation): while the rainfall is the direct input to hydrological models,
temperature is used to estimate actual evaporation.
The dataset comprises 5 variables in an .xlsx file called “dataset.xlsx”. The
day, month and year of the measurements can be found in columns A, B, and C,
respectively. Column D contains the temperature measurements (in degrees
Celsius) and Column F contains the rainfall measurements (in mm). All these
measurements correspond to values recorded at 8pm in geographically close
locations.
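
One way to pull the described columns into Matlab is sketched below. It assumes `readmatrix` (available from R2019a) and that spreadsheet columns A to F map to matrix columns 1 to 6. The variable names and the small fallback matrix are hypothetical and are only there so that the snippet runs without the file.

```matlab
% Sketch only: column positions follow the description above (A-C date,
% D temperature, F rainfall). Variable names are illustrative assumptions.
if isfile('dataset.xlsx')
    raw = readmatrix('dataset.xlsx');   % numeric matrix, one row per day
else
    % Hypothetical stand-in rows: day, month, year, temp, (unused), rain
    raw = [1, 1, 2000,   4.2, NaN, 0.0;
           2, 1, 2000, -99.0, NaN, 1.4;
           3, 1, 2000,   5.1, NaN, 2.3];
end

month       = raw(:, 2);   % column B
temperature = raw(:, 4);   % column D, degrees Celsius
rainfall    = raw(:, 6);   % column F, mm

fprintf('Rows read: %d\n', size(raw, 1));
```

If the real spreadsheet has a header row or a different column layout, the indices above would need adjusting.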
Use the template file called “Surname_FirstName_CIDnumber.m” to perform the
statistical analysis. Rename this template with your own details before submitting. All
your lines of Matlab code should be gathered into one Matlab script so that
when it is run, the answers appear in the command window, and the required
figures are generated. Indicate the question you are answering using a comment
line before the code addressing the question. If you need to obtain answers using
a Matlab GUI, please put the answers in comment lines. When you are asked to
comment on a result, please write your answer in up to five comment lines.
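
The script layout described above might look roughly like the following. This is a minimal sketch on made-up numbers, not the provided template: the sample values, variable names and question labels are assumptions chosen for illustration.

```matlab
% Sketch of the requested script layout; the data values are made up.
temperature = [4.2, 5.1, 3.8, 6.0, 5.5];   % hypothetical sample, degrees Celsius

% Question 3: mean of the temperature sample
meanT = mean(temperature);
fprintf('Q3 mean temperature: %.2f degC\n', meanT);   % printed to the command window

% Question 4 (written answer, at most five comment lines):
% The spread of this toy sample is small because the values were
% chosen for illustration only.

% Question 5: empirical CDF figure
sortedT = sort(temperature);
cdfT = (1:numel(sortedT)) / numel(sortedT);   % step heights 1/n, 2/n, ..., 1
figure;
stairs(sortedT, cdfT);
xlabel('Temperature (degC)');
ylabel('Empirical CDF');
title('Question 5');
```

Each block starts with a comment line naming the question, numerical answers go through `fprintf` so they appear when the script is run, and written answers stay in comment lines.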
Coursework tasks
1. Import the data and create an array (X) that contains three columns: Month,
temperature and precipitation on the days when measurements are
available. There are days with missing measurements; these are indicated by
-99.0 in the data set. If one measurement is missing on one day, all
measurements on that day should be ignored. The recorded measurements will
be assumed, for the rest of the coursework, to be genuine, i.e. not erroneous.
2. Populate an array Y with the same information as X but concerning the days
with non-zero rainfall.
3. Calculate the measures of mean, mode and median for the temperature and
rainfall data (both for the full-series and non-zero series).
4. Calculate the variance, standard deviation, coefficient of variation, and mean
absolute deviation of all variables. Briefly compare and comment on the
observed dispersion and the differences between the two series.
5. Plot the Cumulative Distribution Function (CDF) for each variable (full-series).
6. Produce a plot that indicates if the temperature and rainfall data (full-series)
are skewed. Briefly describe how this is shown in the plot.
7. Explore appropriate distributions to approximate the temperature and rainfall
data in the full series using q-q plots.
8. Produce histograms with a variable number of bins for the temperature data
(full series), and the non-zero rainfall data. Use 5, 10, 20 and 30 bins and
comment on the differences you observe based on the bin size.
9. Consider the non-zero rainfall series. Fit (continuous) distributions to the
temperature and rainfall data and investigate which one provides the best
fit. Produce the relevant figures and use an appropriate metric to justify your
selection of the best-fit distribution. Which is the simplest distribution
appropriate to each variable?
10. For the best-fit distributions you identified in (9), use both the Method of
Moments and the Maximum Likelihood Method to obtain the parameters of each
distribution (one distribution for each variable). Comment on the results.
11. Using the 2 distributions obtained in (10) and their parameters, generate an
appropriate number of synthetic data series. Use these synthetic data to explore
the validity of the law of large numbers and the central limit theorem.
12. Comment on the correlation of the two variables (temperature – rainfall).
Compare your findings for the full-series and the non-zero rainfall series.
13. Physical knowledge of the processes suggests that temperature should
resemble a Normal distribution and non-zero rainfall should follow a Gamma
distribution. Assume that this is indeed the case and that the two variables are
independent. Construct a bivariate distribution to describe the joint PDF of
temperature and non-zero rainfall. Create a surface plot of the joint PDF using
appropriate bounds and parameters.
14. Give a 90% confidence interval for the population mean and a 90%
confidence interval for the population standard deviation for the temperature
data (full series).
15. Test with 1% significance the hypothesis that the normal distribution is a
good fit for the temperatures (full series). Perform the test using two different
methods and indicate the main difference between the two approaches.
16. Test with 90% confidence the hypothesis that daily August rainfall depths
that are larger than 1 mm are from the same population as daily March rainfall
depths that are larger than 1 mm, using a test that compares the means.
Comment on the test and its results.
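
As an illustration of how tasks 1 to 4 fit together, the sketch below filters out days flagged with -99.0, builds the non-zero rainfall subset and computes the dispersion measures. It runs on a small made-up array rather than “dataset.xlsx”, and all variable names other than X and Y are assumptions. Only base Matlab functions are used, so the mean absolute deviation is computed by hand.

```matlab
% Sketch for tasks 1-4 on a made-up array; columns: month, temperature, rainfall.
raw = [1,   4.2,   0.0;
       1, -99.0,   1.4;    % missing temperature -> whole day dropped
       2,   5.1,   2.3;
       2,   3.8, -99.0;    % missing rainfall -> whole day dropped
       3,   6.0,   0.0;
       3,   5.5,   0.7];

missingFlag = -99.0;
keep = ~any(raw == missingFlag, 2);   % true for rows with no missing entry
X = raw(keep, :);                     % task 1: complete days only
Y = X(X(:, 3) ~= 0, :);               % task 2: days with non-zero rainfall

% Dispersion measures (task 4) using base Matlab functions only
dispersion = @(v) struct( ...
    'variance',   var(v), ...
    'stdDev',     std(v), ...
    'coefVar',    std(v) / mean(v), ...
    'meanAbsDev', mean(abs(v - mean(v))));

tempStats = dispersion(X(:, 2));
rainStats = dispersion(Y(:, 3));

% Central tendency (task 3) for the full-series temperature
fprintf('Temperature: mean %.2f, median %.2f, mode %.2f\n', ...
    mean(X(:, 2)), median(X(:, 2)), mode(X(:, 2)));
fprintf('Temperature CV: %.3f, non-zero rainfall CV: %.3f\n', ...
    tempStats.coefVar, rainStats.coefVar);
```

The same `dispersion` helper can be applied to any column of X or Y. Note that `mode` returns the smallest value when several values are equally frequent, which is common for continuous measurements.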

