MATH1062: Assignment 2

Date: 2025-05-10

The University of Sydney
School of Mathematics and Statistics
Assignment 2 Part B: Statistics
MATH1062: Mathematics 1B Semester 1, 2025
Lecturers: Tiangang Cui and Jun Yong Park
This individual assignment is due by 11:59pm Sunday 11 May 2025, via Canvas.
Late assignments will receive a penalty of 5% per day until the closing date. Your
answers must be uploaded to Canvas following the instructions at the beginning of
each Part of this document. Please make sure you review your submission carefully.
What you see is exactly how the marker will see your assignment. Submissions can be
overwritten until the due date. To ensure compliance with our anonymous marking
obligations, please do not under any circumstances include your name in any area of
your assignment. The School of Mathematics and Statistics encourages some collaboration
between students when working on problems, but students must write up and
submit their own version of the solutions. Even though the use of AI is allowed, it
is better for your learning to do your own work to complete the assignment. If you
have technical difficulties with your submission, see the University of Sydney Canvas
Guide, available from the Help section of Canvas.
This assignment has two parts. It is worth a total of 5% + 5% = 10% of your final assessment
for this unit. Please cite any resources used, including AI, and show all working. Present your
arguments clearly using words of explanation and diagrams where relevant. The marker will
give you feedback and allocate an overall mark to your assignment using the following criteria:
Copyright © 2025 The University of Sydney
Part B: Statistics
Solutions to the statistics part (Part B) must be prepared in written form, and uploaded as
a single pdf file to https://canvas.sydney.edu.au/courses/64063/assignments/598720.
The statistics part of Assignment 2 contains three questions, each with multiple parts.
Background
In this assignment, we will use simulated climate data based on real observations from the
Bureau of Meteorology at Canterbury Racecourse AWS (station 066194), collected in 2023.
The simulated dataset includes various daily measurements over a period of 117 days.
For this assignment, we will focus on the daily morning temperature (morning.temp), daily
maximum temperature (max.temp), and daily relative humidity (humidity). Boxplots of these
variables are shown below.
Temperatures are measured in degrees Celsius, and relative humidity is expressed as a
percentage (taking values between 0 and 100). To avoid confusion, please focus on the given data
values rather than their units in answering the following questions.
1. (a) Using the following R output, write down step-by-step the equation of the linear
regression model to predict the value for max.temp given the value of morning.temp.
Round the slope and intercept to two decimal places.
> summary(morning.temp)
Min. 1st Qu. Median Mean 3rd Qu. Max.
17.10 19.98 21.00 21.10 21.93 25.70
> summary(max.temp)
Min. 1st Qu. Median Mean 3rd Qu. Max.
20.52 25.75 26.99 27.37 29.02 33.87
> sd(morning.temp)
[1] 2.218058
> sd(max.temp)
[1] 2.875926
> cor(max.temp, morning.temp)
[1] 0.9386172
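As a rough illustration of how the quantities in this output combine, the sketch below uses the standard least-squares identities: the slope is the correlation times the ratio of standard deviations, and the fitted line passes through the point of means. The variable names (mean_x, sd_x, and so on) are introduced here for illustration only and are not part of the assignment. The means are taken from summary(), which prints rounded values, so the intercept is approximate.

```r
# Summary statistics copied from the R output above.
mean_x <- 21.10      # mean of morning.temp (predictor), rounded by summary()
mean_y <- 27.37      # mean of max.temp (response), rounded by summary()
sd_x   <- 2.218058   # sd(morning.temp)
sd_y   <- 2.875926   # sd(max.temp)
r      <- 0.9386172  # cor(max.temp, morning.temp)

# Least-squares slope: correlation times SD of the response over SD of the predictor.
slope <- r * sd_y / sd_x

# The regression line passes through (mean_x, mean_y), which fixes the intercept.
intercept <- mean_y - slope * mean_x

# Round only at the end, as the question asks for two decimal places.
round(c(intercept = intercept, slope = slope), 2)
```

Computing the intercept from the unrounded slope, and rounding only in the last step, avoids carrying rounding error from one coefficient into the other.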
(b) The plot below shows the residuals after fitting the regression line in (a). Comment
on whether the regression line is a good fit.
2. A relative humidity above 65 is often considered high and may pose potential health
risks.
Using historical data from the early 1900s, researchers at the Bureau of Meteorology
estimated that 18% of days had relative humidity exceeding 65. They claimed that the
current proportion of days with risky humidity levels (i.e., relative humidity exceeding
65) is consistent with the level observed in the early 1900s.
We want to test whether the data provided in this assignment is consistent with their
claim that “18% of days have relative humidity exceeding 65”. The following R output
is useful.
> sum(humidity > 65)
[1] 30
> length(humidity)
[1] 117
> round(qnorm(c(0.95, 0.955, 0.96, 0.965, 0.97)), 3)
[1] 1.645 1.695 1.751 1.812 1.881
> round(qnorm(c(0.975, 0.98, 0.985, 0.99, 0.995)), 3)
[1] 1.960 2.054 2.170 2.326 2.576
(a) State the null and alternative hypotheses for this test. In your answer, introduce
an appropriate parameter and state your null and alternative hypotheses in
terms of this parameter.
(b) Calculate the expected value and standard error of the sample proportion, assum-
ing the null hypothesis is true. Round your calculations to three decimal places.
(c) Assuming the Central Limit Theorem holds, calculate the two-sided 98% prediction
interval that can be used to test whether the data is consistent with the claimed
18% in the null hypothesis. You can use the provided R output. Round your
calculations to three decimal places.
(d) First, use the provided R output to calculate the observed sample proportion, and
then compute the P-value based on this observed proportion. You may need to
use the pnorm() function in R to calculate the P-value. You must include the R
command and its output in your submission, either as a screenshot or written by
hand.
(e) What is the conclusion of your hypothesis test at the 2% significance level? Is the
observed sample proportion significantly different from 18%? What assumptions
do we need about our data to make our hypothesis test valid? Your answer should
have three things:
1. At most one sentence stating the conclusion of your hypothesis test.
2. State a reason for your conclusion. At most two sentences.
3. At most two sentences explaining what assumptions we used during our
hypothesis test.
(f) Without calculating the actual confidence interval, will the 98% confidence interval
for the observed sample proportion cover the claimed 18%? You should provide a
Yes/No answer and justify your response.
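The calculations in Question 2 can be sketched in R as follows. This is a minimal sketch, assuming the usual normal approximation to the sample proportion under the null hypothesis; the names n, x, p0, se0 and so on are chosen here for illustration and do not appear in the assignment. It is not a model answer, and any written solution still needs the hypotheses, working and interpretation the question asks for.

```r
# Counts copied from the R output above.
n  <- 117   # length(humidity)
x  <- 30    # sum(humidity > 65)
p0 <- 0.18  # proportion claimed under the null hypothesis

# Under H0 the sample proportion has expected value p0 and this standard error.
se0 <- sqrt(p0 * (1 - p0) / n)

# Two-sided 98% prediction interval: 1% in each tail, so the multiplier is qnorm(0.99).
z_crit   <- qnorm(0.99)
pred_int <- p0 + c(-1, 1) * z_crit * se0

# Observed sample proportion and its standardised distance from p0.
p_hat <- x / n
z_obs <- (p_hat - p0) / se0

# Two-sided P-value: chance of a value at least this far from p0 in either direction.
p_two_sided <- 2 * pnorm(-abs(z_obs))

round(c(se0 = se0, lower = pred_int[1], upper = pred_int[2],
        p_hat = p_hat, z = z_obs, p_value = p_two_sided), 3)
```

The multiplier qnorm(0.99) matches the value 2.326 listed in the provided output, so the same interval can be obtained by hand from the printed quantiles.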
3. Suppose we now want to determine whether the current proportion of days with risky
humidity levels (i.e., relative humidity exceeding 65) is higher than the level observed in
the early 1900s. How would you formulate the hypothesis test, and what would be the
conclusion?
We will use the same R output from Question 2 to answer this question.
(a) State the null and alternative hypotheses for this test. You may use the same
parameter defined in Question 2. In addition, state which values of the test statistic
argue against the null hypothesis.
(b) Using the P-value you calculated in Q2(d), calculate the P-value for this new test.
What is the conclusion of your hypothesis test at the 2% significance level?
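The relationship between the two-sided and one-sided P-values can be sketched as below. This is an illustrative sketch under the same normal-approximation assumption as Question 2; the variable names are hypothetical, and the block recomputes the test statistic from the counts in the R output so that it runs on its own.

```r
# Counts copied from the R output in Question 2.
n  <- 117   # length(humidity)
x  <- 30    # sum(humidity > 65)
p0 <- 0.18  # proportion under the null hypothesis

# Test statistic under H0, computed as for the two-sided test.
se0   <- sqrt(p0 * (1 - p0) / n)
z_obs <- (x / n - p0) / se0

# Two-sided P-value, as in Q2(d).
p_two_sided <- 2 * pnorm(-abs(z_obs))

# Upper-tailed alternative: only large values of the statistic count as evidence.
p_upper <- pnorm(z_obs, lower.tail = FALSE)

# Compare against the 2% significance level.
alpha <- 0.02
print(round(c(p_two_sided = p_two_sided, p_upper = p_upper), 4))
print(p_upper < alpha)
```

Because the observed proportion lies above p0, the upper-tail P-value is exactly half the two-sided one, which is why the two tests can lead to different decisions at the same significance level.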