R assignment writing: MATH3821
Date: 2022-07-10
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH3821 Statistical Modelling and Computing
Midsession Test, T2 2021
Time allowed 60 Minutes
Total number of questions 2
Where the question is multiple choice, place your selection (a, b, ...) in the answer box provided. You
can choose more than one.
Where the question is not multiple choice, provide the answer in the form specified in the answer box.
When you finish, you may download the notebook and submit it as an .ipynb file or a PDF file.
I have provided space at the end of the test for you to provide your answers in a concise format.
Student Name:
Student ID:
Question 1 [Total: 9 marks, 3 parts]
Let $Y$ be a random variable with values in $\mathbb{R}$ (or $\mathbb{N}_0$ or $\mathbb{Z}$), and suppose its probability distribution $f(y; \theta)$
depends on a single parameter $\theta$. Then it belongs to the exponential family if it admits the form
$$f(y; \theta) = s(y)\,t(\theta)\,e^{a(y)b(\theta)}$$
for some (measurable) functions $a, b, s, t$.
Another way of writing this is
$$f(y; \theta) = \exp[a(y)b(\theta) + c(\theta) + d(y)]$$
where $s(y) = e^{d(y)}$ and $t(\theta) = e^{c(\theta)}$.
Identify whether the following probability density functions belong to the exponential family. If yes, is the
distribution of $Y$ in canonical form, and what is the natural parameter of this distribution?
a). [3 marks] For the Pareto distribution $f(y; \theta) = \theta y^{-\theta-1}$, which of the following statement(s) are true?
(a) $Y$ does not belong to the exponential family of distributions;
(b) $Y$ belongs to the exponential family of distributions and is of canonical form with natural parameter: $-(\theta+1)$;
(c) $Y$ belongs to the exponential family of distributions but is not of canonical form;
(d) $Y$ belongs to the exponential family of distributions and is of canonical form with natural parameter: $-2(\theta+1)$.
Answer:
b). [3 marks] For the Exponential distribution $f(y; \theta) = \theta e^{-y\theta}$, which of the following statement(s) are true?
(a) $Y$ does not belong to the exponential family of distributions;
(b) $Y$ belongs to the exponential family of distributions and is of canonical form with natural parameter: $-\theta$;
(c) $Y$ belongs to the exponential family of distributions but is not of canonical form;
(d) $Y$ belongs to the exponential family of distributions and is of canonical form with natural parameter: $\theta$.
Answer:
c). [3 marks] For the Negative binomial distribution ($r$ is known) $f(y; \theta) = \binom{y+r-1}{r-1}\theta^{r}(1-\theta)^{y}$, which of
the following statement(s) are true?
(a) Does not belong to the exponential family of distributions;
(b) Belongs to the exponential family of distributions and is of canonical form with natural parameter: $\log(1-\theta)$;
(c) Belongs to the exponential family of distributions but is not of canonical form;
(d) Belongs to the exponential family of distributions and is of canonical form with natural parameter: $1-\theta$.
Answer:
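As a point of reference for the definition in Question 1, here is a short worked illustration using the Poisson distribution, which is not one of the distributions examined above. It follows the notation $a, b, c, d$ from the definition and assumes the usual convention that a distribution is in canonical form when $a(y) = y$, with $b(\theta)$ then called the natural parameter.

```latex
\begin{align*}
f(y;\theta) &= \frac{\theta^{y} e^{-\theta}}{y!}, \qquad y = 0, 1, 2, \dots \\
            &= \exp\left[\, y \log\theta - \theta - \log y! \,\right],
\end{align*}
so that
\[
a(y) = y, \qquad b(\theta) = \log\theta, \qquad c(\theta) = -\theta, \qquad d(y) = -\log y! .
\]
```

Because $a(y) = y$, this Poisson density is in canonical form and its natural parameter is $\log\theta$. The same term-by-term matching can be attempted for each density in parts a) to c).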
Question 2 [Total: 14 marks, 7 parts]
The data below are times to death, $y_i$, in weeks from diagnosis and $\log_{10}$(initial white blood cell count), $x_i$,
for seventeen patients suffering from leukemia.
In [1]: # y: time to death in weeks; x: log10 of the initial white blood cell count
y <- c(65, 156, 100, 134, 16, 108, 121, 4, 39, 143, 56, 26, 22, 1, 1, 5, 65)
x <- c(3.36, 2.88, 3.63, 3.41, 3.78, 4.02, 4.00, 4.23, 3.73, 3.85, 3.97, 4.51, 4.54, 5.00, 5.00, 4.72,
       5.00)
a). [1 mark] From an appropriate visualisation of the data, do the data show any trend?
(a) no trend in the data;
(b) when $x$ increases $y$ also increases approximately linearly;
(c) when $x$ increases $y$ decreases approximately exponentially.
Answer:
b). [2 marks] A possible specification for $E(Y)$ is
$$E(Y_i) = \exp(\beta_0 + \beta_1 x_i)$$
which will ensure that $E(Y)$ is non-negative for all values of the parameters and all values of $x$. Which link
function is appropriate in this case?
(a) $\exp$;
(b) $\log$;
(c) identity function.
Answer:
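For the visualisation asked for in part a), a minimal sketch in base R might look like the following. It re-enters the data from Question 2 so that it runs on its own. The second panel on the log scale, the axis labels, and the variable name `r_xlogy` are additions for illustration only and are not part of the test.

```r
# Data as given in Question 2
y <- c(65, 156, 100, 134, 16, 108, 121, 4, 39, 143, 56, 26, 22, 1, 1, 5, 65)
x <- c(3.36, 2.88, 3.63, 3.41, 3.78, 4.02, 4.00, 4.23, 3.73, 3.85, 3.97,
       4.51, 4.54, 5.00, 5.00, 4.72, 5.00)

# Two side-by-side scatter plots: y on the raw scale and on the log scale
op <- par(mfrow = c(1, 2))
plot(x, y,
     xlab = "log10(initial white blood cell count)",
     ylab = "Time to death (weeks)",
     main = "Raw scale")
plot(x, log(y),
     xlab = "log10(initial white blood cell count)",
     ylab = "log(time to death)",
     main = "Log scale")
par(op)  # restore the previous plotting layout

# Sample correlation between x and log(y), as a rough numerical companion to the plot
r_xlogy <- cor(x, log(y))
```

Plotting `log(y)` against `x` is one way to judge whether a model of the form $E(Y_i) = \exp(\beta_0 + \beta_1 x_i)$ is plausible, since that form is linear in $x$ on the log scale.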
c). [2 marks] The exponential distribution is often used to describe survival times. The probability
distribution is
$$f(y; \theta) = \theta \exp(-y\theta).$$
You would like to fit a model with the equation for $E(Y_i)$ given by
$$E(Y_i) = \exp(\beta_0 + \beta_1 x_i)$$
and the exponential distribution using glm() in R. What should the family object be set to in order to fit such a model?
(a) family=Gamma(link="log");
(b) family=Gamma;
(c) family=gaussian(link="exp");
(d) family=Gamma(link="exp");
(e) family=binomial(link="log");
(f) family=gaussian(link="log").
As a hint, see the following extract from the family object documentation in R:
family(object, ...)
binomial(link = "logit")
gaussian(link = "identity")
Gamma(link = "inverse")
inverse.gaussian(link = "1/mu^2")
poisson(link = "log")
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")
Answer:
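The documentation extract in the hint lists no family object dedicated to the exponential distribution, so it can help to recall how the density $f(y; \theta) = \theta \exp(-y\theta)$ relates to the distributions R does provide. The sketch below checks numerically that it coincides with a gamma density with shape 1. The rate `theta` and the evaluation points `y_grid` are hypothetical values chosen only for illustration.

```r
theta  <- 0.5                     # hypothetical rate parameter
y_grid <- c(0.5, 1, 2, 5, 10)     # hypothetical evaluation points

# Density written out from the formula in part c)
f_formula <- theta * exp(-y_grid * theta)

# The same density through R's built-in exponential and gamma densities
f_dexp   <- dexp(y_grid, rate = theta)
f_dgamma <- dgamma(y_grid, shape = 1, rate = theta)

# TRUE when all three agree up to floating-point tolerance
same_density <- isTRUE(all.equal(f_formula, f_dexp)) &&
  isTRUE(all.equal(f_dexp, f_dgamma))
```

A gamma distribution with shape 1 corresponds to a dispersion parameter of 1. The `summary()` method for glm objects accepts a `dispersion` argument, so the dispersion can be fixed rather than estimated when that is the intended model.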
d). [2 marks] You have now fitted the model with the equation for $E(Y_i)$ given by
$$E(Y_i) = \exp(\beta_0 + \beta_1 x_i)$$
and the exponential distribution using glm() in R and obtained the following result:

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.9922  -1.2102  -0.2242   0.2102   1.5646

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   8.4775     1.6034   5.287 9.13e-05 ***
x            -1.1093     0.3872  -2.865   0.0118 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.9388638)

    Null deviance: 26.282  on 16  degrees of freedom
Residual deviance: 19.457  on 15  degrees of freedom
AIC: 173.97

Number of Fisher Scoring iterations: 8

What are the parameter estimates and are they significant assuming that $\alpha = 0.05$?
(a) $\hat{\beta}_0 = 1.6034$ and $\hat{\beta}_1 = 0.3872$ and they are both significant;
(b) $\hat{\beta}_0 = 8.4775$ and $\hat{\beta}_1 = 1.6034$ and they are both significant;
(c) $\hat{\beta}_0 = 8.4775$ and $\hat{\beta}_1 = -1.1093$ and $\beta_1$ is not significant;
(d) $\hat{\beta}_0 = 8.4775$ and $\hat{\beta}_1 = -1.1093$ and they are both significant;
(e) $\hat{\beta}_0 = 8.4775$ and $\hat{\beta}_1 = 1.6034$ and $\beta_1$ is not significant.
Answer:
e). [2 marks] What is the 95% confidence interval for the parameter $\beta_0$ in the model
$$E(Y_i) = \exp(\beta_0 + \beta_1 x_i)$$
when using the above R calculations? Enter two values, the lower limit first and then the upper limit, rounded to four
decimal places.
Answer (Lower Limit):
Answer (Upper Limit):
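An interval of the kind asked for in part e) is commonly built as estimate ± critical value × standard error. The sketch below applies that recipe to the (Intercept) row of the coefficient table above. The test does not say whether a normal or a $t$ critical value is intended, so both are computed here as an assumption, and all variable names are illustrative.

```r
# Values read from the (Intercept) row of the coefficient table
est      <- 8.4775   # estimate of beta_0
se       <- 1.6034   # its standard error
resid_df <- 15       # residual degrees of freedom reported in the output

level <- 0.95
alpha <- 1 - level

# Critical values under two possible conventions (assumption: either may be intended)
z_crit <- qnorm(1 - alpha / 2)                # normal approximation
t_crit <- qt(1 - alpha / 2, df = resid_df)    # t distribution on the residual df

# Interval = estimate -/+ critical value * standard error
ci_z <- est + c(-1, 1) * z_crit * se
ci_t <- est + c(-1, 1) * t_crit * se

round(ci_z, 4)
round(ci_t, 4)
```

The $t$-based interval is always the wider of the two, because the $t$ critical value exceeds the normal one for any finite degrees of freedom.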
f). [3 marks] By comparing the deviances for two appropriate models, test the null hypothesis $\beta_1 = 0$
against the alternative hypothesis $\beta_1 \neq 0$ for the model
$$E(Y_i) = \exp(\beta_0 + \beta_1 x_i)$$
by completing the following equation:
$$D_0 - D_1 = \ldots - \ldots = \ldots > \ldots$$
Enter the values rounded to three decimal places in the order in which they appear in the equation.
Answer (Enter the value of D_0 here):
Answer (Enter the value of D_1 here):
Answer (Enter the value of df for the $\chi^2$ statistic here):
Answer (Enter the critical value of the $\chi^2$ distribution for a significance test at the
5% level):
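The comparison in part f) can be sketched directly from the null and residual deviances printed in the glm output of part d). This is a minimal illustration that assumes the deviances are compared unscaled, that is, with the dispersion taken as 1 as for an exponential model. The variable names are hypothetical.

```r
# Deviances and degrees of freedom as printed in the glm summary
D0  <- 26.282; df0 <- 16   # null model (intercept only)
D1  <- 19.457; df1 <- 15   # model with x

# Test statistic and its degrees of freedom
delta_D  <- D0 - D1
delta_df <- df0 - df1

# Upper 5% critical value of the chi-squared distribution
crit <- qchisq(0.95, df = delta_df)

# TRUE when the statistic exceeds the critical value
reject_H0 <- delta_D > crit

round(c(statistic = delta_D, df = delta_df, critical = crit), 3)
```

If the dispersion were estimated rather than fixed, the deviance difference would usually be scaled by the dispersion estimate before being compared with a reference distribution.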
g). [2 marks] What can you conclude about the use of the initial white blood cell count as a predictor of survival time?
Answer (give a statement of your conclusion):
YOU MAY WISH TO SUMMARISE YOUR ANSWERS HERE
QUESTION 1 (ONLY A, B, C, D)
1.
2.
3.
QUESTION 2
1.
2.
3.
4.
5.LOWER
5.UPPER
6.$D_0$
6.$D_1$
6.DF
6.$\chi^2$
7.