Date: 2021-04-01

MAST90084: Statistical Modelling Assignment 2

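1. Let $y_i = (y_{i1}, \cdots, y_{iq})^T$ be a $q \times 1$ random vector following a probability distribution from a multi-parameter exponential family. Namely, the pdf of $y_i$ is

$$f(y_i \mid \theta_i, \phi, w_i) = \exp\left\{ \frac{y_i^T \theta_i - b(\theta_i)}{\phi}\, w_i + c(y_i, \phi, w_i) \right\},$$

where $\theta_i = (\theta_{i1}, \cdots, \theta_{iq})^T$ is a $q \times 1$ natural parameter vector, $\phi$ is a dispersion parameter and $w_i$ is a weight. It is known that

$$E\left[ \frac{\partial \ln f}{\partial \theta_{ij}} \right] = 0, \quad j = 1, \cdots, q,$$

and

$$E\left[ \frac{\partial^2 \ln f}{\partial \theta_{ij}\, \partial \theta_{ij'}} \right] + E\left[ \left( \frac{\partial \ln f}{\partial \theta_{ij}} \right) \cdot \left( \frac{\partial \ln f}{\partial \theta_{ij'}} \right) \right] = 0, \quad j, j' = 1, \cdots, q.$$

Using these two properties, show that

$$E(y_{ij}) = \frac{\partial b(\theta_i)}{\partial \theta_{ij}}, \qquad \mathrm{Var}(y_{ij}) = \frac{\phi}{w_i} \cdot \frac{\partial^2 b(\theta_i)}{\partial \theta_{ij}^2}, \qquad \mathrm{Cov}(y_{ij}, y_{ij'}) = \frac{\phi}{w_i} \cdot \frac{\partial^2 b(\theta_i)}{\partial \theta_{ij}\, \partial \theta_{ij'}}, \quad j, j' = 1, \cdots, q. \quad [5]$$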
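One possible route for question 1, offered as a sketch rather than a model answer, is to differentiate the log-density directly and substitute into the two stated identities. It uses only the symbols defined in the question and assumes $b$ is twice differentiable.

```latex
\begin{align*}
\frac{\partial \ln f}{\partial \theta_{ij}}
  &= \frac{w_i}{\phi}\left( y_{ij} - \frac{\partial b(\theta_i)}{\partial \theta_{ij}} \right),
&
\frac{\partial^2 \ln f}{\partial \theta_{ij}\,\partial \theta_{ij'}}
  &= -\frac{w_i}{\phi}\,\frac{\partial^2 b(\theta_i)}{\partial \theta_{ij}\,\partial \theta_{ij'}} .
\end{align*}
% First identity: the expected score is zero, so
\[
E(y_{ij}) = \frac{\partial b(\theta_i)}{\partial \theta_{ij}} .
\]
% Second identity: the product of scores is a scaled product of centred responses, so
\[
-\frac{w_i}{\phi}\,\frac{\partial^2 b(\theta_i)}{\partial \theta_{ij}\,\partial \theta_{ij'}}
+ \frac{w_i^2}{\phi^2}\,\mathrm{Cov}(y_{ij}, y_{ij'}) = 0
\quad\Longrightarrow\quad
\mathrm{Cov}(y_{ij}, y_{ij'}) = \frac{\phi}{w_i}\,\frac{\partial^2 b(\theta_i)}{\partial \theta_{ij}\,\partial \theta_{ij'}} ,
\]
% and setting j' = j gives the variance formula.
```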

2. You need to install the R package faraway to do this question. The hsb data was collected from the High School and Beyond Study. Type help(hsb) to see the description of the dataset. We want to see how the relevant variables in the data are related to the choice of the type of program (academic, vocational, or general) that the students pursue in high school. The response is multinomial with three levels.

For this problem, you have to show your R code in a clean manner and add appropriate comments when necessary. Points may be deducted for disorganized R code.

(a) Fit a trinomial response model with all variables other than id as predictors (untransformed). Use “academic” as the base level for the response “prog”. Show the estimated coefficients and their standard errors. [5]

(b) For the student with id 99, compute and show the predicted probabilities of the three possible choices. [5]
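As a rough illustration of the mechanics in question 2, the sketch below fits a three-level multinomial logit with multinom() from the nnet package. It runs on simulated data, not on hsb: the data frame d, the predictor score and the coefficients used to simulate are all made up for illustration, and nnet is an assumption since the assignment does not name a fitting function.

```r
library(nnet)  # multinom(); a recommended package shipped with R

set.seed(1)
n <- 300
# Hypothetical data: an id column and one numeric predictor
d <- data.frame(id = seq_len(n), score = rnorm(n, mean = 50, sd = 10))

# Simulate a 3-level response from a baseline-category logit model
p <- cbind(academic = 1,
           general  = exp(1 - 0.03 * d$score),
           vocation = exp(2 - 0.06 * d$score))
p <- p / rowSums(p)
d$prog <- factor(apply(p, 1, function(pr) sample(colnames(p), 1, prob = pr)))

# Make "academic" the reference (base) level of the response
d$prog <- relevel(d$prog, ref = "academic")

# Use every column except id as a predictor
fit <- multinom(prog ~ . - id, data = d, trace = FALSE)

s <- summary(fit)
s$coefficients      # one row per non-base level
s$standard.errors   # matching standard errors

# Predicted class probabilities for a single subject
probs <- predict(fit, newdata = d[d$id == 99, ], type = "probs")
probs
```

Each row of the coefficient matrix is a log-odds contrast against the base level, which is why only two rows appear for three response levels.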

3. In this problem you are to reproduce some of the results in Table 2.7 for Example 2.7 in Fahrmeir and Tutz (F & T). You have to show your R code in a clean manner and add appropriate comments when necessary. Points may be deducted for disorganized R code.

(a) The second column of Table 2.7 in F & T employs the variance function $\sigma^2(\mu) = \phi\mu$. In class, we reproduced the “robust” p-values shown in parentheses using the “robust.variance” component directly accessible from the gee object produced by the gee() function. For this sub-problem you are asked to compute the robust.variance matrix “by hand”, i.e. using elementary matrix operations in R based on the sandwich-estimator formula. When doing this, you are allowed to use other information accessible from the gee object, such as scale, fitted.values, linear.predictors, residuals, naive.variance, etc. However, you must clearly demonstrate that you can reproduce the robust.variance matrix using the sandwich-estimator formula. [10]
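To make the sandwich formula in 3(a) concrete, here is a minimal base-R sketch on simulated counts. It is not the F & T data and does not use a gee object: it assumes a log link, the variance function $\phi\mu$, and that every observation is its own cluster, and it fits the mean model with glm() and the quasipoisson family. All variable names are illustrative.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))   # simulated counts

fit <- glm(y ~ x, family = quasipoisson(link = "log"))
X   <- model.matrix(fit)   # n x p design matrix
mu  <- fitted(fit)         # fitted means
r   <- y - mu              # raw residuals

# "Bread": sum_i D_i' V_i^{-1} D_i. With a log link and V = mu this is
# X' diag(mu) X (phi cancels in the final sandwich).
A <- crossprod(X, mu * X)

# "Meat": sum_i D_i' V_i^{-1} r_i r_i' V_i^{-1} D_i = X' diag(r^2) X
B <- crossprod(X, r^2 * X)

A_inv      <- solve(A)
robust_var <- A_inv %*% B %*% A_inv                 # sandwich estimator
naive_var  <- summary(fit)$dispersion * A_inv       # model-based counterpart

robust_se <- sqrt(diag(robust_var))
robust_se
```

With clustered data the meat term would instead sum outer products of per-cluster score contributions, so this single-observation form is the simplest case only.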

(b) Now you are to reproduce ALL the numbers presented in the third column of Table 2.7 in F & T, where the variance function is taken to be $\sigma^2(\mu) = \mu + \theta\mu^2$, using the methodology discussed in lecture that alternates between estimation of $\theta$ (by the method of moments) and $\beta$ (by solving a score equation). You should pay attention to the following:

i. You are advised to use glm.nb() from the MASS package, which employs a full likelihood approach, to obtain an initial estimate for $\theta$. This initial value for $\theta$ can subsequently serve as the input for the alternating estimation procedure.

ii. To solve the score equation for $\beta$, you can use glm() and the function negative.binomial() from the MASS package.

iii. Read the documentation of any R functions that you are unsure about CAREFULLY, as well as Section 7.4 of Modern Applied Statistics with S by Venables and Ripley if necessary.

iv. Your numbers may differ slightly from those in F & T due to numerical differences. However, they should be very close in general.

v. Present your work in a CLEAN way.

[25]

Total marks = 50

MAST90084 Statistical Modelling Assignment 2, Semester 1, 2021
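The alternating scheme described in 3(b) can be sketched on simulated data as follows. Two points are assumptions, not taken from the assignment. First, the moment condition used here equates the Pearson statistic to the residual degrees of freedom; the lecture notes may use a different moment equation. Second, negative.binomial(theta) in MASS has variance $\mu + \mu^2/\text{theta}$, so the $\theta$ in $\mu + \theta\mu^2$ corresponds to 1/theta. The names dat, a_hat and mom are illustrative.

```r
library(MASS)  # glm.nb(), negative.binomial()

set.seed(2021)
n   <- 300
x   <- rnorm(n)
y   <- rnbinom(n, size = 2, mu = exp(1 + 0.5 * x))  # variance mu + mu^2 / 2
dat <- data.frame(y = y, x = x)

# Starting value from the full-likelihood fit (MASS parameterisation)
theta <- glm.nb(y ~ x, data = dat)$theta

for (iter in seq_len(50)) {
  # Step 1: solve the score equation for beta with theta held fixed
  fit <- glm(y ~ x, family = negative.binomial(theta), data = dat)
  mu  <- fitted(fit)
  df  <- df.residual(fit)

  # Step 2: moment equation for a, the coefficient in Var = mu + a * mu^2
  # (assumed form: Pearson statistic equals residual degrees of freedom)
  mom   <- function(a) sum((dat$y - mu)^2 / (mu + a * mu^2)) - df
  a_hat <- uniroot(mom, interval = c(1e-8, 100), tol = 1e-10)$root

  theta_new <- 1 / a_hat          # back to the MASS parameterisation
  converged <- abs(theta_new - theta) < 1e-6 * theta
  theta     <- theta_new
  if (converged) break
}

coef(fit)   # regression coefficients at the final iteration
a_hat       # moment estimate of the coefficient on mu^2
```

The root search needs the moment function to change sign on the interval, which holds when the data are overdispersed relative to the Poisson; for underdispersed data this sketch would fail.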
