STAT 3006 Assignment 1
Due date: 5:00 pm on 16 February
(25%) Q1: Use the bisection method to find all zeros (roots) of the following function:
f(x) = x^3 + 3.6x^2 + 0.8x − 7.12.
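One possible way to organise the search in Q1 is sketched below; it is an illustration under stated assumptions, not the expected solution. It scans a grid for sign changes and then bisects each bracket. The function name `bisect`, the scan interval [-10, 10], the grid step 0.1 and the tolerance are all choices made here, not part of the assignment. The interval is wide enough because, by Cauchy's bound for a monic polynomial, every root satisfies |x| ≤ 1 + 7.12 = 8.12. A grid scan can miss roots of even multiplicity or two roots closer together than the grid step.

```r
# Function from Q1
f <- function(x) x^3 + 3.6 * x^2 + 0.8 * x - 7.12

# Bisection on [lower, upper]; requires a strict sign change at the endpoints.
bisect <- function(f, lower, upper, tol = 1e-10, max_iter = 200) {
  f_lower <- f(lower)
  if (f_lower * f(upper) >= 0) {
    stop("f(lower) and f(upper) must have strictly opposite signs")
  }
  for (i in seq_len(max_iter)) {
    mid <- (lower + upper) / 2
    f_mid <- f(mid)
    # Stop on an exact zero or when the half-width is below the tolerance
    if (f_mid == 0 || (upper - lower) / 2 < tol) return(mid)
    if (f_lower * f_mid < 0) {
      upper <- mid               # root lies in the left half
    } else {
      lower <- mid               # root lies in the right half
      f_lower <- f_mid
    }
  }
  (lower + upper) / 2
}

# Assumed scan grid: locate sub-intervals where f changes sign
grid <- seq(-10, 10, by = 0.1)
vals <- f(grid)
idx <- which(vals[-length(vals)] * vals[-1] < 0)

# Refine each bracket with bisection
roots <- vapply(idx, function(i) bisect(f, grid[i], grid[i + 1]), numeric(1))
print(roots)
```

Each refined value can be checked by evaluating `f(roots)`, which should be close to zero.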
(25%) Q2 (Poisson regression): We collected n = 50 independent count observations
{y_i : i = 1, ..., n} and their corresponding covariates {x_i : i = 1, ..., n}. Assume that,
for i = 1, ..., n, y_i ∼ Poisson(λ_i) with log(λ_i) = α + βx_i + γx_i^2. Please 1)
write down the likelihood function L(α, β, γ | x, y) of the Poisson regression model; 2) derive
the Newton method for maximizing L(α, β, γ | x, y); 3) implement the Newton method in
R to obtain the MLE of (α, β, γ). (The data set {(x_i, y_i) : 1 ≤ i ≤ n} is stored in “PoisRegData.txt”.)
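A minimal sketch of a Newton iteration for Q2 follows, assuming the standard form of the update. With design matrix X whose rows are (1, x_i, x_i^2) and θ = (α, β, γ), the log-likelihood has gradient X^T (y − λ) and Hessian −X^T diag(λ) X, so each step is θ ← θ + (X^T diag(λ) X)^{-1} X^T (y − λ). The layout of “PoisRegData.txt” is not described in the assignment, so the sketch runs on simulated data instead; the function name `newton_poisson`, the starting value, the tolerance and the simulation coefficients are all assumptions made for illustration.

```r
# Newton's method for log(lambda_i) = alpha + beta * x_i + gamma * x_i^2
newton_poisson <- function(x, y, theta0 = NULL, tol = 1e-8, max_iter = 100) {
  X <- cbind(1, x, x^2)                        # design matrix
  # Assumed starting value: intercept-only fit (needs mean(y) > 0)
  if (is.null(theta0)) theta0 <- c(log(mean(y)), 0, 0)
  theta <- theta0
  for (iter in seq_len(max_iter)) {
    lambda <- as.vector(exp(X %*% theta))      # current Poisson means
    score <- crossprod(X, y - lambda)          # gradient of the log-likelihood
    info <- crossprod(X, X * lambda)           # X^T diag(lambda) X = -Hessian
    step <- as.vector(solve(info, score))      # Newton direction
    theta <- theta + step
    if (max(abs(step)) < tol) break
  }
  names(theta) <- c("alpha", "beta", "gamma")
  theta
}

# Simulated stand-in data (NOT the assignment data; coefficients are made up)
set.seed(1)
x_sim <- runif(50, -1, 1)
y_sim <- rpois(50, exp(0.5 + 1.0 * x_sim - 0.5 * x_sim^2))

fit <- newton_poisson(x_sim, y_sim)
print(fit)
```

As a sanity check, the result can be compared with `coef(glm(y_sim ~ x_sim + I(x_sim^2), family = poisson))`, which fits the same model by iteratively reweighted least squares.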
(20%) Q3 (Logistic regression): We collected n = 50 independent binary observations
{y_i : i = 1, ..., n} and their corresponding covariates {x_i : i = 1, ..., n}. Assume that,
for i = 1, ..., n, y_i ∼ Bernoulli(p_i) with logit(p_i) = α + βx_i, where
logit(t) = log(t / (1 − t)). Please 1) write down the likelihood function L(α, β | x, y) of the logistic
regression model; 2) derive the Newton method for maximizing L(α, β | x, y); 3) implement the
Newton method in R to obtain the MLE of (α, β). (The data set {(x_i, y_i) : 1 ≤ i ≤ n} is stored in
“LogitRegData.txt”.)
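The Newton iteration for Q3 has the same shape as in the Poisson case and could be sketched as follows. With rows (1, x_i) in X and p_i = 1 / (1 + exp(−(α + βx_i))), the gradient of the log-likelihood is X^T (y − p) and the Hessian is −X^T diag(p_i(1 − p_i)) X. Because the layout of “LogitRegData.txt” is not given, the sketch uses simulated data; the function name `newton_logistic`, the zero starting value, the tolerance and the simulation coefficients are assumptions for illustration only.

```r
# Newton's method for logit(p_i) = alpha + beta * x_i
newton_logistic <- function(x, y, theta0 = c(0, 0), tol = 1e-8, max_iter = 100) {
  X <- cbind(1, x)                             # design matrix
  theta <- theta0
  for (iter in seq_len(max_iter)) {
    p <- as.vector(plogis(X %*% theta))        # fitted probabilities
    score <- crossprod(X, y - p)               # gradient of the log-likelihood
    info <- crossprod(X, X * (p * (1 - p)))    # X^T diag(p(1-p)) X = -Hessian
    step <- as.vector(solve(info, score))      # Newton direction
    theta <- theta + step
    if (max(abs(step)) < tol) break
  }
  names(theta) <- c("alpha", "beta")
  theta
}

# Simulated stand-in data (NOT the assignment data; coefficients are made up)
set.seed(2)
x_sim <- rnorm(50)
y_sim <- rbinom(50, size = 1, prob = plogis(-0.5 + 1.0 * x_sim))

fit <- newton_logistic(x_sim, y_sim)
print(fit)
```

If the two classes are perfectly separated by x, the MLE does not exist and the iteration diverges or the linear solve fails, so a bounded `max_iter` is a deliberate safeguard here.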
(30%) Q4 (EM algorithm): The monthly salaries of n = 8000 employees are sampled from a
company. Assume that there are three salary levels: low income, middle income
and high income. We denote the monthly salary of employee i by Y_i, and the salary level of
employee i by Z_i. {Y_i : 1 ≤ i ≤ n} are observed, but {Z_i : 1 ≤ i ≤ n} are unknown. Our model
can be formulated as follows. First, Pr(Z_i = k) = π_k, k = 1, 2, 3, and π_1 + π_2 + π_3 = 1, where
Z_i = 1 indicates employee i is low-income, Z_i = 2 indicates employee i is middle-income, Z_i = 3
indicates employee i is high-income, and π_k can be interpreted as the proportion of employees
belonging to salary level k. Second, given Z_i = k, k = 1, 2, 3, Y_i is assumed to follow a
normal distribution N(µ_k, σ_k^2). Based on this notation and information, please 1) write down
the complete-data likelihood function L(π_1, π_2, µ_1, µ_2, µ_3, σ_1, σ_2, σ_3 | Y, Z); 2) derive the E step and
M step to find the MLE of (π_1, π_2, µ_1, µ_2, µ_3, σ_1, σ_2, σ_3); 3) use R to implement your EM algorithm,
give the MLE of (π_1, π_2, µ_1, µ_2, µ_3, σ_1, σ_2, σ_3), and classify the salary level of each of the first 50 employees.
(The data set {Y_i : 1 ≤ i ≤ n} is stored in “SalaryData.txt”.)
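A hedged sketch of an EM loop for a three-component normal mixture like the one in Q4 is given below. It uses the standard updates: the E step computes responsibilities r_ik proportional to π_k times the N(µ_k, σ_k^2) density at y_i, and the M step sets π_k to the mean of r_ik, µ_k to the r_ik-weighted mean of y_i, and σ_k^2 to the r_ik-weighted mean of (y_i − µ_k)^2. The function names, the quantile-based starting values, the stopping rule and the simulated salaries are all assumptions made here, since the layout of “SalaryData.txt” is not described. EM can converge to a local maximum, so results depend on the starting values.

```r
# E-step helper: n x K matrix of prop_k * N(y_i | mu_k, sigma_k^2)
weighted_dens <- function(y, prop, mu, sigma) {
  sapply(seq_along(mu), function(k) prop[k] * dnorm(y, mu[k], sigma[k]))
}

em_mixture <- function(y, K = 3, tol = 1e-10, max_iter = 2000) {
  n <- length(y)
  # Assumed starting values: equal proportions, quantile-based means, pooled sd
  prop <- rep(1 / K, K)
  mu <- as.vector(quantile(y, probs = (seq_len(K) - 0.5) / K))
  sigma <- rep(sd(y), K)
  loglik_old <- -Inf
  for (iter in seq_len(max_iter)) {
    # E step: responsibilities r_ik = Pr(Z_i = k | y_i, current parameters)
    dens <- weighted_dens(y, prop, mu, sigma)
    loglik <- sum(log(rowSums(dens)))          # observed-data log-likelihood
    resp <- dens / rowSums(dens)
    # M step: weighted proportions, means and standard deviations
    nk <- colSums(resp)
    prop <- nk / n
    mu <- colSums(resp * y) / nk
    sigma <- sqrt(colSums(resp * outer(y, mu, "-")^2) / nk)
    # Stop when the relative change in log-likelihood is negligible
    if (abs(loglik - loglik_old) < tol * abs(loglik)) break
    loglik_old <- loglik
  }
  # Order components by mean so that 1 = low, 2 = middle, 3 = high
  ord <- order(mu)
  prop <- prop[ord]; mu <- mu[ord]; sigma <- sigma[ord]
  # Responsibilities under the final parameter values
  dens <- weighted_dens(y, prop, mu, sigma)
  list(prop = prop, mu = mu, sigma = sigma, resp = dens / rowSums(dens))
}

# Simulated stand-in salaries (NOT the assignment data; all values made up)
set.seed(3)
z_sim <- sample(1:3, 8000, replace = TRUE, prob = c(0.5, 0.3, 0.2))
y_sim <- rnorm(8000, mean = c(3000, 8000, 20000)[z_sim],
               sd = c(500, 1500, 4000)[z_sim])

fit <- em_mixture(y_sim)
# Classify the first 50 observations by their largest responsibility
level_first50 <- max.col(fit$resp[1:50, ], ties.method = "first")
print(fit[c("prop", "mu", "sigma")])
print(level_first50)
```

Assigning each employee to the level with the largest responsibility is one common classification rule; only π_1 and π_2 are free parameters because π_3 = 1 − π_1 − π_2.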
Requirements: your answer must contain two parts. The first part is a paper report which
includes your derivations and answers for each problem. The second part is a file which includes
your R code; submit the R code file to the Blackboard system or to the TA (m2ng@link.cuhk.edu.hk). You must finish
both parts to get a grade. Otherwise, your homework will be regarded as missing.
Details of the requirements are in the table below.
Question | In the paper report                          | In the R code file
Q1       | All zeros (roots)                            | R code implementing the bisection method
Q2       | Likelihood function                          | R code implementing the Newton method
         | Derivation of the Newton algorithm           |
         | MLE of (α, β, γ)                             |
Q3       | Likelihood function                          | R code implementing the Newton method
         | Derivation of the Newton algorithm           |
         | MLE of (α, β)                                |
Q4       | Complete-data likelihood function L          | R code implementing the EM algorithm
         | Derivation of the E step and M step          |
         | MLE of (π_1, π_2, µ_1, µ_2, µ_3, σ_1, σ_2, σ_3) |
         | Estimated salary levels of the first 50 employees |