MAST30027: Modern Applied Statistics
Assignment 3 Solution 2022
1. (a) Solution: Let $\theta = (\pi_1, \pi_2, p_1, p_2, p_3)$ and $n = 300$.
\begin{align*}
p(X_1, \dots, X_n, Z_1, \dots, Z_n \mid \theta)
&= \prod_{i=1}^{n} p(X_i \mid Z_i, \theta)\, p(Z_i \mid \theta) \\
&= \prod_{i=1}^{n} \prod_{k=1}^{3} \left[ p(X_i \mid Z_i = k, \theta)\, p(Z_i = k \mid \theta) \right]^{I(Z_i = k)}.
\end{align*}
\begin{align*}
\log p(X_1, \dots, X_n, Z_1, \dots, Z_n \mid \theta)
&= \sum_{i=1}^{n} \sum_{k=1}^{3} I(Z_i = k)\left[\log p(X_i \mid Z_i = k, \theta) + \log p(Z_i = k \mid \theta)\right] \\
&= \sum_{i=1}^{n} \bigg\{ I(Z_i = 1)\left[\log \binom{m}{x_i} + x_i \log p_1 + (m - x_i)\log(1 - p_1) + \log \pi_1\right] \\
&\qquad + I(Z_i = 2)\left[\log \binom{m}{x_i} + x_i \log p_2 + (m - x_i)\log(1 - p_2) + \log \pi_2\right] \\
&\qquad + I(Z_i = 3)\left[\log \binom{m}{x_i} + x_i \log p_3 + (m - x_i)\log(1 - p_3) + \log(1 - \pi_1 - \pi_2)\right] \bigg\}.
\end{align*}
\begin{align*}
Q(\theta, \theta^0) &= E_{Z \mid X, \theta^0}\left[\log p(X_1, \dots, X_n, Z_1, \dots, Z_n \mid \theta)\right] \\
&= \sum_{i=1}^{n} \bigg\{ p(Z_i = 1 \mid X, \theta^0)\left[\log \binom{m}{x_i} + x_i \log p_1 + (m - x_i)\log(1 - p_1) + \log \pi_1\right] \\
&\qquad + p(Z_i = 2 \mid X, \theta^0)\left[\log \binom{m}{x_i} + x_i \log p_2 + (m - x_i)\log(1 - p_2) + \log \pi_2\right] \\
&\qquad + p(Z_i = 3 \mid X, \theta^0)\left[\log \binom{m}{x_i} + x_i \log p_3 + (m - x_i)\log(1 - p_3) + \log(1 - \pi_1 - \pi_2)\right] \bigg\}.
\end{align*}
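Each bracketed term above is a binomial log-pmf plus the log of a mixing weight, so $Q(\theta, \theta^0)$ is easy to evaluate in R once the posterior probabilities $p(Z_i = k \mid X, \theta^0)$ derived in part (b) below are available. The following is only an illustrative sketch (the function compute.Q and the responsibility matrix P_ik are not part of the assignment code); it can be used to check that the M-step updates in part (c) never decrease Q.

# Illustrative sketch (not part of the submitted code): evaluate Q(theta, theta0)
# given P_ik[i, k] = p(Z_i = k | X, theta0), mixing weights w = (pi_1, pi_2, pi_3)
# and success probabilities p = (p_1, p_2, p_3).
compute.Q <- function(X, m, P_ik, w, p) {
  Q <- 0
  for (k in 1:3) {
    # dbinom(..., log = TRUE) gives log C(m, x) + x log p_k + (m - x) log(1 - p_k)
    Q <- Q + sum(P_ik[, k] * (dbinom(X, m, p[k], log = TRUE) + log(w[k])))
  }
  Q
}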
(b) Solution: Let $\theta^0 = (\pi_1^0, \pi_2^0, p_1^0, p_2^0, p_3^0)$. For $k = 1, 2$,
\begin{align*}
p(Z_i = k \mid X_i, \theta^0)
&= \frac{p(Z_i = k, X_i \mid \theta^0)}{p(X_i \mid \theta^0)}
 = \frac{p(X_i \mid Z_i = k, \theta^0)\, p(Z_i = k \mid \theta^0)}{p(Z_i = 1 \mid \theta^0)\, p(X_i \mid Z_i = 1, \theta^0) + p(Z_i = 2 \mid \theta^0)\, p(X_i \mid Z_i = 2, \theta^0) + p(Z_i = 3 \mid \theta^0)\, p(X_i \mid Z_i = 3, \theta^0)} \\
&= \frac{\binom{m}{x_i} (p_k^0)^{x_i} (1 - p_k^0)^{m - x_i} \pi_k^0}{\binom{m}{x_i} (p_1^0)^{x_i} (1 - p_1^0)^{m - x_i} \pi_1^0 + \binom{m}{x_i} (p_2^0)^{x_i} (1 - p_2^0)^{m - x_i} \pi_2^0 + \binom{m}{x_i} (p_3^0)^{x_i} (1 - p_3^0)^{m - x_i} (1 - \pi_1^0 - \pi_2^0)} \\
&= \frac{(p_k^0)^{x_i} (1 - p_k^0)^{m - x_i} \pi_k^0}{(p_1^0)^{x_i} (1 - p_1^0)^{m - x_i} \pi_1^0 + (p_2^0)^{x_i} (1 - p_2^0)^{m - x_i} \pi_2^0 + (p_3^0)^{x_i} (1 - p_3^0)^{m - x_i} (1 - \pi_1^0 - \pi_2^0)},
\end{align*}
and $p(Z_i = 3 \mid X, \theta^0) = 1 - p(Z_i = 1 \mid X, \theta^0) - p(Z_i = 2 \mid X, \theta^0)$.
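As a quick numerical check of this formula, the responsibilities for a single observation can be computed directly with dbinom; note that the binomial coefficient cancels, as in the last line above. The values of $x_i$, $m$ and $\theta^0$ below are arbitrary illustrations, not taken from the data.

# Illustrative E-step check for one observation (arbitrary values).
x.i <- 7; m <- 20
w0 <- c(0.3, 0.3, 0.4)          # (pi_1^0, pi_2^0, 1 - pi_1^0 - pi_2^0)
p0 <- c(0.2, 0.5, 0.7)          # (p_1^0, p_2^0, p_3^0)
num <- dbinom(x.i, m, p0) * w0  # p(X_i | Z_i = k, theta^0) p(Z_i = k | theta^0), k = 1, 2, 3
post <- num / sum(num)          # p(Z_i = k | X_i, theta^0)
sum(post)                       # equals 1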
(c) Solution: Let's first compute $\hat\pi_1$ and $\hat\pi_2$, which maximise $Q(\theta, \theta^0)$.
\begin{align*}
\frac{\partial Q(\theta, \theta^0)}{\partial \pi_1}
&= \sum_{i=1}^{n} \left[\frac{p(Z_i = 1 \mid X, \theta^0)}{\pi_1} - \frac{p(Z_i = 3 \mid X, \theta^0)}{1 - \pi_1 - \pi_2}\right] \\
&= \frac{(1 - \pi_1 - \pi_2)\left[\sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0)\right] - \pi_1\left[\sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)\right]}{\pi_1 (1 - \pi_1 - \pi_2)} = 0, \tag{1}
\end{align*}
\begin{align*}
\frac{\partial Q(\theta, \theta^0)}{\partial \pi_2}
&= \sum_{i=1}^{n} \left[\frac{p(Z_i = 2 \mid X, \theta^0)}{\pi_2} - \frac{p(Z_i = 3 \mid X, \theta^0)}{1 - \pi_1 - \pi_2}\right] \\
&= \frac{(1 - \pi_1 - \pi_2)\left[\sum_{i=1}^{n} p(Z_i = 2 \mid X, \theta^0)\right] - \pi_2\left[\sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)\right]}{\pi_2 (1 - \pi_1 - \pi_2)} = 0. \tag{2}
\end{align*}
Let $\pi_3 = 1 - \pi_1 - \pi_2$. From (1) and (2), we obtain
\begin{align*}
\sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0)\, \pi_3 &= \sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)\, \pi_1, \tag{3} \\
\sum_{i=1}^{n} p(Z_i = 2 \mid X, \theta^0)\, \pi_3 &= \sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)\, \pi_2. \tag{4}
\end{align*}
Taking the sum of (3) and (4), we have
\[
\pi_3 \left[\sum_{i=1}^{n} \left[p(Z_i = 1 \mid X, \theta^0) + p(Z_i = 2 \mid X, \theta^0)\right]\right] = (1 - \pi_3) \sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0).
\]
Since the three posterior probabilities sum to 1 for each $i$, the bracketed sum on the left equals $n - \sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)$, and solving gives
\[
\hat\pi_3 = \frac{\sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)}{n}.
\]
From (3), we have
\[
\hat\pi_1 = \frac{\hat\pi_3}{\sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)} \sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0) = \frac{\sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0)}{n}.
\]
From (4), we have
\[
\hat\pi_2 = \frac{\hat\pi_3}{\sum_{i=1}^{n} p(Z_i = 3 \mid X, \theta^0)} \sum_{i=1}^{n} p(Z_i = 2 \mid X, \theta^0) = \frac{\sum_{i=1}^{n} p(Z_i = 2 \mid X, \theta^0)}{n}.
\]
For $k = 1, 2, 3$,
\[
\frac{\partial Q(\theta, \theta^0)}{\partial p_k}
= \sum_{i=1}^{n} p(Z_i = k \mid X, \theta^0)\left[\frac{x_i}{p_k} - \frac{m - x_i}{1 - p_k}\right]
= \frac{(1 - p_k)\sum_{i=1}^{n} p(Z_i = k \mid X, \theta^0)\, x_i - p_k \sum_{i=1}^{n} p(Z_i = k \mid X, \theta^0)\, (m - x_i)}{p_k (1 - p_k)} = 0,
\]
so
\[
\hat p_k = \frac{\sum_{i=1}^{n} p(Z_i = k \mid X, \theta^0)\, x_i}{m \sum_{i=1}^{n} p(Z_i = k \mid X, \theta^0)}.
\]
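These closed-form updates can be verified numerically: for fixed responsibilities, $\hat p_k$ should maximise the part of Q that involves $p_k$. The sketch below uses made-up data and responsibilities purely as an illustration; it compares the closed-form value with a one-dimensional numerical maximisation via optimize.

# Illustrative check (made-up values) that the closed-form p_k update maximises
# the p_k part of Q: sum_i P_k[i] * (x_i log p + (m - x_i) log(1 - p)).
set.seed(1)
m.chk <- 20
x.chk <- rbinom(50, m.chk, 0.4)   # fake data
P.chk <- runif(50)                # fake responsibilities p(Z_i = k | X, theta^0)
p.hat <- sum(P.chk * x.chk) / (m.chk * sum(P.chk))   # closed-form update
Q.k <- function(p) sum(P.chk * (x.chk * log(p) + (m.chk - x.chk) * log(1 - p)))
optimize(Q.k, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)$maximum  # agrees with p.hat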
(d) Solution:
Implement the EM algorithm.
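> # X : X_1, ..., X_300 which follow a mixture of binomial distributions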
> # w.init : initial value for pi
> # p.init : initial value for p
> # m : total number of trials in Binomial distribution (m=20)
> # epsilon : If the incomplete log-likelihood has changed by less than epsilon,
> # EM will stop.
> # max.iter : maximum number of EM-iterations
> mixture.EM <- function(X, m, w.init, p.init, epsilon=1e-5, max.iter=100) {
+
+ w.curr = w.init
+ p.curr = p.init
+
+ # store incomplete log-likelihoods for each iteration
+ log_liks = c()
+
+ # compute the incomplete log-likelihood using initial values of parameters.
+ log_liks = c(log_liks, compute.log.lik(X, m, w.curr, p.curr)$ill)
+
+ # set the change in incomplete log-likelihood to 1
+ delta.ll = 1
+
+ # iteration counter
+ n.iter = 1
+
+ # If the log-likelihood has changed by less than epsilon, EM will stop.
+ while((delta.ll > epsilon) & (n.iter <= max.iter)){
+
+ # run EM step
+ EM.out = EM.iter(X, m, w.curr, p.curr)
+
+ # replace the current value with the new parameter estimate
+ w.curr = EM.out$w.new
+ p.curr = EM.out$p.new
+
+ # incomplete log-likelihood with the new parameter estimates
+ log_liks = c(log_liks, compute.log.lik(X, m, w.curr, p.curr)$ill)
+
+ # compute the change in incomplete log-likelihood
+ delta.ll = log_liks[length(log_liks)] - log_liks[length(log_liks)-1]
+
+ # increment the iteration counter
+ n.iter = n.iter + 1
+ }
+
+ return(list(w.curr=w.curr, p.curr=p.curr, log_liks=log_liks))
+
+ }
> EM.iter <- function(X, m, w.curr, p.curr) {
+
+ # E-step: compute E_{Z|X,\theta_0}[I(Z_i = k)]
+
+ # for each sample $X_i$, compute $P(X_i, Z_i=k)$
+ prob.x.z = compute.prob.x.z(X, m, w.curr, p.curr)$prob.x.z
+
+ # compute P(Z_i=k | X_i)
+ P_ik = prob.x.z / rowSums(prob.x.z)
+
+ # M-step
+ w.new = colSums(P_ik)/sum(P_ik) # sum(P_ik) is equivalent to sample size
+ p.new = colSums(P_ik*X)/(m*colSums(P_ik))
+
+ return(list(w.new=w.new, p.new=p.new))
+ }
> # for each sample $X_i$, compute $P(X_i, Z_i=k)$
> compute.prob.x.z <- function(X, m, w.curr, p.curr) {
+
+ # for each sample $X_i$, compute $P(X_i, Z_i=k)$.
+ # Store these values in the columns of L:
+ L = matrix(NA, nrow=length(X), ncol= length(w.curr))
+ for(k in seq_len(ncol(L))) {
+ L[, k] = dbinom(X, m, p.curr[k])*w.curr[k]
+ }
+
+ return(list(prob.x.z=L))
+ }
> # Compute the incomplete log-likelihood
> compute.log.lik <- function(X, m, w.curr, p.curr) {
+
+ # for each sample $X_i$, compute $P(X_i, Z_i=k)$
+ prob.x.z = compute.prob.x.z(X, m, w.curr, p.curr)$prob.x.z
+
+ # incomplete log-likelihood
+ ill = sum(log(rowSums(prob.x.z)))
+
+ return(list(ill=ill))
+ }
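Before applying the algorithm to the assignment data, the implementation can be sanity-checked on simulated data with known parameters. This is an optional check, not part of the required output; the parameter values below are chosen arbitrarily.

# Optional sanity check on simulated data: estimates should be close to the
# true values (up to relabelling of the components).
set.seed(30027)
m.sim <- 20
w.true <- c(0.1, 0.3, 0.6); p.true <- c(0.1, 0.4, 0.9)
z.sim <- sample(1:3, 1000, replace = TRUE, prob = w.true)
x.sim <- rbinom(1000, m.sim, p.true[z.sim])
fit <- mixture.EM(x.sim, m.sim, w.init = c(1, 1, 1)/3, p.init = c(0.2, 0.5, 0.8))
round(fit$w.curr, 2); round(fit$p.curr, 2)   # compare with w.true and p.true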
Run the EM algorithm two times with the two initial values provided in the problem and
check that the incomplete log-likelihoods increase at each step by plotting them.
> # read the data
> X = scan(file=
+ "/Users/hjshim/Documents/work/github/teaching/MAS/Assignment_2022/assignment3_prob1.txt"
+ , what=double())
> m = 20
> # 1st initial values
> EM1 <- mixture.EM(X, m = m, w.init=c(0.3,0.3,0.4), p.init=c(0.2, 0.5, 0.7),
+ epsilon=1e-5, max.iter=100)
> ee = EM1
> print(paste("Estimate pi = (", round(ee$w.curr[1],2), ",",
+ round(ee$w.curr[2],2), ",",
+ round(ee$w.curr[3],2), ")", sep=""))
[1] "Estimate pi = (0.12,0.28,0.6)"
> print(paste("Estimate p = (", round(ee$p.curr[1],2), ",",
+ round(ee$p.curr[2],2), ",",
+ round(ee$p.curr[3],2), ")", sep=""))
[1] "Estimate p = (0.09,0.38,0.89)"
> plot(ee$log_liks, ylab='incomplete log-likelihood', xlab='iteration')
[Figure: incomplete log-likelihood against iteration for the first EM run (problem 1).]
> m = 20
> # 2nd initial values
> EM2 <- mixture.EM(X, m = m, w.init=c(0.1,0.2,0.7), p.init=c(0.1, 0.3, 0.7),
+ epsilon=1e-5, max.iter=100)
> ee = EM2
> print(paste("Estimate pi = (", round(ee$w.curr[1],2), ",",
+ round(ee$w.curr[2],2), ",",
+ round(ee$w.curr[3],2), ")", sep=""))
[1] "Estimate pi = (0.12,0.28,0.6)"
> print(paste("Estimate p = (", round(ee$p.curr[1],2), ",",
+ round(ee$p.curr[2],2), ",",
+ round(ee$p.curr[3],2), ")", sep=""))
[1] "Estimate p = (0.09,0.38,0.89)"
> plot(ee$log_liks, ylab='incomplete log-likelihood', xlab='iteration')
[Figure: incomplete log-likelihood against iteration for the second EM run (problem 1).]
We can see that
the incomplete log-likelihoods increase at each EM-step.
Check which estimators have the highest incomplete log-likelihood.
> EM1$log_liks[length(EM1$log_liks)]
[1] -813.0136
> EM2$log_liks[length(EM2$log_liks)]
[1] -813.0136
Estimators from the two EM runs are very similar and they have very similar incomplete log-likelihoods, so I will report the one from the first run, which has a slightly higher incomplete log-likelihood. MLEs for the parameters are $\hat\pi_1 = 0.1191536$, $\hat\pi_2 = 0.2847107$, $\hat p_1 = 0.08996094$, $\hat p_2 = 0.37895941$, $\hat p_3 = 0.88998821$.
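A simple tabular check of the fit (not required by the problem) is to compare the observed relative frequencies of X with the pmf of the fitted mixture at the reported MLEs:

# Optional goodness-of-fit check: observed frequencies vs fitted mixture pmf.
ee <- EM1
fitted.pmf <- sapply(0:m, function(x) sum(ee$w.curr * dbinom(x, m, ee$p.curr)))
obs.freq <- as.numeric(table(factor(X, levels = 0:m))) / length(X)
round(cbind(x = 0:m, observed = obs.freq, fitted = fitted.pmf), 3)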
2. (a) Solution: Let $\theta = (\pi_1, \pi_2, p_1, p_2, p_3)$, $n = 300$ and $n' = 100$.
\begin{align*}
p(X_1, \dots, X_n, Z_1, \dots, Z_n, X_{n+1}, \dots, X_{n+n'} \mid \theta)
&= \prod_{i=1}^{n} p(X_i \mid Z_i, \theta)\, p(Z_i \mid \theta) \prod_{i=n+1}^{n+n'} p(X_i \mid \theta) \\
&= \prod_{i=1}^{n} \prod_{k=1}^{3} \left[p(X_i \mid Z_i = k, \theta)\, p(Z_i = k \mid \theta)\right]^{I(Z_i = k)} \prod_{i=n+1}^{n+n'} p(X_i \mid \theta).
\end{align*}
\begin{align*}
&\log p(X_1, \dots, X_n, Z_1, \dots, Z_n, X_{n+1}, \dots, X_{n+n'} \mid \theta) \\
&= \sum_{i=1}^{n} \sum_{k=1}^{3} I(Z_i = k)\left[\log p(X_i \mid Z_i = k, \theta) + \log p(Z_i = k \mid \theta)\right] + \sum_{i=n+1}^{n+n'} \log p(X_i \mid \theta) \\
&= \sum_{i=1}^{n} \bigg\{ I(Z_i = 1)\left[\log \binom{m}{x_i} + x_i \log p_1 + (m - x_i)\log(1 - p_1) + \log \pi_1\right] \\
&\qquad + I(Z_i = 2)\left[\log \binom{m}{x_i} + x_i \log p_2 + (m - x_i)\log(1 - p_2) + \log \pi_2\right] \\
&\qquad + I(Z_i = 3)\left[\log \binom{m}{x_i} + x_i \log p_3 + (m - x_i)\log(1 - p_3) + \log(1 - \pi_1 - \pi_2)\right] \bigg\} \\
&\quad + \sum_{i=n+1}^{n+n'} \left[\log \binom{m}{x_i} + x_i \log p_1 + (m - x_i)\log(1 - p_1)\right].
\end{align*}
\begin{align*}
Q(\theta, \theta^0) &= E_{Z \mid X, \theta^0}\left[\log p(X_1, \dots, X_n, Z_1, \dots, Z_n, X_{n+1}, \dots, X_{n+n'} \mid \theta)\right] \\
&= \sum_{i=1}^{n} \bigg\{ p(Z_i = 1 \mid X, \theta^0)\left[\log \binom{m}{x_i} + x_i \log p_1 + (m - x_i)\log(1 - p_1) + \log \pi_1\right] \\
&\qquad + p(Z_i = 2 \mid X, \theta^0)\left[\log \binom{m}{x_i} + x_i \log p_2 + (m - x_i)\log(1 - p_2) + \log \pi_2\right] \\
&\qquad + p(Z_i = 3 \mid X, \theta^0)\left[\log \binom{m}{x_i} + x_i \log p_3 + (m - x_i)\log(1 - p_3) + \log(1 - \pi_1 - \pi_2)\right] \bigg\} \\
&\quad + \sum_{i=n+1}^{n+n'} \left[\log \binom{m}{x_i} + x_i \log p_1 + (m - x_i)\log(1 - p_1)\right].
\end{align*}
(b) Solution: E-step: same as the E-step in the solution for problem 1 (b).
M-step: same as the M-step in the solution for problem 1 (c), except for the following:
\begin{align*}
\frac{\partial Q(\theta, \theta^0)}{\partial p_1}
&= \sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0)\left[\frac{x_i}{p_1} - \frac{m - x_i}{1 - p_1}\right] + \sum_{i=n+1}^{n+n'} \left[\frac{x_i}{p_1} - \frac{m - x_i}{1 - p_1}\right] \\
&= \frac{(1 - p_1)\left[\sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0)\, x_i + \sum_{i=n+1}^{n+n'} x_i\right] - p_1\left[\sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0)\, (m - x_i) + \sum_{i=n+1}^{n+n'} (m - x_i)\right]}{p_1 (1 - p_1)} = 0.
\end{align*}
\[
\hat p_1 = \frac{\sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0)\, x_i + \sum_{i=n+1}^{n+n'} x_i}{m\left[\sum_{i=1}^{n} p(Z_i = 1 \mid X, \theta^0) + n'\right]}.
\]
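In R this is a one-line change to the M-step (see the code in part (c) below). As with problem 1, the update can be checked numerically against a one-dimensional maximisation; the values below are made up purely for illustration.

# Illustrative check (made-up values) of the modified p_1 update:
# the extra observations enter with weight 1, the mixture data with weight P_1.
set.seed(2)
m.chk <- 20
x.chk <- rbinom(50, m.chk, 0.4); P1.chk <- runif(50)  # fake mixture data and responsibilities
x0.chk <- rbinom(20, m.chk, 0.4)                      # fake extra data from component 1
p1.hat <- (sum(P1.chk * x.chk) + sum(x0.chk)) / (m.chk * (sum(P1.chk) + length(x0.chk)))
Q.p1 <- function(p) sum(P1.chk * (x.chk * log(p) + (m.chk - x.chk) * log(1 - p))) +
  sum(x0.chk * log(p) + (m.chk - x0.chk) * log(1 - p))
optimize(Q.p1, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)$maximum  # agrees with p1.hat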
(c) Solution: Implement the EM algorithm.
> # X : X_1, ..., X_300 which follow a mixture of binomial distributions
> # X0 : X_301, ..., X_400 which follow a binomial distribution
> # m : number of trials in binomial distribution (m=20)
> # w.init : initial value for pi
> # p.init : initial value for p
> # epsilon : If the incomplete log-likelihood has changed by less than epsilon,
> # EM will stop.
> # max.iter : maximum number of EM-iterations
> mixture.EM <- function(X, X0, m, w.init, p.init, epsilon=1e-5, max.iter=100) {
+
+ w.curr = w.init
+ p.curr = p.init
+
+ # store incomplete log-likelihoods for each iteration
+ log_liks = c()
+
+ # compute the incomplete log-likelihood using initial values of parameters.
+ log_liks = c(log_liks, compute.log.lik(X, X0, m, w.curr, p.curr)$ill)
+
+ # set the change in incomplete log-likelihood to 1
+ delta.ll = 1
+
+ # iteration counter
+ n.iter = 1
+
+ # If the log-likelihood has changed by less than epsilon, EM will stop.
+ while((delta.ll > epsilon) & (n.iter <= max.iter)){
+
+ # run EM step
+ EM.out = EM.iter(X, X0, m, w.curr, p.curr)
+
+ # replace the current value with the new parameter estimate
+ w.curr = EM.out$w.new
+ p.curr = EM.out$p.new
+
+ # incomplete log-likelihood with the new parameter estimates
+ log_liks = c(log_liks, compute.log.lik(X, X0, m, w.curr, p.curr)$ill)
+
+ # compute the change in incomplete log-likelihood
+ delta.ll = log_liks[length(log_liks)] - log_liks[length(log_liks)-1]
+
+ # increment the iteration counter
+ n.iter = n.iter + 1
+ }
+
+ return(list(w.curr=w.curr, p.curr=p.curr, log_liks=log_liks))
+
+ }
> EM.iter <- function(X, X0, m, w.curr, p.curr) {
+
+ # E-step: compute E_{Z|X,\theta_0}[I(Z_i = k)]
+
+ # for each sample $X_i$, compute $P(X_i, Z_i=k)$
+ prob.x.z = compute.prob.x.z(X, m, w.curr, p.curr)$prob.x.z
+
+ # compute P(Z_i=k | X_i)
+ P_ik = prob.x.z / rowSums(prob.x.z)
+
+ # M-step
+ w.new = colSums(P_ik)/sum(P_ik) # sum(P_ik) is equivalent to sample size
+ ### Changed from problem 1: the p_1 update also uses the extra data X0
+ p.new = rep(NA, length(w.new))
+ p.new[2:3] = (colSums(P_ik*X)/(m*colSums(P_ik)))[2:3]
+ p.new[1] = (colSums(P_ik*X)[1] + sum(X0))/( m*(colSums(P_ik)[1] + length(X0)))
+
+ return(list(w.new=w.new, p.new=p.new))
+ }
> # for each sample $X_i$, compute $P(X_i, Z_i=k)$
> compute.prob.x.z <- function(X, m, w.curr, p.curr) {
+
+ # for each sample $X_i$, compute $P(X_i, Z_i=k)$.
+ # Store these values in the columns of L:
+ L = matrix(NA, nrow=length(X), ncol= length(w.curr))
+ for(k in seq_len(ncol(L))) {
+ L[, k] = dbinom(X, m, p.curr[k])*w.curr[k]
+ }
+
+ return(list(prob.x.z=L))
+ }
> # Compute the incomplete log-likelihood
> compute.log.lik <- function(X, X0, m, w.curr, p.curr) {
+
+ # for each sample $X_i$, compute $P(X_i, Z_i=k)$
+ prob.x.z = compute.prob.x.z(X, m, w.curr, p.curr)$prob.x.z
+
+ # incomplete log-likelihood
+ ill = sum(log(rowSums(prob.x.z))) + sum(log(dbinom(X0, m, p.curr[1])))
+
+ return(list(ill=ill))
+ }
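As in problem 1, the semi-supervised EM can be sanity-checked on simulated data where the extra sample really does come from component 1. This is optional and not part of the required output; the parameter values are arbitrary.

# Optional sanity check on simulated data for the problem 2 EM: component 1 is
# pinned down by the extra sample X0, so its estimates should be close to
# w.true[1] and p.true[1]; components 2 and 3 may swap labels.
set.seed(30027)
m.sim <- 20
w.true <- c(0.28, 0.12, 0.60); p.true <- c(0.4, 0.1, 0.9)
z.sim <- sample(1:3, 300, replace = TRUE, prob = w.true)
x.sim <- rbinom(300, m.sim, p.true[z.sim])
x0.sim <- rbinom(100, m.sim, p.true[1])   # extra sample from component 1
fit <- mixture.EM(x.sim, x0.sim, m.sim, w.init = c(1, 1, 1)/3, p.init = c(0.2, 0.5, 0.8))
round(fit$w.curr, 2); round(fit$p.curr, 2)   # compare with w.true and p.true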
Run the EM algorithm two times with the two initial values provided in the problem and
check that the incomplete log-likelihoods increase at each step by plotting them.
> # read the data
> X = scan(file=
+ "/Users/hjshim/Documents/work/github/teaching/MAS/Assignment_2022/assignment3_prob1.txt"
+ , what=double())
> X.more = scan(file=
+ "/Users/hjshim/Documents/work/github/teaching/MAS/Assignment_2022/assignment3_prob2.txt"
+ , what=double())
> m = 20
> # 1st initial values
> EM1 <- mixture.EM(X, X0=X.more, m = m, w.init=c(0.3,0.3,0.4), p.init=c(0.2, 0.5, 0.7),
+ epsilon=1e-5, max.iter=100)
> ee = EM1
> print(paste("Estimate pi = (", round(ee$w.curr[1],2), ",",
+ round(ee$w.curr[2],2), ",",
+ round(ee$w.curr[3],2), ")", sep=""))
[1] "Estimate pi = (0.28,0.13,0.6)"
> print(paste("Estimate p = (", round(ee$p.curr[1],2), ",",
+ round(ee$p.curr[2],2), ",",
+ round(ee$p.curr[3],2), ")", sep=""))
[1] "Estimate p = (0.39,0.1,0.89)"
> plot(ee$log_liks, ylab='incomplete log-likelihood', xlab='iteration')
[Figure: incomplete log-likelihood against iteration for the first EM run (problem 2).]
> m = 20
> # 2nd initial values
> EM2 <- mixture.EM(X, X0=X.more, m = m, w.init=c(0.1,0.2,0.7), p.init=c(0.1, 0.3, 0.7),
+ epsilon=1e-5, max.iter=100)
> ee = EM2
> print(paste("Estimate pi = (", round(ee$w.curr[1],2), ",",
+ round(ee$w.curr[2],2), ",",
+ round(ee$w.curr[3],2), ")", sep=""))
[1] "Estimate pi = (0.28,0.13,0.6)"
> print(paste("Estimate p = (", round(ee$p.curr[1],2), ",",
+ round(ee$p.curr[2],2), ",",
+ round(ee$p.curr[3],2), ")", sep=""))
[1] "Estimate p = (0.39,0.1,0.89)"
> plot(ee$log_liks, ylab='incomplete log-likelihood', xlab='iteration')
[Figure: incomplete log-likelihood against iteration for the second EM run (problem 2).]
We can see that the incomplete log-likelihoods increase at each EM-step.
Check which estimators have the highest incomplete log-likelihood.
> EM1$log_liks[length(EM1$log_liks)]
[1] -1024.806
> EM2$log_liks[length(EM2$log_liks)]
[1] -1024.806
Estimators from the two EM runs are very similar and they attain the same incomplete log-likelihood, so I will report the one from the first run. MLEs for the parameters are $\hat\pi_1 = 0.2786205$, $\hat\pi_2 = 0.1258416$, $\hat p_1 = 0.39354944$, $\hat p_2 = 0.09556404$, $\hat p_3 = 0.89018680$.