python/winbugs代写-ISYE 6420

ISYE 6420 Homework 6 Solution, Spring 2019
Problem 1
(a) The original data summarizes the number of breakdowns at each voltage level and we
can convert the outcome to be binary. As shown in our lecture note, we have
yi =
{
1 if zi > 0,
0 if zi ≤ 0,
zi|β ∼ N(β0 + β1xi, 1),
θi = p(yi = 1) = p(zi > 0) = Φ(β0 + β1xi).
Assuming noninformative prior p(β) ∝ 1, the full conditional distributions for zi and β can
be derived as follows:
β|z ∼ N((X ′X)−1X ′z, (X ′X)−1),
p(zi|β, y) =
{
p(zi|β, zi > 0) if yi = 1
p(zi|β, zi ≤ 0) if yi = 0.
The Gibbs sampler is implemented by the following R code:
###################################
library(numDeriv)
library(MASS)
library(coda)
library(mcmc)
library(mined)
#Problem#1
voltage=c(1065,1071,1075,1083,1089,1094,1100,1107,1111,1120,1128,1135)
trials=rep(100,12)
breakdown=c(2,3,5,11,10,21,29,48,56,88,98,99)
voltagec=voltage-mean(voltage)#center the voltage
summary(a)
m=10000
beta=matrix(0,nrow=m,ncol=2)
z=numeric(100)
1
zbar=matrix(0,nrow=m,ncol=12)
beta[1,]=a\$coef
library(MASS)
X=as.matrix(cbind(1,voltagec))
Sinv=solve(t(X)%*%X)
library(truncnorm)
for(i in 2:m)
{
for(j in 1:12)
{
mu=sum(X[j,]*beta[i-1,])
z1=rtruncnorm(breakdown[j],a=0,b=Inf,mean=mu,sd=1)
z0=rtruncnorm(100-breakdown[j],a=-Inf,b=0,mean=mu,sd=1)
z=c(z1,z0)
zbar[i,j]=mean(z)
}
beta[i,]=mvrnorm(1,Sinv%*%t(X)%*%zbar[i,],1/100*Sinv)
}
library(coda)
effectiveSize(beta)
cumuplot(mcmc(beta,start=1,end=m,thin=10))
summary(beta)
#1(a)
par(mfrow=c(1,2))
plot(density(beta[,1]),main=expression(beta[0]),xlab="")
plot(density(beta[,2]),main=expression(beta[1]),xlab="")
par(mfrow=c(1,1))
#1(b)
ld50=-beta[,1]/beta[,2]+mean(voltage)
plot(density(ld50),main="LD50")
###################################
For m = 20, 000, the posterior densities for β0 and β1 are shown in Figure 1.
(b) Since θ = 0.5 ⇐⇒ β0 + β1x = 0 ⇐⇒ x = −β0β1 , the LD50 point corresponds to −
β0
β1
.
The distribution of LD50 is plotted in Figure 2.
Problem 2
2
Figure 1: Posterior densities for β0 and β1.
−0.7 −0.4
0
2
4
6
8
β0
D
en
si
ty
0.055 0.070
0
20
40
60
80
10
0
β1
D
en
si
ty
If we fit a logistic regression model to the above data, the posterior distribution of β is
p(β|y) ∝
12∏
i=1
θyii (1− θi)100−yi
=
12∏
i=1
(
ex
T
i β
1 + ex
T
i β
)yi(
1
1 + ex
T
i β
)100−yi .
Note that yi here is the number of breakdowns instead of 0-1. Both mcmc and adaptMCMC
packages use log of unnormalized posterior as input, and we can obtain the following results
via the R code provided.
###################################################################################
summary(a)
logh=function(beta){
val=sum(breakdown*(X%*%beta))-100*sum(log(1+exp(X%*%beta)))
return(val)
}
nlogh=function(theta) -logh(theta)
betahat=optim(a\$coef,nlogh)\$par
Sigmahat=solve(hessian(nlogh,betahat))
3
Figure 2: Distribution of LD50 using Gibbs sampling.
1102 1103 1104 1105 1106 1107 1108 1109
0.
0
0.
1
0.
2
0.
3
0.
4
0.
5
Density plot for LD50
N = 18001 Bandwidth = 0.09821
D
en
si
ty
#mcmc
eig=eigen(Sigmahat)
sqrtSigma=eig\$vec%*%diag(sqrt(eig\$val))%*%t(eig\$vec)
s=2.4/sqrt(2)*sqrtSigma
m=10000
out=metrop(logh,initial=betahat,nbatch=m,scale=s)
out\$accept
beta=out\$batch
effectiveSize(beta)
cumuplot(mcmc(beta,start=100,end=m,thin=10))
1-rejectionRate(mcmc(beta))
ld50=-beta[,1]/beta[,2]+mean(voltage)
plot(density(ld50),main="LD50")
s=(2.4/sqrt(2))^2*Sigmahat#specify covariance matrix of the jumping distribution
out\$accept
beta=out\$samples
effectiveSize(beta)
4
cumuplot(mcmc(beta,thin=10))
1-rejectionRate(mcmc(beta))
ld50=-beta[,1]/beta[,2]+mean(voltage)
plot(density(ld50),main="LD50")
#mined
#first find approximate ranges for the parameters
summary(a)
a\$coef[1]+3*c(-1,1)*.0945
a\$coef[2]+3*c(-1,1)*.0064
l1=-1.2
u1=-.6
l2=.09
u2=.14
X=as.matrix(cbind(1,voltagec))
logh=function(x){
beta=x
beta[1]=l1+(u1-l1)*x[1]
beta[2]=l2+(u2-l2)*x[2]
val=sum(breakdown*(X%*%beta))-100*sum(log(1+exp(X%*%beta)))
return(val)
}
ini=Lattice(109,2)
D=mined(ini,logh)\$points
beta=matrix(0,nrow=109,ncol=2)
beta[,1]=l1+(u1-l1)*D[,1]
beta[,2]=l2+(u2-l2)*D[,2]
ld50=-beta[,1]/beta[,2]+mean(voltage)
plot(density(ld50),main="LD50")
###################################
5
Figure 3: Distribution of LD50 using mcmc.
1103 1104 1105 1106 1107 1108 1109
0.
0
0.
1
0.
2
0.
3
0.
4
0.
5
LD50
N = 10000 Bandwidth = 0.1082
D
en
si
ty
Figure 4: Distribution of LD50 using adaptMCMC.
1103 1105 1107 1109
0.
0
0.
1
0.
2
0.
3
0.
4
0.
5
N = 10000 Bandwidth = 0.1089
D
en
si
ty
Figure 5: Distribution of LD50 using mined.
1103 1104 1105 1106 1107 1108
0.
0
0.
1
0.
2
0.
3
0.
4
LD50 from mined
N = 109 Bandwidth = 0.2728
D
en
si
ty
6