4/22/2021 HW3
file:///C:/Users/karen/Downloads/HW3.html 1/3
Assignment 3
ISLR textbook problems
Chapter 8: 3, 4.
Coding problem
Implement AdaBoost.
To grow the tree at each iteration, you may use only the R package rpart or DecisionTreeRegressor in Python.
Train the model on data_train.txt and test it on data_test.txt, using the following parameters:
Use 20 as the minimum number of observations that must exist in a node for a split to be attempted: minsplit=20 in rpart, or min_samples_split=20 in DecisionTreeRegressor.
Use 10 as the minimum number of observations in any terminal (leaf) node: minbucket=10 in rpart, or min_samples_leaf=10 in DecisionTreeRegressor.
Try tree depths of 1 (a stump) and 2 (two levels of splits), set via maxdepth in rpart or max_depth in DecisionTreeRegressor.
Use 1000 iterations (i.e., grow 1000 trees).
Plot the test-set misclassification rate against the number of iterations.
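The rate at iteration m is the error of the ensemble built from the first m trees. For reference, here is a sketch of the textbook AdaBoost.M1 updates. It assumes labels coded as -1/+1 and a weak learner G_m whose output is thresholded to -1/+1; neither is stated in this handout, and the variant intended by the course may differ.

```latex
\begin{align*}
&\text{Initialize } w_i = 1/N, \quad i = 1, \dots, N. \\
&\text{For } m = 1, \dots, M: \\
&\qquad \text{fit } G_m \text{ to the training data using weights } w_i, \\
&\qquad \mathrm{err}_m = \frac{\sum_{i=1}^{N} w_i \, I\big(y_i \neq G_m(x_i)\big)}{\sum_{i=1}^{N} w_i}, \\
&\qquad \alpha_m = \log\frac{1 - \mathrm{err}_m}{\mathrm{err}_m}, \\
&\qquad w_i \leftarrow w_i \exp\!\big(\alpha_m \, I(y_i \neq G_m(x_i))\big). \\
&\text{Ensemble after } m \text{ trees: } F_m(x) = \operatorname{sign}\Big(\sum_{k=1}^{m} \alpha_k G_k(x)\Big). \\
&\text{Test error at iteration } m: \ \frac{1}{N_{\text{test}}} \sum_{j=1}^{N_{\text{test}}} I\big(y_j \neq F_m(x_j)\big).
\end{align*}
```

Here M = 1000, and each G_m is a tree of depth 1 or 2 grown with the parameters listed above.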
The example code is given below. The core adaboost function has been removed from the example; you should implement it yourself.
In [1]: niter <- 1000 # number of trees to grow in adaboost
# read training data
data_train <- read.table("/Users/cmx/Documents/courses/HKUST/TA-Machine-Learning/spring_2021/hw3_co
y <- data_train$y
x <- data.matrix(data_train[,-1])
# read testing data
data_test <- read.table("/Users/cmx/Documents/courses/HKUST/TA-Machine-Learning/spring_2021/hw3_cod
hold.out.y <- data_test$y
hold.out.x <- data.matrix(data_test[,-1])
In [3]: # fit adaboost with treedepth 1.
fit_adaboost1 <- adaboost(x,y,hold.out.x,m = niter,treedepth = 1) # You should implement this 'ada
yh_adaboost1 <- predict(fit_adaboost1,hold.out.x)
plot(colMeans(yh_adaboost1$prediction!=hold.out.y),type="l",xlab = "Iterations (trees)",ylab="Miscl
# misclassification rate of the final model
mean(yh_adaboost1$prediction[,niter]!=hold.out.y)
# misclassification rate of the best model
min(colMeans(yh_adaboost1$prediction!=hold.out.y))
0.196
0.162
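The plotting line above takes column means of a prediction matrix with one column per iteration. Below is a minimal sketch of how those per-iteration error rates could be computed. It is written in plain Python (one of the two languages the assignment allows) rather than R, so that it runs without extra packages. The function name `staged_error_rates`, the -1/+1 label coding, and the toy inputs are assumptions for illustration, not part of the handout. Tree fitting and the weight updates are deliberately left out.

```python
def staged_error_rates(stage_preds, alphas, y_true):
    """Misclassification rate of the ensemble after each boosting round.

    stage_preds[t][i] is the round-t tree's vote (-1 or +1) for test point i,
    alphas[t] is that tree's weight, and y_true[i] is the true label (-1 or +1).
    """
    n = len(y_true)
    score = [0.0] * n  # running weighted vote for each test point
    rates = []
    for h_t, alpha_t in zip(stage_preds, alphas):
        for i in range(n):
            score[i] += alpha_t * h_t[i]
        # The ensemble prediction is the sign of the running score (ties go to +1).
        errors = sum((1 if s >= 0 else -1) != y for s, y in zip(score, y_true))
        rates.append(errors / n)
    return rates


# Toy example: three hypothetical rounds on four test points.
y_test = [1, -1, 1, -1]
stage_preds = [
    [1, 1, 1, -1],    # round 1 misclassifies point 2
    [1, -1, -1, -1],  # round 2 misclassifies point 3
    [-1, -1, 1, -1],  # round 3 misclassifies point 1
]
alphas = [0.5, 0.4, 0.3]
rates = staged_error_rates(stage_preds, alphas, y_test)
print(rates)       # [0.25, 0.25, 0.0]
print(rates[-1])   # rate of the final model
print(min(rates))  # rate of the best model
```

In the R example, the same curve is `colMeans(yh_adaboost1$prediction != hold.out.y)`: its last entry is the final model's rate and its minimum is the best model's rate.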
In [4]: # fit adaboost with treedepth 2.
fit_adaboost2 <- adaboost(x,y,hold.out.x,m = niter,treedepth = 2) # You should implement this 'adab
yh_adaboost2 <- predict(fit_adaboost2,hold.out.x)
plot(colMeans(yh_adaboost2$prediction!=hold.out.y),type="l",xlab = "Iterations (trees)",ylab="Miscl
# misclassification rate of the final model
mean(yh_adaboost2$prediction[,niter]!=hold.out.y)
# misclassification rate of the best model
min(colMeans(yh_adaboost2$prediction!=hold.out.y))
0.174
0.162