Date: 2021-04-12

Homework 3 Statistical Machine Learning II, Semester B 2020/2021

Notes: Please upload all your code with your assignment to Canvas before 7pm on April 14, 2021. Homework must be neatly written up or typed for submission. I reserve the right to refuse homework that is deemed (by me) to be excessively messy.

1. PCA. Consider the monthly log stock returns, in percentages and including dividends, of Merck & Company, Johnson & Johnson, General Electric, General Motors, Ford Motor Company, and the value-weighted index from January 1960 to December 1999; see the file m-mrk2vw.txt, which has six columns in the order listed above.

(a) Perform a principal component analysis of the data using the sample covariance matrix. Try different numbers of principal components and report the proportion of variance explained in each case.
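A minimal sketch of the computation in part (a), assuming NumPy. The data below are synthetic and only stand in for the 480 x 6 matrix of returns; the function name `pca_covariance` is illustrative and not part of the assignment.

```python
import numpy as np

def pca_covariance(X):
    """PCA from the sample covariance matrix: eigenvalues, loadings, scores."""
    Xc = X - X.mean(axis=0)                # center each column
    S = np.cov(Xc, rowvar=False)           # sample covariance (divides by n - 1)
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                  # principal component scores
    return eigvals, eigvecs, scores

# Synthetic stand-in for the returns matrix (NOT the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(480, 6)) @ rng.normal(size=(6, 6))

eigvals, loadings, scores = pca_covariance(X)
explained = eigvals / eigvals.sum()        # proportion of variance per component
cumulative = np.cumsum(explained)
for k in range(1, len(eigvals) + 1):
    print(f"{k} component(s): cumulative variance explained = {cumulative[k - 1]:.3f}")
```

With the real file, `X` could be read with something like `np.loadtxt("m-mrk2vw.txt")`, assuming the file is purely numeric with no header row.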

(b) Perform a principal component analysis of the data using a radial (Gaussian) kernel. Try different numbers of principal components and use cross-validation (CV) to tune the kernel parameter σ.
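One possible way to set up part (b), sketched with NumPy only. The assignment does not say which CV criterion to use; the sketch assumes held-out reconstruction error in the input space, with a ridge regression from the kernel PC scores back to the inputs as a simple stand-in for a pre-image method. The data are synthetic, and all function names and the σ grid are illustrative.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Radial kernel k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kpca_fit(X, sigma, n_components):
    K = rbf_kernel(X, X, sigma)
    # Double-center the Gram matrix (centering in feature space).
    Kc = K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()
    vals, vecs = np.linalg.eigh(Kc)
    top = np.argsort(vals)[::-1][:n_components]
    vals, vecs = np.maximum(vals[top], 1e-12), vecs[:, top]
    alphas = vecs / np.sqrt(vals)          # feature-space axes get unit norm
    return {"X": X, "K": K, "alphas": alphas, "sigma": sigma}

def kpca_transform(model, Xnew):
    K = model["K"]
    Kt = rbf_kernel(Xnew, model["X"], model["sigma"])
    # Center the new kernel rows with the training-set means.
    Kt_c = Kt - Kt.mean(axis=1, keepdims=True) - K.mean(axis=0, keepdims=True) + K.mean()
    return Kt_c @ model["alphas"]

def cv_reconstruction_error(X, sigma, n_components, n_folds=5, ridge=1e-3, seed=0):
    """Assumed criterion: held-out MSE of inputs rebuilt from kernel PC scores."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errs = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
        Xtr, Xte = X[train_idx], X[test_idx]
        model = kpca_fit(Xtr, sigma, n_components)
        Ztr, Zte = kpca_transform(model, Xtr), kpca_transform(model, Xte)
        A = np.column_stack([np.ones(len(Ztr)), Ztr])      # intercept + scores
        B = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ Xtr)
        Xhat = np.column_stack([np.ones(len(Zte)), Zte]) @ B
        errs.append(np.mean((Xte - Xhat) ** 2))
    return float(np.mean(errs))

# Synthetic, standardized stand-in for the returns (NOT the real data).
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)

sigmas = [0.5, 1.0, 2.0, 4.0, 8.0]                         # illustrative grid
cv_scores = {s: cv_reconstruction_error(X, s, n_components=3) for s in sigmas}
best_sigma = min(cv_scores, key=cv_scores.get)
print(cv_scores, best_sigma)
```

The same loop can be repeated over several values of `n_components`. Other criteria, such as the performance of a downstream predictor built on the scores, are equally defensible.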

2. Neural networks. The file daily0005.txt (available on Canvas) contains daily data on several stocks from 2000 to 2005. We will focus only on the data from IBM. As an initial step in this problem, you should use daily0005.txt to create a dataset containing daily log returns on IBM for the available time period. In what follows, use the data from the last 252 days as the test set, and the rest as the training set.
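The data preparation step can be sketched as follows, assuming NumPy. The price series here is a synthetic random walk standing in for IBM's prices, and the helper names are illustrative.

```python
import numpy as np

def log_returns(prices):
    """Daily log returns r_t = log(P_t) - log(P_{t-1})."""
    return np.diff(np.log(np.asarray(prices, dtype=float)))

def split_last(series, n_test=252):
    """Hold out the last n_test observations as the test set."""
    return series[:-n_test], series[-n_test:]

# Synthetic geometric random walk standing in for the IBM price column.
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.02, size=1500)))

r = log_returns(prices)
train, test = split_last(r)
print(len(r), len(train), len(test))   # 1499 1247 252
```

Note that a series of n prices yields n - 1 log returns, so the first trading day has no return.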

(a) Fit a neural network for predicting the daily log returns, using the three previous log returns as inputs (take the log of column 4, PRC). Do not use weight decay, and use M hidden units. For each 1 ≤ M ≤ 12, (i) use ten random starting values for the weights; (ii) for each starting value, compute the test error under the squared loss; and (iii) produce a box plot of the ten test errors. Finally, put these twelve box plots together in one graph. Comment on your findings.
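The experiment in part (a) could be organized as in the sketch below. It assumes a single hidden layer with tanh units and a linear output, trained by plain full-batch gradient descent in NumPy; the assignment does not prescribe a library or optimizer, so these choices, the learning rate, and the function names are all assumptions. The returns are synthetic standard normal noise, not the IBM data.

```python
import numpy as np

def make_lagged(r, n_lags=3):
    """Each row holds (r_{t-3}, r_{t-2}, r_{t-1}); the target is r_t."""
    X = np.column_stack([r[i:len(r) - n_lags + i] for i in range(n_lags)])
    return X, r[n_lags:]

def fit_nn(X, y, M, decay=0.0, seed=0, lr=0.02, epochs=200):
    """Minimize mean squared error + decay * (sum of squared weights).
    Biases are not penalized. The seed sets the random starting weights."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-0.7, 0.7, size=(X.shape[1], M))
    b1 = np.zeros(M)
    w2 = rng.uniform(-0.7, 0.7, size=M)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                   # hidden activations, n x M
        g_out = 2.0 * (H @ w2 + b2 - y) / n        # d(MSE) / d(prediction)
        g_H = np.outer(g_out, w2) * (1.0 - H**2)   # backpropagate through tanh
        g_W1 = X.T @ g_H + 2.0 * decay * W1
        g_w2 = H.T @ g_out + 2.0 * decay * w2
        W1 -= lr * g_W1
        b1 -= lr * g_H.sum(axis=0)
        w2 -= lr * g_w2
        b2 -= lr * g_out.sum()
    return W1, b1, w2, b2

def predict(params, X):
    W1, b1, w2, b2 = params
    return np.tanh(X @ W1 + b1) @ w2 + b2

# Synthetic standardized "returns" (NOT the IBM data).
rng = np.random.default_rng(0)
r = rng.normal(size=800)
X, y = make_lagged(r)
X_tr, y_tr, X_te, y_te = X[:-252], y[:-252], X[-252:], y[-252:]

test_errors = {}
for M in range(1, 13):
    errs = []
    for seed in range(10):                         # ten random starting values
        params = fit_nn(X_tr, y_tr, M, decay=0.0, seed=seed)
        errs.append(float(np.mean((predict(params, X_te) - y_te) ** 2)))
    test_errors[M] = errs
```

If matplotlib is available, `plt.boxplot([test_errors[M] for M in range(1, 13)])` draws the twelve box plots in one graph. With real returns, standardizing the series using training-set statistics tends to make gradient descent better behaved.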

(b) Fix the number of hidden units at M = 12. Try different weight decay parameters, produce a box plot for each weight decay parameter, and put these box plots in one graph. Comment on your findings.
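For reference, weight decay is commonly implemented as an L2 penalty added to the squared-error loss. In the sketch below, $\lambda \ge 0$ is the weight decay parameter, $\theta$ collects the network weights, and $f_\theta$ is the fitted network; this notation is assumed here and does not appear in the assignment.

```latex
J(\theta) = \sum_{t} \bigl( r_t - f_\theta(r_{t-1}, r_{t-2}, r_{t-3}) \bigr)^2
          + \lambda \sum_{k} \theta_k^2
```

Setting $\lambda = 0$ recovers the unpenalized fit of part (a); larger values shrink the weights toward zero.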

(c) Choose the best weight decay parameter from part (b). Repeat part (a) with this weight decay. Produce a graph with twelve box plots and comment on your findings.

3. Latent Factor Model (Open problem). In the latent factor model for recommender systems, we assume that the rating of user i for item j is modeled by

$$r_{ij} = u_i^\top v_j + \epsilon_{ij},$$

where $u_i \in \mathbb{R}^k$ and $v_j \in \mathbb{R}^k$ are the user factor and the item factor, respectively. The resulting minimization problem is

$$\min_{U,V} \sum_{(i,j)\in\Omega} \left(r_{ij} - u_i^\top v_j\right)^2 + \lambda \sum_{i=1}^{n} \|u_i\|^2 + \lambda \sum_{j=1}^{m} \|v_j\|^2$$
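The assignment does not prescribe a solver for this problem. One common choice is alternating least squares (ALS), which updates each $u_i$ and each $v_j$ in turn by solving a small ridge regression. The sketch below assumes NumPy, a dense ratings matrix `R` with a 0/1 `mask` marking the observed set Ω, and synthetic data; all names are illustrative.

```python
import numpy as np

def objective(R, mask, U, V, lam):
    """Squared error over observed entries plus the two ridge penalties."""
    resid = (R - U @ V.T) * mask
    return (resid**2).sum() + lam * (U**2).sum() + lam * (V**2).sum()

def als(R, mask, k, lam, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.normal(scale=0.1, size=(n, k))
    V = rng.normal(scale=0.1, size=(m, k))
    history = [objective(R, mask, U, V, lam)]
    for _ in range(n_iters):
        for i in range(n):                      # update user factors, V fixed
            obs = mask[i] > 0
            Vi = V[obs]
            U[i] = np.linalg.solve(Vi.T @ Vi + lam * np.eye(k), Vi.T @ R[i, obs])
        for j in range(m):                      # update item factors, U fixed
            obs = mask[:, j] > 0
            Uj = U[obs]
            V[j] = np.linalg.solve(Uj.T @ Uj + lam * np.eye(k), Uj.T @ R[obs, j])
        history.append(objective(R, mask, U, V, lam))
    return U, V, history

# Synthetic low-rank ratings with about half of the entries observed.
rng = np.random.default_rng(0)
n, m, k = 30, 20, 3
R = rng.normal(size=(n, k)) @ rng.normal(size=(k, m)) + 0.1 * rng.normal(size=(n, m))
mask = (rng.random((n, m)) < 0.5).astype(float)

U, V, history = als(R, mask, k=3, lam=0.1)
print(history[0], history[-1])
```

Each update minimizes the objective exactly in one block of variables, so the recorded objective values should never increase from one sweep to the next.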

In the above model, the user factor and the item factor are both assumed to be of the same dimension k. However, this assumption is somewhat artificial. In general, the number of latent user features need not equal the number of latent item features. Suppose now that each user is governed by p features $u_i = (u_{i,1}, u_{i,2}, \ldots, u_{i,p})$ and each movie by q features $v_j = (v_{j,1}, v_{j,2}, \ldots, v_{j,q})$, with $p \neq q$. How can we adjust the latent factor model? Justify your answer.
