IEOR 165 – Course Project
Due Friday, May 7, 2021
Instructions:
The course project must be submitted on bCourses as a PDF file. You are allowed to consult and
discuss with classmates and others, but each student must submit their own project writeup and
code. The project will be graded on the basis of the quality of the modeling approach. You can
use whichever software and libraries/packages you would like, and are not expected to implement
statistical estimation algorithms yourself.
The authors of the following research paper:
P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, “Modeling wine preferences by data mining
from physicochemical properties”, Decision Support Systems, vol. 47, no. 4, pp. 547-553, 2009.
considered the problem of modeling wine preferences. Wine can be evaluated by experts who
give a subjective score, and the question the authors of this paper considered was how to build
a model that relates objective features of the wine (e.g., pH values) to its rated quality. For this
project, we will use the data set available at:
http://courses.ieor.berkeley.edu/ieor165/homeworks/winequality-red.csv
Use the following methods to identify the coefficients of a linear model relating wine quality to
different features of the wine: (1) ordinary least squares (OLS), (2) ridge regression (RR), (3)
lasso regression. Make sure to include a constant (intercept) term in your model, and choose the
tuning parameters using cross-validation. You may use any programming language you like.
For your solutions, please include (i) plots of tuning parameters versus cross-validation error,
(ii) tables of coefficients (labeled by the feature) computed by each method, and (iii) the source
code used to generate the plots and coefficients. Some hints are below:
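• A constant (intercept) term can be included in OLS by solving
\[
\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta} \end{bmatrix}
= \arg\min_{\beta_0, \beta}
\left\| Y - \begin{bmatrix} \mathbf{1}_n & X \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta \end{bmatrix} \right\|_2^2
\]
• RR and lasso each have one tuning parameter.
• RR (with an intercept term) can be formulated as
\[
\begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta} \end{bmatrix}
= \arg\min_{\beta_0, \beta}
\left\| \begin{bmatrix} Y \\ 0 \end{bmatrix}
- \begin{bmatrix} \mathbf{1}_n & X \\ 0 & \mu \cdot I \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta \end{bmatrix} \right\|_2^2,
\]
where $\mu$ is a tuning parameter.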
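
The hints above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not a reference solution: it uses NumPy, runs on synthetic stand-in data rather than the wine data set, and the function names (fit_ols, fit_ridge_augmented, kfold_cv_error), the 5-fold split, and the grid of µ values are all hypothetical choices made for the example. Expanding the stacked norm in the RR hint gives the residual sum of squares plus µ² times the squared norm of β, so the intercept β0 is not penalized and µ enters the penalty squared.

```python
import numpy as np

def fit_ols(X, y):
    """OLS with an intercept: least squares on the design matrix [1_n, X]."""
    n = X.shape[0]
    A = np.hstack([np.ones((n, 1)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the feature coefficients

def fit_ridge_augmented(X, y, mu):
    """RR via the stacked system [[1_n, X], [0, mu*I]] with target [y; 0]."""
    n, p = X.shape
    top = np.hstack([np.ones((n, 1)), X])
    bottom = np.hstack([np.zeros((p, 1)), mu * np.eye(p)])  # zero column: intercept unpenalized
    A = np.vstack([top, bottom])
    b = np.concatenate([y, np.zeros(p)])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef

def kfold_cv_error(X, y, mu, k=5, seed=0):
    """Mean squared prediction error of the RR fit, averaged over k held-out folds."""
    n = X.shape[0]
    idx = np.random.default_rng(seed).permutation(n)
    errors = []
    for held_out in np.array_split(idx, k):
        train = np.setdiff1d(idx, held_out)
        coef = fit_ridge_augmented(X[train], y[train], mu)
        pred = coef[0] + X[held_out] @ coef[1:]
        errors.append(np.mean((y[held_out] - pred) ** 2))
    return float(np.mean(errors))

# Synthetic stand-in data (assumption: NOT the wine data set)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 1.5 + X @ np.array([2.0, 0.0, -1.0, 0.5]) + 0.1 * rng.normal(size=200)

beta_ols = fit_ols(X, y)
beta_rr = fit_ridge_augmented(X, y, mu=3.0)

# Hypothetical grid of tuning parameters; pick the one with the lowest CV error
mus = np.logspace(-2, 2, 9)
cv_errors = [kfold_cv_error(X, y, mu) for mu in mus]
best_mu = mus[int(np.argmin(cv_errors))]
```

Plotting cv_errors against mus (for example on a logarithmic horizontal axis) gives the kind of tuning-parameter plot requested in item (i). Lasso has no comparable stacked least-squares form, so in practice it would be fit with a library solver.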