GU4241/GR5241: Statistical Machine Learning
Date: 2023-04-14
STAT GU4241/GR5241 Uni:
OPTIONAL Midterm-Resubmit Spring 2023 Name:
Total number of points 100.
Show work to get full credit. No cell phones, laptops, watches, calculators. You do NOT need to simplify your answers.
You can answer and choose to submit this exam in one of the three different formats below:
i) You can print the exam file, handwrite your answers (unless you have ODS accommodations that state otherwise), scan the
handwritten exam, and upload it to CourseWorks as a single .pdf document, OR
ii) You can handwrite your answers on the exam file on a tablet/computer etc. and upload the handwritten exam as a
single .pdf document to CourseWorks, OR
iii) You can answer the questions on clean, neat sheets of paper (white or from your notebook), in the order they
appear in the exam. You do NOT need to write the questions down; handwriting the solutions and showing your work is
sufficient. Scan your answers as you do with your HWs and upload them to CourseWorks as a single .pdf document. In this case, you
should write on the first page of your exam answers ‘I have read the instructions on the first page of the exam and
understand them’, and add your name, last name, UNI, and signature underneath the statement.
After you resubmit the exam on CourseWorks, you should make sure it uploaded correctly.
Note that:
i) There were two equivalent versions of the in-class midterm, version a, and version b. For the optional resubmit you
can choose to resubmit either version a or version b. It does not have to be the version of the exam you completed in-
class.
ii) You should work on the exam yourself, and you should not discuss the exam questions with your classmates or
anyone else,
iii) You can use any material posted on the course CourseWorks site (HWs, HW solutions, lecture slides etc.), your own
notes from the lectures and the course textbook, however, you should not use other resources from the internet or
from anywhere else,
iv) You can use a calculator if you like, however you do not need to,
v) You have until the submission deadline to work on the exam and submit it, and the same late submission penalties as
the HWs apply to the exam,
vi) The teaching team does not take questions on the exam before the exam deadline. If you have remarks or questions,
make assumptions in answering a question, or find an error, you should make a note of it in your answer to that
question. These will be taken into account during the grading process.
vii) You should not upload the exam to any site, or share the exam with anyone else, as this would constitute copyright
infringement, and is both illegal and a violation of the student honor code.
Note that:
If your TA reports a suspicion of cheating, it will be automatically forwarded to the Committee on Student
Conduct for an investigation and a final decision, even though this is an optional resubmit. If the committee determines
that cheating has taken place, the students involved will receive a grade of F in the course.
Please sign below indicating that you have read the above instructions and understand them:
‘I have read the above instructions and understand them’
Name, Last Name UNI Signature
1) (12 points) Explain the one-standard-error rule in the context of model selection through CV.
2) (48 points, 4 points each) Circle the correct choice:
i) In general, Least Squares has less interpretability than Lasso.
TRUE FALSE
ii) Ideally, in model selection one wants to choose the model that gives the lowest training MSE.
TRUE FALSE
iii) As one uses more flexible methods, training MSE decreases.
TRUE FALSE
iv) If p>n, backward selection yields a better result.
TRUE FALSE
v) It is possible for collinearity to exist between three or more variables even if no pair of variables has a high correlation.
TRUE FALSE
vi) LOOCV has lower bias than k-fold CV.
TRUE FALSE
vii) Like Lasso, principal components regression can be seen as a feature selection method.
TRUE FALSE
viii) Among best subset, forward stepwise, and backward stepwise selection methods applied on a single data set, one
would expect forward stepwise to have the smallest training RSS.
TRUE FALSE
ix) In the context of regression modeling, regularization makes a model more flexible.
TRUE FALSE
x) The bootstrap method uses sampling without replacement in creating bootstrap samples.
TRUE FALSE
xi) PLS regression places the highest weight on the variables most correlated with the response variable.
TRUE FALSE
xii) In AdaBoost, the test error typically keeps decreasing even after the training error has stabilized at its minimal value.
TRUE FALSE
3) (15 points, Each blank 3 points) Fill in the blanks.
a) As one uses more flexible methods, the variance will …………..…….. and the bias will …………..……..
b) Observations that have unusual predictor values are called ………………………..…………
c) Rather than inspecting the correlation matrix, a better way to assess collinearity is to compute …………………………..….
d) One uses …………………………………… to control overfitting in decision trees.
4) (6 points) List three of the four approaches we have seen in class to adjust the training error for the model size.
i)
ii)
iii)
5) (12 points) Explain how you would code the qualitative predictor School Year, which has three levels (freshman,
sophomore, upper-class), in a linear model.
6) (7 points) In Bagging, the trees produced by different bootstrap samples can be very correlated. How do Random
Forests address this issue?
-----------------------------------------End of Midterm STAT GU4241/GR5241 Spring 2023 Resubmit--------------------------------------

