Date: 2021-04-22

STAR534 - Homework 7

This is a more open-ended assignment than some of the previous assignments. In general, use good data
analysis practices. For example, when you are required to fit a GAM model, decide whether the default
smoothing parameters seem appropriate. If not, change that setting and refit your model. Turn in your final
model and describe the settings you used. In your answers, don’t provide a diary of all the things you tried
that didn’t work. Instead, write a succinct summary describing your final model for each problem.

Recall problem 2 from Homework 2: You are the head data scientist for a bank. Your goal is to predict
whether or not the bank should loan money to an applicant.

The data are based on the article: Min Li, Amy Mickel & Stanley Taylor (2018) “Should This Loan be
Approved or Denied?”: A Large Dataset with Class Assignment Guidelines, Journal of Statistics Education,
26:1, 55-66, https://doi.org/10.1080/10691898.2018.1434342. In Li et al. (2018) the authors explain various
aspects of the data.

Your goal is to build a model to predict the loan status. Loan status is the MIS_Status variable. For the
training data, recode this variable so that loans that were paid off are coded 1 and all other loans are coded 0.
For the test data, this column has already been recoded. Use the test data only for evaluating predictive
performance. Please don’t use the test data for any model building, as that’s cheating.
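
The recoding step can be sketched in R as follows. This is only an illustration on a toy data frame: the labels "P I F" and "CHGOFF" and the new column name `paid` are assumptions, so inspect the actual values of MIS_Status before relying on them.

```r
# Toy stand-in for the training data. The labels "P I F" (paid in full) and
# "CHGOFF" (charged off) are assumed; check unique(train$MIS_Status) first.
train <- data.frame(
  MIS_Status = c("P I F", "CHGOFF", "P I F", NA, "CHGOFF"),
  stringsAsFactors = FALSE
)

# 1 = paid off, 0 = otherwise. A missing status stays NA here instead of
# being silently counted as 0; drop or handle those rows deliberately.
train$paid <- as.integer(train$MIS_Status == "P I F")

# Cross-tabulate old and new codes to confirm the mapping.
print(table(train$MIS_Status, train$paid, useNA = "ifany"))
```

Keeping the original column next to the recoded one makes the cross-tabulation a cheap sanity check before any model is fit.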

I recommend that you do some exploratory data analysis to re-familiarize yourself with these data. Use the
training data for fitting the models (problems 1-3) and the test data for predictive performance (problems 4-5).

1. Fit a logistic regression model to predict loan status using predictors 26-31. Provide the summary
table for your model.

2. Fit a GAM model to these data using smoothing splines. Note: you’ll need to think about this one a
bit. You can’t just run the command gam(y~.).
a. Report the summary table of the results. Write a few sentences comparing the GAM results to
the results from the logistic regression model.
b. Plot the model results (e.g., plot(my.gam)). Discuss the plots: are the smoothed results interpretable?
How do you interpret the plots for the binary predictors?

3. Fit a third type of model to these data. Use whatever model you think is appropriate. The model may
be one we considered in this class or some other model. Report the results in a way that is appropriate
for your model (table and/or plots).

4. For the test data, compare the predictive performance of the models.
a. What is an appropriate measure of predictive performance for these data, and why did you choose it?
b. Report a table of the results.
c. Which model provides the best predictions?

5. Now consider the larger dataset. Fit a model with more of the predictors. Try to improve on the
predictive performance of your best model from question 4. Describe your final model with plots
and/or a table, as appropriate.

6. What did you learn with this assignment? If your answer is ‘nothing’, then go back and redo problem 3
by fitting a model that is new to you.
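
As a rough illustration of the workflow in problems 1, 2 and 4, the sketch below fits a logistic regression and a GAM on simulated training data and scores both on simulated test data. Everything in it is an assumption made for the example: the predictors `x1`, `x2` and `flag` are hypothetical, the data are simulated rather than the loan data, and the mgcv package is used for convenience. Note that mgcv's `s()` fits penalized regression splines, whereas the gam package's `s()` fits smoothing splines, so the course may intend the latter. AUC is used here as one reasonable measure, not necessarily the one the assignment expects.

```r
library(mgcv)  # assumed GAM package; the course may use the gam package instead

set.seed(534)

# Simulated stand-in for the loan data: x1 and x2 are continuous, flag is binary.
simulate_loans <- function(n) {
  x1   <- rnorm(n)
  x2   <- runif(n)
  flag <- rbinom(n, 1, 0.3)
  eta  <- 0.5 + 1.2 * x1 - 8 * (x2 - 0.5)^2 - 0.8 * flag
  data.frame(paid = rbinom(n, 1, plogis(eta)), x1, x2, flag)
}
train <- simulate_loans(1000)
test  <- simulate_loans(500)

# Logistic regression: every predictor enters linearly on the logit scale.
fit_glm <- glm(paid ~ x1 + x2 + flag, family = binomial, data = train)

# GAM: smooth the continuous predictors only. A 0/1 variable has too few
# distinct values to support a smooth, so it stays a parametric term.
fit_gam <- gam(paid ~ s(x1) + s(x2) + flag, family = binomial,
               data = train, method = "REML")

# AUC via the rank (Mann-Whitney) formula: the probability that a randomly
# chosen paid loan gets a higher score than a randomly chosen unpaid one.
auc <- function(score, y) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Score both models on the held-out test data only.
p_glm <- predict(fit_glm, newdata = test, type = "response")
p_gam <- as.numeric(predict(fit_gam, newdata = test, type = "response"))

results <- data.frame(
  model    = c("logistic", "GAM"),
  auc      = c(auc(p_glm, test$paid), auc(p_gam, test$paid)),
  accuracy = c(mean((p_glm > 0.5) == test$paid),
               mean((p_gam > 0.5) == test$paid))
)
print(results)
```

Calling `summary(fit_glm)` and `summary(fit_gam)` gives the summary tables, and `plot(fit_gam, pages = 1, all.terms = TRUE)` draws the smooths together with the parametric effect of the binary predictor. Listing the terms explicitly, rather than writing `gam(y ~ .)`, is what lets each predictor be treated as either smooth or parametric.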
