Date: 2021-10-10

DATA 311 Midterm Practice Problems
1. Describe what the “Bayes classifier” is. Will it result in 0 misclassifications?
2. What is a p-value?
3. What is wrong with the following analysis, and how would you fix it?
The data give the chemical composition of ancient pottery found at four sites in Great Britain. We will fit
a linear model with Calcium (Ca) as the response using Site and Magnesium (Mg) as predictors.
> ###Load data
> library(car)
> data(Pottery)
> ###Recode Site
> Pottery$Site <- as.numeric(Pottery$Site)
> Pottery$Site
[1] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 2 2 3 3 3 3 3 1 1 1 1 1
> plm <- lm(Pottery$Ca~Pottery$Site+Pottery$Mg)
> summary(plm)
Call:
lm(formula = Pottery$Ca ~ Pottery$Site + Pottery$Mg)

Residuals:
     Min       1Q   Median       3Q      Max
-0.08619 -0.02989 -0.01557  0.02959  0.09908

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.070299   0.029655   2.371   0.0265 *
Pottery$Site -0.025603   0.012809  -1.999   0.0576 .
Pottery$Mg    0.049344   0.007037   7.012 3.81e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.05256 on 23 degrees of freedom
Multiple R-squared:  0.752,  Adjusted R-squared:  0.7304
F-statistic: 34.87 on 2 and 23 DF,  p-value: 1.088e-07
4. What are the inferential assumptions for simple linear regression? Suppose we knew the predictor X was
uniformly distributed. Would that violate the assumptions?
5. Describe one nonparametric method we have introduced to perform regression.
6. Why should we not use R^2 to compare the performance of linear regression models with differing numbers
of predictors?
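
As a study aid for the last problem, a small simulation may help build intuition. The sketch below uses made-up data: the seed, the sample size, and the names x1 and z1 to z5 are all assumptions chosen for illustration, and the z columns are pure noise unrelated to the response. It fits a sequence of nested models and records both the multiple and the adjusted R-squared from summary().

```r
# Hypothetical simulated data: y depends on x1 only; z1..z5 are pure noise.
set.seed(311)
n <- 50
x1 <- rnorm(n)
y <- 1 + 2 * x1 + rnorm(n)
noise <- matrix(rnorm(n * 5), nrow = n,
                dimnames = list(NULL, paste0("z", 1:5)))
dat <- data.frame(y = y, x1 = x1, noise)

# Fit nested models, adding one noise predictor at a time.
r2 <- numeric(6)
adj_r2 <- numeric(6)
for (k in 0:5) {
  predictors <- c("x1", if (k > 0) paste0("z", seq_len(k)))
  fit <- lm(reformulate(predictors, response = "y"), data = dat)
  s <- summary(fit)
  r2[k + 1] <- s$r.squared
  adj_r2[k + 1] <- s$adj.r.squared
}

# One row per model: number of noise predictors, R^2, adjusted R^2.
print(round(cbind(n_noise = 0:5, r2, adj_r2), 4))
```

For nested least-squares fits with an intercept, the r2 column cannot decrease as predictors are added, even though the added columns carry no information about y. The adj_r2 column has no such guarantee, so comparing the two columns is a reasonable starting point for answering the question.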