Date: 2023-05-31

MAT 3375 – Regression Analysis – Questions

1. (a) Let $U_i \sim \chi^2(r_i)$ be independent random variables with $r_1 = 5$, $r_2 = 10$. Set

$$X = \frac{U_1/r_1}{U_2/r_2}.$$

Using R, find $s$ and $t$ such that

$$P(X \le s) = 0.95 \quad \text{and} \quad P(X \le t) = 0.99.$$

(b) Let $Z \sim N(0, 1)$ and $U \sim \chi^2(10)$ be two independent random variables. Let

$$V = \frac{Z}{\sqrt{U/10}}.$$

Using R, find $w$ such that $P(V \le w) = 0.95$.
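As a hedged illustration (one possible approach, not an official solution): a ratio of independent chi-square variables, each divided by its degrees of freedom, follows an F distribution, and a standard normal divided by the square root of an independent chi-square over its degrees of freedom follows a Student t distribution. The requested values are then quantiles, available through the base R functions `qf()` and `qt()`. The variable names are illustrative.

```r
# Part (a): X = (U1/5) / (U2/10) follows an F distribution with (5, 10) degrees of freedom.
s     <- qf(0.95, df1 = 5, df2 = 10)
t_val <- qf(0.99, df1 = 5, df2 = 10)   # named t_val to avoid masking base::t()

# Part (b): V = Z / sqrt(U/10) follows a Student t distribution with 10 degrees of freedom.
w <- qt(0.95, df = 10)

print(c(s = s, t = t_val, w = w))
```

Plugging each quantile back into `pf()` or `pt()` should return the target probability, which is a quick sanity check.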

2. Let $f : \mathbb{R}^n \to \mathbb{R}$, $\mathbf{v} \in \mathbb{R}^n$, and $a \in \mathbb{R}$. Define $f(\mathbf{Y}) = \mathbf{Y}^\top \mathbf{v} + a$. Find the gradient of $f$ with respect to $\mathbf{Y}$. Write a function in R that computes $f(\mathbf{Y})$ given $\mathbf{v}$ and $a$. Evaluate the function at $\mathbf{Y} = (1, 0, -1)$, for $\mathbf{v} = (1, 2, -3)$ and $a = -2$.

Note: in the course, we will write vectors either in column format or in row format, in a more or less arbitrary way. It is up to you to determine which one makes the dimensions compatible.
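A minimal sketch of one way to approach this, not an official solution. Writing $f(\mathbf{Y}) = \sum_i Y_i v_i + a$ gives $\partial f / \partial Y_i = v_i$, so the gradient is $\mathbf{v}$. The function below treats both vectors as plain R numeric vectors, which sidesteps the row/column question; the argument names are illustrative.

```r
# f(Y) = t(Y) %*% v + a, returned as a single number
f <- function(Y, v, a) {
  stopifnot(length(Y) == length(v))  # dimensions must be compatible
  sum(Y * v) + a
}

Y <- c(1, 0, -1)
v <- c(1, 2, -3)
a <- -2

f(Y, v, a)  # 1*1 + 0*2 + (-1)*(-3) - 2 = 2
```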

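3. Let

$$A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & -1 \end{pmatrix}, \quad \boldsymbol{\mu} = (1, 0, 1), \quad \Sigma = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

and $\mathbf{Y} \sim N(\boldsymbol{\mu}, \Sigma)$.

Let $\mathbf{W} = A\mathbf{Y}$. What distribution does the random vector $\mathbf{W}$ follow? Draw a sample of size 100 for this random vector with R and plot the draws in a graph. Note: you may use the function mvrnorm() from the MASS package to help along (but you do not have to).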
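One possible sketch, offered as an illustration rather than an official solution. A linear transformation of a multivariate normal vector is again multivariate normal, with mean $A\boldsymbol{\mu}$ and covariance $A \Sigma A^\top$; for the matrices above these work out to $(1, -1)$ and $\mathrm{diag}(1, 2)$. The code computes both and samples $\mathbf{W}$ directly with `MASS::mvrnorm()`. The seed and the object names are arbitrary choices.

```r
library(MASS)  # provides mvrnorm()

A <- matrix(c(1, 1,  0,
              0, 1, -1), nrow = 2, byrow = TRUE)
mu <- c(1, 0, 1)
Sigma <- matrix(c( 2, -1, 0,
                  -1,  1, 0,
                   0,  0, 1), nrow = 3, byrow = TRUE)

# Parameters of W = A Y
mu_W    <- as.vector(A %*% mu)
Sigma_W <- A %*% Sigma %*% t(A)

set.seed(1)  # arbitrary seed, only for reproducibility
W <- mvrnorm(n = 100, mu = mu_W, Sigma = Sigma_W)  # 100 x 2 matrix, one draw per row

plot(W[, 1], W[, 2], xlab = "W1", ylab = "W2", main = "100 draws of W = AY")
```

An equivalent route is to draw 100 observations of $\mathbf{Y}$ first and then multiply each by $A$.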

4. Let $\mathbf{Y} \sim N(\mathbf{0}, 9I_4)$ and set $\bar{Y} = \frac{1}{4}(Y_1 + Y_2 + Y_3 + Y_4)$. Using R, draw 1000 observations from:

(a) $Y_1^2 + Y_2^2 + Y_3^2 + Y_4^2$

(b) $4\bar{Y}^2$

(c) $(Y_1 - \bar{Y})^2 + (Y_2 - \bar{Y})^2 + (Y_3 - \bar{Y})^2 + (Y_4 - \bar{Y})^2$

In each case, plot a histogram of the observations.
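A hedged sketch of one way to simulate the three quantities (not an official solution). Since the covariance is $9I_4$, the components are independent $N(0, 9)$ variables, so each has standard deviation 3. The seed and object names are arbitrary.

```r
set.seed(1)      # arbitrary seed, only for reproducibility
n_obs <- 1000

# Each row is one draw of (Y1, Y2, Y3, Y4); entries are independent N(0, 9), i.e. sd = 3.
Y    <- matrix(rnorm(4 * n_obs, mean = 0, sd = 3), ncol = 4)
Ybar <- rowMeans(Y)

q_a <- rowSums(Y^2)            # (a) sum of squares
q_b <- 4 * Ybar^2              # (b) four times the squared mean
q_c <- rowSums((Y - Ybar)^2)   # (c) squared deviations; Ybar is recycled down each column

par(mfrow = c(1, 3))
hist(q_a, main = "(a)", xlab = "sum of Y_i^2")
hist(q_b, main = "(b)", xlab = "4 * Ybar^2")
hist(q_c, main = "(c)", xlab = "sum of (Y_i - Ybar)^2")
```

The three quantities satisfy (a) = (b) + (c) for every draw, which gives a simple check on the simulation.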

5. Consider the function $f : \mathbb{R}^3 \to \mathbb{R}$ defined by

$$f(\mathbf{Y}) = Y_1^2 + \tfrac{1}{2} Y_2^2 + \tfrac{1}{2} Y_3^2 - Y_1 Y_2 + Y_1 + 2Y_2 - 3Y_3 - 2.$$

Using R, find the critical point(s) of $f$. If it is unique, does it give rise to a global maximum of $f$? A global minimum? A saddle point?
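A minimal sketch under the observation that $f$ is quadratic, so it can be written as $f(\mathbf{Y}) = \tfrac{1}{2}\mathbf{Y}^\top H \mathbf{Y} + \mathbf{b}^\top \mathbf{Y} + c$ with a constant Hessian $H$. The critical point then solves $H\mathbf{Y} = -\mathbf{b}$, and the signs of the eigenvalues of $H$ classify it. This is one possible approach, not an official solution; the names `H`, `b`, `crit`, and `opt` are illustrative, and the `optim()` call is only a numerical cross-check.

```r
f <- function(Y) {
  Y[1]^2 + 0.5 * Y[2]^2 + 0.5 * Y[3]^2 - Y[1] * Y[2] + Y[1] + 2 * Y[2] - 3 * Y[3] - 2
}

# Hessian and linear term read off from the formula for f
H <- matrix(c( 2, -1, 0,
              -1,  1, 0,
               0,  0, 1), nrow = 3, byrow = TRUE)
b <- c(1, 2, -3)

crit <- solve(H, -b)                     # gradient H Y + b = 0
ev   <- eigen(H, symmetric = TRUE)$values

print(crit)   # critical point
print(ev)     # all positive => H is positive definite => global minimum

# Numerical cross-check by direct minimisation from an arbitrary starting point
opt <- optim(c(0, 0, 0), f, method = "BFGS")
print(opt$par)
```

Because $H$ is invertible the critical point is unique, and because all eigenvalues are positive $f$ is strictly convex, so that point is a global minimum.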

6. (a) Identify the response variable $Y$ and the predictor variable $X$ in each of the examples shown on slides 4 and 5 of the course notes (Chapter 2). Is there a linear relationship between $X$ and $Y$? Draw the approximate line of linear fit (and give its equation).

Hint: use screenshots and software (Paint, PowerPoint, GIMP, etc.) to overlay the line.

(b) Consider the 4 examples shown on page 9 of the course notes (Chapter 2). Is the variance of the error terms constant? Are the error terms independent of each other?

7. Consider the dataset Autos.xlsx found on Brightspace. The predictor variable is VKM.q ($X$, the average daily distance driven, in km); the response variable is CC.q ($Y$, the average daily fuel consumption, in L). Use R to:

(a) display the scatterplot of $Y$ versus $X$;

(b) determine the number of observations $n$ in the dataset;

(c) compute the quantities $\sum X_i$, $\sum Y_i$, $\sum X_i^2$, $\sum X_i Y_i$, $\sum Y_i^2$;

(d) find the normal equations of the line of best fit;

(e) find the coefficients of the line of best fit (without using lm()), and

(f) overlay the line of best fit onto the scatterplot.
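A hedged sketch of steps (a) to (f). The Autos.xlsx file is not reproduced here, so the code uses simulated stand-in data; the numbers it produces say nothing about the real dataset. With the real file, `X` and `Y` would be the `VKM.q` and `CC.q` columns (loaded, for instance, with `readxl::read_excel()`, assuming that package is installed). For simple linear regression the normal equations are $n b_0 + b_1 \sum X_i = \sum Y_i$ and $b_0 \sum X_i + b_1 \sum X_i^2 = \sum X_i Y_i$.

```r
set.seed(1)
# Simulated stand-in data (NOT the Autos.xlsx values); replace with the real columns.
X <- runif(50, min = 10, max = 100)
Y <- 1 + 0.08 * X + rnorm(50, sd = 0.5)

plot(X, Y, xlab = "X", ylab = "Y")        # (a) scatterplot
n <- length(X)                            # (b) number of observations

# (c) the five sums
Sx  <- sum(X);   Sy  <- sum(Y)
Sxx <- sum(X^2); Sxy <- sum(X * Y); Syy <- sum(Y^2)

# (d) normal equations in matrix form: M %*% c(b0, b1) = rhs
M   <- matrix(c(n,  Sx,
                Sx, Sxx), nrow = 2, byrow = TRUE)
rhs <- c(Sy, Sxy)

# (e) coefficients without lm()
beta <- solve(M, rhs)
names(beta) <- c("b0", "b1")
print(beta)

abline(a = beta["b0"], b = beta["b1"])    # (f) overlay the fitted line
```

The same coefficients follow from the closed forms $b_1 = (\sum X_i Y_i - n\bar{X}\bar{Y}) / (\sum X_i^2 - n\bar{X}^2)$ and $b_0 = \bar{Y} - b_1 \bar{X}$.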

8. (continuation of the previous question) Use the R function lm() to obtain the coefficients of the line of best fit and the residuals. Show (by calculating the required quantities directly) that the first 5 properties of residuals (p. 25 in the course notes of Chapter 2) are satisfied.
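A hedged sketch of how such checks might look. The course notes are not reproduced here, so the five properties below are the standard textbook ones for least-squares residuals and are an assumption about what the notes list. The data are simulated stand-ins, not the Autos.xlsx values.

```r
set.seed(1)
# Simulated stand-in data (NOT the Autos.xlsx values)
X <- runif(50, min = 10, max = 100)
Y <- 1 + 0.08 * X + rnorm(50, sd = 0.5)

fit  <- lm(Y ~ X)
b    <- coef(fit)       # intercept and slope
e    <- resid(fit)      # residuals
Yhat <- fitted(fit)     # fitted values

# Each entry should be zero up to floating-point error.
checks <- c(
  sum_e            = sum(e),                          # residuals sum to zero
  sum_Y_minus_Yhat = sum(Y) - sum(Yhat),              # observed and fitted totals agree
  sum_X_e          = sum(X * e),                      # residuals orthogonal to the predictor
  sum_Yhat_e       = sum(Yhat * e),                   # residuals orthogonal to the fitted values
  line_at_means    = mean(Y) - (b[1] + b[2] * mean(X))  # line passes through (Xbar, Ybar)
)
print(signif(checks, 3))
```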

9. (continuation of the previous question) Using R, compute the Pearson and Spearman correlation coefficients between the predictor and the response. Is there a strong or weak linear association between these two variables? Use the correlation values and diagrams to justify your answer.
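A minimal sketch using the base R function `cor()`, again on simulated stand-in data rather than the Autos.xlsx values, so the printed correlations are illustrative only.

```r
set.seed(1)
# Simulated stand-in data (NOT the Autos.xlsx values)
X <- runif(50, min = 10, max = 100)
Y <- 1 + 0.08 * X + rnorm(50, sd = 0.5)

r_pearson  <- cor(X, Y, method = "pearson")    # strength of linear association
r_spearman <- cor(X, Y, method = "spearman")   # strength of monotone association (rank based)

print(c(pearson = r_pearson, spearman = r_spearman))

# A scatterplot supports the numerical summary.
plot(X, Y, main = sprintf("Pearson r = %.3f", r_pearson))
```

In simple linear regression the squared Pearson correlation equals the coefficient of determination $R^2$, which ties this question to the fitted line.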

10. (continuation of the previous question) Using R, find the decomposition into sums of squares for the regression.
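A hedged sketch of the decomposition $SST = SSR + SSE$, where $SST = \sum (Y_i - \bar{Y})^2$, $SSR = \sum (\hat{Y}_i - \bar{Y})^2$, and $SSE = \sum (Y_i - \hat{Y}_i)^2$. The data are simulated stand-ins, not the Autos.xlsx values, and the object names are illustrative.

```r
set.seed(1)
# Simulated stand-in data (NOT the Autos.xlsx values)
X <- runif(50, min = 10, max = 100)
Y <- 1 + 0.08 * X + rnorm(50, sd = 0.5)

fit  <- lm(Y ~ X)
Yhat <- fitted(fit)

SST <- sum((Y - mean(Y))^2)      # total sum of squares
SSR <- sum((Yhat - mean(Y))^2)   # regression (explained) sum of squares
SSE <- sum((Y - Yhat)^2)         # error (residual) sum of squares

print(c(SST = SST, SSR = SSR, SSE = SSE, SSR_plus_SSE = SSR + SSE))

# The "Sum Sq" column of the ANOVA table reports SSR and SSE.
print(anova(fit))
```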
