Date: 2021-03-13

Homework Four

STAT 4051

This assignment must be typed in an easy-to-read font. All of your answers should be in language that is easy for the investigators to understand. Do not include any ‘computer-ese’ (e.g., code or computer output) as part of your answers to parts 1-6.

In 1986, no longitudinal surveys existed in China, and all existing surveys were narrowly focused health, economic, or demographic surveys. Furthermore, no raw data from any survey had been allowed out of the country. Under China’s reform and opening-up policy, the country was being transformed from one facing famine and extreme food shortages to one in which the food supply addressed basic needs, and the initial stages of a major transformation of the food distribution and marketing system were under way. The China Health and Nutrition Survey (CHNS) was established with the goal of developing a multipurpose longitudinal survey that would allow the group to examine a series of economic, sociological, demographic, and health questions of interest. The CHNS investigators recruited households ranging from 1 to 14 members (mode 4 members) in nine provinces to participate in this comprehensive study.

We are interested in the following factors potentially related to each adult participant’s body mass index.

• $\mathrm{BMI}_i$: body mass index (in kg/m$^2$) for participant $i$ in 1989

• $\mathrm{AGE}_i$: age (in years) for participant $i$ in 1989

• $\mathrm{CAL}_i$: average daily caloric intake (in kilocalories divided by 1000) for subject $i$ in 1989

• $\mathrm{PRO}_i$: average percentage of calories from protein for subject $i$ in 1989

• $\mathrm{MALE}_i$: takes value 1 if subject $i$ is male and 0 otherwise

• $\mathrm{URBAN}_i$: takes value 1 if subject $i$ lived in an urban area in 1989 and 0 otherwise

The data are available in ASCII format on Canvas in the file ga1.dat.

After meeting with the investigators, you determine that you should fit the following model to the data to answer their questions.

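$$
\mathrm{BMI}_i = \beta_0 + \beta_1(\mathrm{AGE}_i - 30) + \beta_2(\mathrm{CAL}_i - 3) + \beta_3(\mathrm{PRO}_i - 10) + \beta_4\,\mathrm{MALE}_i + \beta_5\,\mathrm{URBAN}_i + \beta_6\,\mathrm{URBAN}_i(\mathrm{CAL}_i - 3) + \varepsilon_i \tag{1}
$$

assuming $\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$.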
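As a rough illustration of how model (1) translates into a design matrix, the sketch below builds the seven columns in the order $\beta_0, \dots, \beta_6$ and fits them by ordinary least squares. It uses simulated data, because the column layout of ga1.dat is not described here; the helper `design_matrix`, the coefficient vector `beta_true`, and all simulated values are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated stand-ins for the ga1.dat variables (the real file layout is an unknown here).
age = rng.uniform(20, 60, n)
cal = rng.normal(2.7, 0.6, n)    # kilocalories divided by 1000
pro = rng.normal(12, 2, n)       # percent of calories from protein
male = rng.integers(0, 2, n)
urban = rng.integers(0, 2, n)

# Hypothetical coefficients, used only to generate example responses.
beta_true = np.array([21.5, 0.05, 0.4, 0.03, -0.3, 0.8, 0.2])


def design_matrix(age, cal, pro, male, urban):
    """Return the n x 7 matrix whose columns match beta_0, ..., beta_6 in model (1)."""
    return np.column_stack([
        np.ones_like(age),    # intercept
        age - 30,             # centered age
        cal - 3,              # centered caloric intake
        pro - 10,             # centered protein percentage
        male,
        urban,
        urban * (cal - 3),    # urban-by-calorie interaction
    ])


X = design_matrix(age, cal, pro, male, urban)
bmi = X @ beta_true + rng.normal(0, 3, n)

# Ordinary least squares fit and the unbiased residual variance estimate.
beta_hat, *_ = np.linalg.lstsq(X, bmi, rcond=None)
resid = bmi - X @ beta_hat
sigma2_hat = resid @ resid / (n - X.shape[1])

print(np.round(beta_hat, 3), round(float(sigma2_hat), 3))
```

Centering the continuous covariates inside `design_matrix` keeps the code aligned with the way model (1) is written, so each fitted coefficient lines up with the corresponding $\beta_j$.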

1. Provide a clearly-labeled table describing the variables provided. For continuous variables, report means, standard deviations, minima, and maxima. For binary variables, provide frequencies and percentages. For each variable, be sure to indicate what percentage of observations are missing.
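The summaries requested in question 1 can be computed along the following lines. This is a minimal sketch on a few made-up records: how missing values are coded in ga1.dat is not stated, so `None` is used as an assumed missing-value marker, and the function names are hypothetical.

```python
from statistics import mean, stdev

# Made-up records for illustration; None is an assumed marker for a missing value.
records = [
    {"BMI": 21.4, "AGE": 34, "MALE": 1},
    {"BMI": 19.8, "AGE": None, "MALE": 0},
    {"BMI": None, "AGE": 52, "MALE": 0},
    {"BMI": 24.1, "AGE": 41, "MALE": 1},
]


def pct_missing(rows, name):
    """Percentage of rows in which the variable is missing."""
    return 100 * sum(r[name] is None for r in rows) / len(rows)


def summarize_continuous(rows, name):
    """Mean, SD, min, and max over observed values, plus percent missing."""
    vals = [r[name] for r in rows if r[name] is not None]
    return {
        "mean": mean(vals),
        "sd": stdev(vals),
        "min": min(vals),
        "max": max(vals),
        "pct_missing": pct_missing(rows, name),
    }


def summarize_binary(rows, name):
    """Counts and percentages of 1s and 0s among observed values, plus percent missing."""
    vals = [r[name] for r in rows if r[name] is not None]
    ones = sum(vals)
    zeros = len(vals) - ones
    return {
        "n_1": ones,
        "pct_1": 100 * ones / len(vals),
        "n_0": zeros,
        "pct_0": 100 * zeros / len(vals),
        "pct_missing": pct_missing(rows, name),
    }


for var in ("BMI", "AGE"):
    print(var, summarize_continuous(records, var))
print("MALE", summarize_binary(records, "MALE"))
```

Percentages for binary variables are computed among observed values here; reporting them over all rows instead is an equally defensible choice that should be stated in the table.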

2. Provide point estimates $(\hat{\beta}, \hat{\sigma}^2)$ and accompanying interval estimates for $\beta$. Provide clear interpretations of the associations between BMI and the predictor variables age, caloric intake, protein, gender, and urbanicity.
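For reference, the usual least-squares quantities behind question 2 are written out below. Here $X$ denotes the $n \times p$ design matrix for model (1) with $p = 7$, $y$ the vector of BMI values, and $\alpha$ the chosen error level; the assignment does not fix a confidence level, so $1 - \alpha$ is left general.

```latex
\hat{\beta} = (X^\top X)^{-1} X^\top y,
\qquad
\hat{\sigma}^2 = \frac{(y - X\hat{\beta})^\top (y - X\hat{\beta})}{n - p},
\qquad
\hat{\beta}_j \pm t_{n-p,\,1-\alpha/2}\,
\sqrt{\hat{\sigma}^2 \left[(X^\top X)^{-1}\right]_{jj}}
\quad (j = 0, \dots, 6).
```

The interval form relies on the normal-error assumption stated with model (1).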

3. Often we are interested in hypotheses of the form $C\beta = \theta_0$. Provide the $C$ and $\theta_0$ matrices needed to test the hypothesis that caloric intake is unrelated to body mass index using a single test. Carry out the test using your knowledge of linear regression, report the results (presenting the test statistic, degrees of freedom, test used, and p-value), and interpret the test results in language that someone without a statistics background can understand.
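One standard way to test a hypothesis of the form $C\beta = \theta_0$ under model (1) is the general linear F test, written out below. The symbols $q$ (the number of linearly independent rows of $C$), $n$ (the number of complete observations), and $p = 7$ are notation introduced for this sketch; the course notes may present the test in a different but equivalent form.

```latex
F = \frac{(C\hat{\beta} - \theta_0)^\top
          \left[ C (X^\top X)^{-1} C^\top \right]^{-1}
          (C\hat{\beta} - \theta_0)}{q\,\hat{\sigma}^2}
\;\sim\; F_{q,\,n-p}
\quad \text{under } H_0 : C\beta = \theta_0 .
```

When $C$ has a single row, $F$ is the square of the corresponding $t$ statistic with $n - p$ degrees of freedom.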

4. What are the typical assumptions used in estimation and inference under the model in (1)? Determine whether these assumptions appear to hold in these data to the extent possible, providing evidence to support your determination.

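5. Describe in words the hypothesis tested by each choice of $C$ and $\theta_0$ below.

(a) $C = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$, $\theta_0 = 0$

(b) $C = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$, $\theta_0 = 22$

(c) $C = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$, $\theta_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$

(d) $C = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix}$, $\theta_0 = 0$. How does the hypothesis tested by this contrast differ from the one tested by the previous contrast?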
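To see mechanically what each choice of $C$ does, the sketch below multiplies the four contrast matrices from question 5 by a coefficient vector ordered $(\beta_0, \dots, \beta_6)$. The numeric values in `beta` are hypothetical placeholders, not estimates from the data, and `apply_contrast` is a helper introduced only for this example.

```python
# Hypothetical coefficient values, ordered (beta_0, ..., beta_6) as in model (1).
beta = [21.5, 0.05, 0.4, 0.03, -0.3, 0.8, 0.2]


def apply_contrast(C, beta):
    """Return the product C @ beta, with C given as a list of rows."""
    return [sum(c * b for c, b in zip(row, beta)) for row in C]


# Each entry pairs a contrast matrix C with its hypothesized value theta_0.
contrasts = {
    "a": ([[0, 1, 0, 0, 0, 0, 0]], [0]),
    "b": ([[1, 0, 0, 0, 0, 0, 0]], [22]),
    "c": ([[0, 0, 0, 0, 0, 1, 0],
           [0, 0, 0, 0, 0, 0, 1]], [0, 0]),
    "d": ([[0, 0, 0, 0, 0, 1, 1]], [0]),
}

for label, (C, theta0) in contrasts.items():
    # len(C) is the number of restrictions the hypothesis imposes.
    print(label, "C*beta =", apply_contrast(C, beta),
          "theta_0 =", theta0, "restrictions =", len(C))
```

The number of rows of $C$ is the number of restrictions tested at once, which is the numerator degrees of freedom of the corresponding test.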

6. Read the introduction notes on Canvas, then answer the following question.

Suppose the CHNS investigators wish to use these baseline data to help design a future study of a similar population. This new study will follow 2000 men and 2000 women from age 40 to age 50, with BMI measurements taken every two years. Investigators anticipate that the correlation between any two BMI measures on an individual will be approximately 0.65 and assume that the rate of BMI change over time will be a linear increase. Using a two-sided test with $\alpha = 0.05$ and power = 0.80, what is the minimum detectable difference in the rate of BMI change between men and women under these settings? Provide two plots showing how the minimum detectable difference in slopes changes as you vary (a) $\rho$ and (b) $N$. Clearly describe all assumptions you make in order to carry out the power analysis.
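One common way to approach this kind of calculation is the textbook sample-size formula for comparing the slopes of two groups under an exchangeable (constant) within-person correlation, $m = 2(z_{1-\alpha/2} + z_{\text{power}})^2 \sigma^2 (1 - \rho) / (n s_x^2 d^2)$, solved for the slope difference $d$. The sketch below does that under several assumptions that are not given in the assignment: measurements at ages 40, 42, ..., 50 (six per person), $N$ read as the number of participants per group, and a placeholder standard deviation `sigma = 3.0` that would in practice come from the baseline data. The introduction notes on Canvas may use a different formula.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_slope_diff(n_per_group, times, rho, sigma,
                              alpha=0.05, power=0.80):
    """Smallest detectable difference in slopes (BMI units per year).

    Assumes equal group sizes, a common measurement schedule, a constant
    correlation rho between any two measures on a person, and a two-sided test.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    t_bar = sum(times) / len(times)
    ss_t = sum((t - t_bar) ** 2 for t in times)  # equals n * s_x^2 in the formula
    return sqrt(2 * z_sum ** 2 * sigma ** 2 * (1 - rho) / (n_per_group * ss_t))


times = [40, 42, 44, 46, 48, 50]  # assumed schedule: every two years, ages 40 to 50
sigma = 3.0                       # hypothetical SD of BMI; replace with a data-based value

d = min_detectable_slope_diff(2000, times, rho=0.65, sigma=sigma)
print(f"minimum detectable slope difference: {d:.4f} BMI units per year")

# Grids that could feed the two requested plots.
by_rho = [(r / 10, min_detectable_slope_diff(2000, times, r / 10, sigma))
          for r in range(1, 10)]
by_n = [(m, min_detectable_slope_diff(m, times, 0.65, sigma))
        for m in (250, 500, 1000, 2000, 4000)]
```

The lists `by_rho` and `by_n` hold (x, y) pairs that can be passed to any plotting tool; under this formula the detectable difference shrinks as either $\rho$ or $N$ grows.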

7. To facilitate reproducible research, a single file containing all code to read in the data (processing code) and reproduce the analysis (analytic code) should be uploaded to Canvas. This code should be clearly documented so that a colleague can run it without making any edits.

We are very thankful to Prof. Amy Herring of Duke University, who provided this interesting data set.
