STAT5002: Introduction to Statistics - Semester 2, 2022
Submission Due Date: Friday, 4th November, 2022 (Week 13) before 11:59 pm (Sydney time)
Instructions:
1. You are required to type up your entire assignment, including any equations. If you are using Word,
you should use the equation editor for any maths notation.
2. Copy and paste relevant R code and outputs while discussing your answer in the text.
Do not put all R code and outputs at the end of the document.
3. Answer all questions in the given order; i.e., 1(a), 1(b), etc. Keep your answers clear, brief, and concise.
4. Convert your assignment to PDF and upload it to the Turnitin assignment box on Canvas.
5. Data used in this assignment are in the spreadsheet ADataset.xlsx.
6. You MUST write up solutions on your own. Students caught cheating will automatically
receive a mark of 0 and are subject to disciplinary action.
7. This assignment carries a weight of 8% towards your final mark for STAT5002.
Researchers from Baystate Medical Centre in Massachusetts were interested in identifying risk factors
associated with giving birth to a low birth weight baby (weighing less than 2500 grams). The main risk
factors of interest were the mother’s smoking status, age, and the presence of uterine irritability. The
main aim of this assignment is to analyse the data using several hypothesis tests and regression.
The columns of the file contain the following information:
Column  Name      Description
C1      LowWt     Low birth weight (= 1 if birth weight < 2500 g; = 0 if birth weight ≥ 2500 g)
C2      BirthWt   Birth weight in grams
C3      Age       Mother’s age in years
C4      MotherWt  Mother’s weight prior to pregnancy in kg
C5      Smoke     Mother’s smoking status during pregnancy (= 1 if yes; = 0 if no)
C6      UterIrr   Presence of uterine irritability (= 1 if yes; = 0 if no)
1. The researchers were interested in testing the hypothesis that the mean birth weight (BirthWt) of
these babies is greater than 2500 grams. Include mention of H0 and H1, the observed value
of the test statistic, the p-value, the decision, and a conclusion.
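As a rough illustration of the kind of test Question 1 asks for, the sketch below runs a one-sided, one-sample t-test in R. The vector `birth_wt` holds made-up values (it is not the assignment data), and the 0.05 significance level is an assumption, since the question does not state one.

```r
# Hypothetical birth weights in grams (made-up values, NOT the assignment data)
birth_wt <- c(2523, 2551, 2557, 2594, 2600, 2622, 2637, 2637, 2663, 2665,
              2722, 2733, 2751, 2750, 2769, 2769, 2778, 2782, 2807, 2821)

# H0: mu = 2500 versus H1: mu > 2500 (one-sided, upper tail)
res <- t.test(birth_wt, mu = 2500, alternative = "greater")

t_obs <- unname(res$statistic)  # observed value of the t statistic
p_val <- res$p.value            # one-sided p-value

# Assumed significance level of 0.05 (not stated in the question)
decision <- if (p_val < 0.05) "reject H0" else "do not reject H0"

cat("t =", round(t_obs, 3), " p-value =", signif(p_val, 3), " ->", decision, "\n")
```

With the real data, `birth_wt` would be replaced by the BirthWt column read from ADataset.xlsx.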
2. Conduct a hypothesis test to determine whether the mean birth weight (BirthWt) of babies differs
between smokers and non-smokers (Smoke). Include mention of H0 and H1,
the observed value of the test statistic, the p-value, the decision, and a conclusion.
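One possible way to set up the comparison in Question 2 is a two-sample t-test. The sketch below uses a small made-up data frame `dat` (not the assignment data) and R's default Welch version of the test, which does not assume equal variances. Whether a pooled-variance test is more appropriate is a judgment to make from the real data.

```r
# Hypothetical data (made-up values, NOT the assignment data)
dat <- data.frame(
  BirthWt = c(3100, 2950, 3300, 3480, 2800, 3650, 3200, 3050,
              2600, 2450, 2900, 2750, 3000, 2300, 2850, 2700),
  Smoke   = rep(c(0, 1), each = 8)  # 0 = non-smoker, 1 = smoker
)

# H0: mu_nonsmoker = mu_smoker versus H1: the two means differ (two-sided)
# t.test() uses the Welch test by default (var.equal = FALSE)
res <- t.test(BirthWt ~ Smoke, data = dat)

res$statistic  # observed value of the t statistic
res$parameter  # Welch degrees of freedom
res$p.value    # two-sided p-value
res$estimate   # sample mean birth weight in each group
```

Setting `var.equal = TRUE` in the same call would give the pooled-variance version instead.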
3. We would like to predict the birth weight (BirthWt) with a set of independent variables.
(a) Regress BirthWt on Age, MotherWt, Smoke, UterIrr and write down the estimated multiple
linear regression equation.
(b) Interpret the slope coefficients associated with Age and Smoke.
(c) Find and interpret the coefficient of determination.
(d) Conduct a hypothesis test to determine whether or not the model is useful. Include mention of
H0 and H1, the observed value of the test statistic, the p-value, the decision, and a conclusion.
(e) Identify all the independent variables which are significant at the 5% significance level.
(f) Perform backward elimination using the step() function in R with the BIC criterion to obtain
the best subset of independent variables for predicting BirthWt.
(g) Perform the residual diagnostics on the final model.
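The workflow in Question 3 could be sketched in R roughly as follows. The data frame `dat` is simulated purely for illustration: its coefficients and noise level are arbitrary assumptions, not estimates from the assignment data. The one detail worth noting is that `step()` uses AIC by default, and passing `k = log(n)` switches the penalty to BIC.

```r
set.seed(1)  # simulated data for illustration only, NOT the assignment data
n <- 100
dat <- data.frame(
  Age      = round(runif(n, 16, 40)),
  MotherWt = round(rnorm(n, 60, 10)),
  Smoke    = rbinom(n, 1, 0.4),
  UterIrr  = rbinom(n, 1, 0.15)
)
# Arbitrary made-up relationship used only to generate a response
dat$BirthWt <- 2400 + 5 * dat$Age + 8 * dat$MotherWt -
  280 * dat$Smoke - 500 * dat$UterIrr + rnorm(n, 0, 600)

# (a)-(e): full model; summary() reports the coefficients, their t-tests,
# R-squared and the overall F-test
full <- lm(BirthWt ~ Age + MotherWt + Smoke + UterIrr, data = dat)
summary(full)

# (f): backward elimination; k = log(n) makes step() penalise by BIC rather than AIC
final <- step(full, direction = "backward", k = log(nrow(dat)), trace = 0)
formula(final)

# (g): standard residual plots (residuals vs fitted, normal Q-Q,
# scale-location, residuals vs leverage)
par(mfrow = c(2, 2))
plot(final)
```

Setting `trace = 1` prints each elimination step, which may be useful when the output has to be discussed in the text.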
4. We would like to know if there is any relationship between the mother’s smoking status and having
a pregnancy resulting in a birth weight below 2500 g.
(a) Construct a contingency table that shows the number of women who have just given birth
grouped by low birth weight baby (LowWt) and smoking status (Smoke).
(b) At the 0.05 level of significance, is there evidence of a significant association between smoking
status of the mother and a baby of low birth weight? Include mention of H0 and H1, the
observed value of the test statistic, the p-value, a decision, and a conclusion.
(c) What are the odds of a woman who smoked having a low birth weight baby?
(d) What are the odds of a woman who did not smoke having a low birth weight baby?
(e) What are the odds of a woman who had a low birth weight baby being a non-smoker?
(f) What is the odds ratio for women who did and did not smoke having a low birth weight baby?
(g) Fit a logistic regression of normal birth weight on smoking status. Treat non-smokers as the base
group. Write down the fitted logistic regression equation.
(h) Refer to part (g). What is the odds ratio for women who did and did not smoke? Is it the
same as your calculation in part (f)? Explain why or why not.
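The calculations in Question 4 might be set up in R along the following lines. The counts are made up (they are not the assignment data), and coding "normal birth weight" as `1 - LowWt` is an assumed reading of part (g). Note that `chisq.test()` applies Yates' continuity correction to 2x2 tables by default.

```r
# Hypothetical counts (made-up, NOT the assignment data)
dat <- data.frame(
  Smoke = rep(c(0, 0, 1, 1), times = c(60, 20, 30, 30)),
  LowWt = rep(c(0, 1, 0, 1), times = c(60, 20, 30, 30))
)

# (a) contingency table: rows = Smoke, columns = LowWt
tab <- table(Smoke = dat$Smoke, LowWt = dat$LowWt)
tab

# (b) chi-squared test of independence (Yates' correction is applied by default;
# correct = FALSE would turn it off)
chisq.test(tab)

# (c), (d): odds of a low birth weight baby within each smoking group
odds_smoker    <- tab["1", "1"] / tab["1", "0"]
odds_nonsmoker <- tab["0", "1"] / tab["0", "0"]

# (e): among low birth weight births, odds of the mother being a non-smoker
odds_nonsmoker_given_low <- tab["0", "1"] / tab["1", "1"]

# (f): odds ratio of low birth weight, smokers versus non-smokers
or_low <- odds_smoker / odds_nonsmoker

# (g): logistic regression with normal birth weight as the outcome
# Assumption: "normal birth weight" is coded as 1 - LowWt; Smoke = 0 is the base group
dat$NormalWt <- 1 - dat$LowWt
fit <- glm(NormalWt ~ Smoke, family = binomial, data = dat)
coef(fit)

# (h): odds ratio implied by the fitted slope, shown next to the one from (f)
or_normal <- exp(unname(coef(fit)["Smoke"]))
c(or_low = or_low, or_normal = or_normal)
```

Printing the two odds ratios side by side gives a starting point for the comparison requested in part (h).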
