Date: 2025-04-27

QM2 Assignment 2
Sem 1 – 2025

Brief – Subgroup Analysis
Some weeks ago, you were hired to consult for CityScape, a smart city
consultancy interested in analysing citizen behaviour and
sustainability trends. In your previous analysis, you explored whether
the typical carbon footprint per person falls below the sustainability
benchmark of 25 kg CO₂ per day.

Now, the chief data analyst at CityScape has asked you to extend your analysis by focusing on
the impact of daily transport choices on citizens’ carbon footprints. Your new task is to
complete a technical report that includes the following components:
1. Subgroup Analysis of Carbon Footprint
Calculate and present point and interval estimates (i.e., confidence intervals) for both the
mean and standard deviation of carbon footprint for a set of subgroups of the ‘Mode of
Transport’ variable.
▪ Include an interpretation of these confidence intervals.
▪ Describe any patterns or insights you observe when comparing carbon footprints
across different transport modes (e.g., EV, Public Transport, Bicycle, Walking, Private
Car).
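
As a minimal sketch of the estimates requested in item 1, the R code below computes a point estimate and a 95% confidence interval for the mean (t-based) and for the standard deviation (chi-square-based) within each transport mode. The data frame `city`, its columns `mode` and `footprint`, and the simulated values are assumptions made for illustration only; they stand in for the Assignment 1 dataset, whose actual column names are not given in this brief.

```r
# Simulated stand-in data (hypothetical); replace with the Assignment 1 dataset.
set.seed(1)
modes <- c("EV", "Public Transport", "Bicycle", "Walking", "Private Car")
city <- data.frame(
  mode = factor(rep(modes, each = 40), levels = modes),
  footprint = rnorm(200, mean = rep(c(18, 15, 8, 7, 30), each = 40), sd = 4)
)

# Point and interval estimates for the mean and SD of one subgroup.
ci_summary <- function(x, conf = 0.95) {
  n <- length(x)
  a <- 1 - conf
  m <- mean(x)
  s <- sd(x)
  t_crit <- qt(1 - a / 2, df = n - 1)
  c(
    n = n,
    mean = m,
    mean_lo = m - t_crit * s / sqrt(n),
    mean_hi = m + t_crit * s / sqrt(n),
    sd = s,
    # Chi-square interval for the variance, square-rooted to give the SD.
    sd_lo = sqrt((n - 1) * s^2 / qchisq(1 - a / 2, df = n - 1)),
    sd_hi = sqrt((n - 1) * s^2 / qchisq(a / 2, df = n - 1))
  )
}

# One row per transport mode.
by_mode <- t(sapply(split(city$footprint, city$mode), ci_summary))
print(round(by_mode, 2))
```

The chi-square interval for the standard deviation relies on approximate normality within each subgroup more heavily than the t interval for the mean does, so it may be worth stating that assumption when interpreting it.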

2. Does Average Carbon Footprint Statistically Differ by Transport Mode?
Conduct a hypothesis test to determine whether there are statistically significant
differences in the mean carbon footprint across the different transport modes.
Specifically:
• Is there a difference in mean carbon footprint between groups defined by mode of
transport (e.g., EV, Public Transport, Bicycle, Walking, Private Car)?
o Conduct a parametric hypothesis test¹ to evaluate whether mean carbon
footprint differs across these groups.
• If the result above is statistically significant, follow up with pairwise comparisons:
o Conduct a series of independent two-sample t-tests² comparing carbon
footprints across transport mode pairs.
o Apply a Bonferroni correction (see below) to control for Type I error due
to multiple comparisons.
Be sure to report assumptions, interpret results clearly, and explain the implications for
urban sustainability strategy.

¹ The Chief Data Scientist has reminded you that the ANOVA-style test is robust to departures from normality, especially
where the sample size is this large. So, you do not have to assess normality in this report. However, you will need to
determine if an ANOVA F-test or the Welch F-test is more appropriate. To do so you may assess the relevant
sub-sample statistics as well as a Levene's test (see tutorial 6) and explain your conclusion.
² You may carry through whatever assumptions you have made in footnote 1 above.
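
The choice between the ANOVA F-test and the Welch F-test described above can be sketched in base R as follows. This is an illustration under assumptions, not the tutorial's method: the data frame `city` and its columns `mode` and `footprint` are hypothetical and simulated, and Levene's test is computed by hand as a one-way ANOVA on absolute deviations from each group median (the median-centred variant). The `leveneTest` function in the `car` package is a commonly used alternative.

```r
# Simulated stand-in data (hypothetical) with deliberately unequal spreads.
set.seed(2)
modes <- c("EV", "Public Transport", "Bicycle", "Walking", "Private Car")
city <- data.frame(
  mode = factor(rep(modes, each = 40), levels = modes),
  footprint = rnorm(200,
                    mean = rep(c(18, 15, 8, 7, 30), each = 40),
                    sd = rep(c(3, 4, 2, 2, 8), each = 40))
)

# Sub-sample statistics: group sizes and standard deviations.
print(table(city$mode))
print(round(tapply(city$footprint, city$mode, sd), 2))

# Levene's test (median-centred): ANOVA on absolute deviations from group medians.
abs_dev <- abs(city$footprint - ave(city$footprint, city$mode, FUN = median))
levene <- anova(lm(abs_dev ~ city$mode))
levene_p <- levene[["Pr(>F)"]][1]

# var.equal = TRUE gives the classic ANOVA F-test; FALSE gives the Welch F-test.
equal_var <- levene_p >= 0.05
omnibus <- oneway.test(footprint ~ mode, data = city, var.equal = equal_var)
print(omnibus)
```

Letting a single Levene p-value drive the choice is a simplification; the sub-sample standard deviations and group sizes printed above are meant to be weighed alongside it when explaining the conclusion.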

The Bonferroni correction
The Bonferroni correction is a method used to adjust the significance level (alpha level) of
statistical tests when multiple comparisons are made simultaneously.
In situations where you're conducting multiple hypothesis tests simultaneously, there's an
increased chance of obtaining at least one false positive result (Type I error) simply due to
chance. The Bonferroni correction adjusts the threshold for statistical significance to account for
this increased risk.
It divides the desired significance level by the number of comparisons to reduce the chance of
false positives. For example, if you're conducting 5 tests and want a 0.05 overall Type I error rate,
each test's significance level would be adjusted to 0.05/5 = 0.01 (so alpha is now 0.01 after the
adjustment, and you can compare your p-value to this directly). It is a conservative method but
helps control the overall Type I error rate.
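
The adjustment described above can be sketched in R for the five transport modes, which give choose(5, 2) = 10 pairwise comparisons and therefore an adjusted alpha of 0.05/10 = 0.005. The data frame `city`, its columns `mode` and `footprint`, and the simulated values are hypothetical. Using `pool.sd = FALSE` runs Welch two-sample t-tests, which is an assumption of this sketch; the variance assumption carried over from the omnibus test may call for a different setting.

```r
# Simulated stand-in data (hypothetical); replace with the Assignment 1 dataset.
set.seed(3)
modes <- c("EV", "Public Transport", "Bicycle", "Walking", "Private Car")
city <- data.frame(
  mode = factor(rep(modes, each = 40), levels = modes),
  footprint = rnorm(200, mean = rep(c(18, 15, 8, 7, 30), each = 40), sd = 4)
)

k <- nlevels(city$mode)   # number of groups
m <- choose(k, 2)         # number of pairwise t-tests
alpha_adj <- 0.05 / m     # Bonferroni-adjusted significance level

# Option 1: unadjusted p-values compared with the adjusted alpha.
raw <- pairwise.t.test(city$footprint, city$mode,
                       p.adjust.method = "none", pool.sd = FALSE)
reject_raw <- raw$p.value < alpha_adj

# Option 2: Bonferroni-adjusted p-values compared with the original 0.05.
adj <- pairwise.t.test(city$footprint, city$mode,
                       p.adjust.method = "bonferroni", pool.sd = FALSE)
reject_adj <- adj$p.value < 0.05

print(signif(raw$p.value, 3))
print(reject_raw)
```

Both options lead to the same decisions; the brief's wording (divide 0.05 by the number of t-tests) corresponds to option 1.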

PLEASE READ THE FOLLOWING CAREFULLY.
You will need to re-form your groups for Assignment 2.
Further instructions are provided on the CANVAS page.
In this assignment:
• Make sure you read the whole assignment.
• Use the same dataset provided in Assignment 1.
• Use a level of significance of 0.05 for all calculations. In
applying the Bonferroni correction above, start with a
significance level of 0.05 and divide this value by the
number of t-tests you are conducting.
• You do not need to manually calculate statistics in this assignment. Do all calculations in
R using relevant packages and functions. For hypothesis tests, please use the p-value
approach.
• State assumptions and list all necessary steps and decision rules. Present analysis in
language suitable for a technical audience (the chief data scientist) but also provide a
non-technical summary of results.
• Make sure to present clean tables and format text neatly in keeping with what would be
suitable for your target audience (in this case your consultancy client).
• When deciding what to do, remember that this is an assessment, and you are encouraged
to show off all your relevant learnings from topics introduced in this course. Use R for all
calculations and provide all code used in your appendix.
Your task will be evaluated on a scale of TEN points, with an equal distribution of weight for data
setup, preliminary analysis, test selection, test execution, and overall presentation.
The TOTAL WORD LIMIT for your answers is 1000 WORDS excluding tables and
graphs. You are not required to use up the whole word limit. You may include any
code used in the Appendix. There is no word limit for the Appendix or Bibliography.
Feel free to ask clarifying questions on EdDiscussion.
GOOD LUCK!
