Final Project Part 3
Final Data Analysis Report
Due: December 6, 2023, by 6:00PM ET on Quercus
Latest Acceptance: December 13, 2023, by 6:00PM ET
Goal of the Assessment:
• Showcase your group’s knowledge of the correct usage of methods and techniques from the
course.
• Write a report that outlines the research question, the plan for answering it, the results of
the analysis, and the conclusion and limitations of the model.
• Experience the process of conducting a complete linear regression analysis on real data.
• Think about the ethical responsibility of a statistician in the context of conducting a data
analysis.
Learning Outcomes being Assessed:
• Conduct a complete analysis starting from a research question to a model that reasonably
answers said research question.
• Communicate the motivation, process, and results of an analysis in a written report.
• Report the results of a residual plot analysis and recommend a course of action.
• Defend the decision to apply a transformation, the choice of transformation, and whether the
violated assumption(s) is(are) adequately corrected.
• Defend the choice of "best" model for a given dataset and research question.
• Critique hypothesis test/confidence interval conclusions from an analysis as it pertains to
violated assumptions.
• Appropriately discuss the ethical considerations of opting in/out of using automated
selection methods.
General Instructions:
Using only methods and techniques presented in the lecture slides throughout the term, you
and your group are tasked with answering your proposed research question by creating the
‘best’ linear regression model that meets the requirements of your research question. You will
then need to write a report (details below) that:
(i) introduces your research question, presents some background, and
contextualizes your data analysis relative to the background information,
(ii) outlines the steps in your analysis that you will follow to reach the ‘best’ model,
(iii) presents the results of your analysis and describes and justifies the decisions you
have made, and finally,
(iv) discusses the final model, its interpretation, and its limitations in terms of its
ability to meet your research goals. It should be made clear whether you are
aiming for a model that makes good predictions (and so a more complicated
model may be appropriate), or a model that is more descriptive and easier to
interpret, or some combination of both.
The feedback and work you have put into Part 1 of the final project should provide you with a
good beginning to your introduction section as you’ve already begun contextualizing your
research question. You may want to consider adding some additional background research or
more discussion about how your research question is important and different from the
background you present. The table of numerical summaries from Part 1 should also be helpful
in writing the beginning of the results section, where it is necessary to display the
characteristics of the data you will use to answer your question.
The feedback and work you have put into Part 2 of the final project should help you structure
the methods section of your report, where you will outline the process you followed and the
tools and methods you used to answer your research question. The
feedback should also help you with how you approach the data analysis itself.
How to present your final report:
Once your group has decided upon the ‘best’ model to fulfill the goal of the project, you must
write up a short scientific report. There should be six main sections of your report:
• Contributions: where each group member’s name is listed and a description of their
contributions to the final project is outlined (this does not count towards the word limit)
• Introduction section: where you introduce the purpose and relevance/importance of
the project, provide some relevant background information on the topic, and discuss
how your analysis differs from the background literature (no results or data should be
presented here).
• Methods section: where you describe and explain the methods, tools and techniques
used to arrive at your final model (no results or data should be presented here, but you
can tell us where you found your data and what variables it contains).
• Results section: where you present a numerical/graphical description of your study
sample and important results that led you to make crucial decisions in building your
model (following the methods you outline in the earlier section), followed by the final
model and any other important results.
• Discussion section: where you interpret your final model and describe why it answers
the research question and why it is important, as well as discuss any limitations that still
exist based on your results.
• Ethics discussion: If you chose to use automated selection methods in your analysis,
explain why you did not use manual selection methods. If you chose to use manual
selection methods, explain why you did not use automated selection methods. Were
the two methods ethically the same, and you chose one for purely practical reasons? (If
so, make sure to explain why you think they are ethically the same.) Or did you think
that one was more ethical than the other (e.g., because using the other would have
been negligent or reckless)? In either case, use some of the material discussed in the
second ethics module to defend your answer.
You may use tables and plots to help present your results, but they must be relevant and well-
thought-out to convey as much information as possible without being too overwhelming or
confusing. When explaining your methods, try to avoid just stating that you used a specific
method, but add an explanation for how it is used to achieve a specific task. When presenting
your results, avoid repeating exactly what you wrote in your methods section. Instead, focus on
the important results of the process you described earlier, and use numerical values/graphical
results to support the decisions you made in arriving at your final model. See the rubric for
more information regarding the various report components.
Technical Requirements of the Final Report:
Your report should be typed using whatever software you prefer but must be saved and
submitted as a PDF file on Quercus. Your report must meet the following requirements:
• Font: 12-point font in a style like Times New Roman (this is the default in R Markdown)
• Spacing: single-spaced
• Word count: up to a maximum of 2000 words in total for Introduction + Methods +
Results + Discussion (this does not include captions on figures and tables, or
contributions). This is a strict word limit.
o The ethics discussion should be between 200 and 250 words (separate from the
word count above).
• Number of tables/figures (combined) in the main report: 5 in total, but you may use
any combination of tables and figures, and figures may include multiple plots that share
something in common.
• Figure and table captions: all figures and tables included should have a caption that
describes what is being presented (caption not included in the word count).
o Captions should not contain information that is not also discussed in the main
report.
• Figure properties:
o All plots should have an appropriate title and axis labels, avoiding the use of
variable names as they appear in the dataset.
o A figure may include multiple individual plots, but they should be related to each
other and make sense as to why they are being presented together.
▪ Avoid having too many plots in the same figure to ensure that they are
legible and clear.
• Reference list or bibliography at the end of the report (will not count towards word
count), using appropriate citation style.
• Appendix: you may add an appendix at the end of your report to include some
additional tables or figures that were not important enough to be part of the main
report, but still relevant to your analysis:
o up to 3 additional tables/figures but they should only be included if they are
relevant to the analysis and are referred to in the main text.
• R code: In a separate file (i.e., RMD file), you should upload your cleaned and complete
version of the R code that was used to conduct your analysis. The R code should be well-
organized and commented appropriately to indicate what each line/section of code is
doing.
o Your report should not display any R code or R output. Instead, create your own
tables to display the results of R functions in a more concise way.
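The figure requirements above (descriptive titles and axis labels, and related plots grouped into one figure) can be illustrated with a minimal sketch. The built-in mtcars dataset, the model, and the label wording are all assumptions chosen for illustration; substitute your own data and labels.

```r
# Illustrative only: mtcars and this model are assumptions, not course data.
fit <- lm(mpg ~ wt, data = mtcars)

# Place two related diagnostic plots side by side in a single figure.
old_par <- par(mfrow = c(1, 2))

# Use a descriptive title and axis labels rather than raw variable names.
plot(fitted(fit), resid(fit),
     main = "Residuals versus fitted values",
     xlab = "Fitted fuel efficiency (miles per gallon)",
     ylab = "Residual (miles per gallon)")
abline(h = 0, lty = 2)  # dashed reference line at zero

qqnorm(resid(fit),
       main = "Normal Q-Q plot of residuals",
       xlab = "Theoretical normal quantiles",
       ylab = "Residual quantiles (miles per gallon)")
qqline(resid(fit))

par(old_par)  # restore the previous plotting layout
```

If you are knitting with R Markdown, the caption can be supplied through the fig.cap chunk option, and setting echo=FALSE on the chunk keeps the code itself out of the report.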
Checklist for submitting Final Project Part 3:
1. Your final written report which follows the requirements above.
2. Your R code that shows your complete analysis (this will be used to verify the results
displayed in your written report and will not be assessed for content).
3. Your dataset that is loaded in at the start of your RMD (in .csv file format). If your
dataset is large, you may also store it on the cloud and provide us a shareable link as a
submission comment.
** Note that since the report should be submitted as a PDF, you should try to knit your R
Markdown as a PDF instead of a docx file. If you are writing your report outside of R Markdown,
be sure to convert your final draft to PDF before uploading to Quercus. The rubric will only be
applied to the PDF of your report, so any details that are written in your R Markdown document
will not be graded.
Things to keep in mind while writing your final report:
o You do not need to write out the results of every step you took in your analysis as this
will make your report too long.
   o Instead, focus on summarizing the most important results, especially where a big
   decision was made. You need to justify any big decisions.
   o For the rest of your results, very short mentions of the process with a brief piece
   of evidence provided are enough to allow your reader to follow your analysis and
   understand how you arrived at the final model.
o Rather than presenting the results of each step separately (e.g., creating separate tables
for each), consider putting together one larger table that you can refer to in your
discussion of many steps in your analysis so that you don’t use too much space.
   o For example, if you are selecting between a few different models, you could
   consider presenting a table that includes many different summaries of the fit of
   each model and refer to each part as needed in the text, instead of making
   individual tables for each component.
o Avoid using R output taken directly from R/RStudio. Instead create your own tables
where you select only the relevant pieces of the output to display.
o Generally, the methods and results sections tend to be the longest sections, while the
introduction and discussion tend to be shorter.
   o Keep this in mind when deciding how much background to provide in your
   introduction. Often just a paragraph or two is plenty, given the word limits in this
   project.
   o However, make sure you leave yourself enough space for a solid discussion
   where you can discuss the impact of the limitations that may exist in your model.
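As a rough sketch of the single comparison table suggested above, the code below gathers a few fit summaries for several candidate models into one data frame instead of pasting raw R output. The mtcars dataset, the three candidate models, and the choice of summaries (adjusted R-squared, AIC, BIC) are assumptions for illustration only; use your own models and only the summaries covered in lecture.

```r
# Illustrative only: mtcars and these candidate models are assumptions,
# standing in for your own data and models.
candidates <- list(
  "Weight only" = lm(mpg ~ wt, data = mtcars),
  "Weight + horsepower" = lm(mpg ~ wt + hp, data = mtcars),
  "Weight + horsepower + displacement" = lm(mpg ~ wt + hp + disp, data = mtcars)
)

# Build one table: one row per model, one column per fit summary.
comparison <- data.frame(
  Model = names(candidates),
  Predictors = sapply(candidates, function(m) length(coef(m)) - 1),
  Adj_R2 = sapply(candidates, function(m) round(summary(m)$adj.r.squared, 3)),
  AIC = sapply(candidates, function(m) round(AIC(m), 1)),
  BIC = sapply(candidates, function(m) round(BIC(m), 1)),
  row.names = NULL
)

# When knitting, knitr::kable() renders the data frame as a formatted table
# with a caption; the guard skips this step if knitr is not installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(comparison,
                     caption = "Fit summaries for the candidate models."))
}
```

Each row can then be referred to in the text as needed, which uses one table slot rather than one per model.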
Resources:
o Report structure: this cheat sheet and this outline for reports (you may ignore the
abstract portion since you do not need one) provide examples of what information goes
into each section of a report. Note that not all the elements in these resources need to
be included in your report. But you can use these to better understand how to structure
your submission.
o Reference formatting: You may follow APA citation style to help format your
references. For some resources on how to cite, see the library page on citations. At
minimum, you should reference the 3 papers you discussed in Part 1 of the Final Project.
o Adding captions and other plotting features in RMarkdown
o Exporting plots in RStudio
o Workshops: we will be holding two workshops to help you prepare your final report:
one on how to write your first draft and format your report, and one on how to edit
your report to satisfy the word limit requirement. These will be held during class time
after the reading break.
o Writing Centre on campus (book this early if you wish to use it)