STAT8721 Data Science GE
Essay
Due: 23:55, Friday 31st May, 2024 (Week 12)
_______________________________________________
PART A - Exploratory Data Analysis
Exploratory Data Analysis (EDA) is often performed when a set of data is first received. EDA is an
approach to exploring data and gaining insights from it. Simple graphical and non-graphical
tools are used. For example, histograms, bar charts, boxplots and scatterplots are
commonly used to summarize and visualize the data. Numerical quantities such as the mean,
median, mode and standard deviation are often calculated. Many tools and methods may be
used in EDA. In this Essay, we will limit ourselves to the tools and methods that we have learned in
this topic.
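As a minimal sketch of the non-graphical side of EDA described above, the following Python snippet computes the mean, median, mode and sample standard deviation using only the standard library. Python is an assumption here (the brief does not prescribe a tool for Part A), the variable names are illustrative, and the values are simply the nine heights from Table 1 in Part B, reused as sample data.

```python
import statistics

# Heights (cm) copied from Table 1 in Part B; used here only as sample data.
heights_cm = [171, 178, 157, 163, 172, 183, 173, 175, 173]

mean_height = statistics.mean(heights_cm)      # arithmetic average
median_height = statistics.median(heights_cm)  # middle value of the sorted data
mode_height = statistics.mode(heights_cm)      # most frequently occurring value
sd_height = statistics.stdev(heights_cm)       # sample standard deviation (divides by n - 1)

print(f"mean   = {mean_height:.2f} cm")
print(f"median = {median_height} cm")
print(f"mode   = {mode_height} cm")
print(f"sd     = {sd_height:.2f} cm")
```

Comparing the mean with the median gives a quick first indication of skewness before any graph is drawn.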
Essay (Part A): (maximum 500 words)
Write a short essay on EDA. The essay should include the following 3 parts.
1) A short paragraph on EDA. The context must be relevant to the syllabus of this topic. (Tips: you
can explain e.g. What is EDA? Why EDA? How is it used? What insights does it bring? etc.)
2) 2 examples of real-world applications of EDA using graphical tools. You must include your own
interpretations. (see Tips below)
3) References: in addition to including the EDA graphs of the 2 real-world examples
above, you must give details about the source of the examples (see Tips below), so that
readers can find the examples if they want to. You must also cite the source of any
information that you use or quote in your writing. Harvard
referencing style should be used.
Tips:
• The 2 EDA examples: you can use examples found on the
internet or other sources. The 2 examples should be from
different sources/categories.
o E.g.: your water/electricity/gas etc. bill may have a chart
showing your usage (Is it a bar chart or a histogram? Is it
appropriate? What insights does it give?)
− You can include at most one bill (of all bills)
o E.g.: bar chart showing unemployment rate in here
o E.g. in journal publications (see Fig 1-3 in COVID EDA)
• For each of the 2 examples, you should
o In the body of the essay: include a copy of the graph (e.g. bar chart, histogram etc)
o identify the type of the graph (e.g. bar graph etc.); justify whether the use of the tool in
each example is appropriate (e.g. categorical data, numerical data etc.)
o provide your interpretation and discussion.
o reference the source of your examples using Harvard referencing style if material is
published online or printed. For water/electricity/gas bills etc, attach the original bill as an
appendix to the essay.
___________________________________________________
PART B - Simple Linear Regression
Data were collected from a random sample of nine people. For each person, the height (in
cm) and metacarpal bone length (in mm) were measured. The aim is to establish a linear
regression model for the prediction of human height from metacarpal bone length. Table 1
shows the data. It is given that the 2 variables are linearly related and linear regression is
appropriate without transforming the data. Metacarpal bones are visualized in Figure 1.
Explore the data and establish the linear regression model (i.e. find the linear regression
equation that describes the relationship).
Table 1. Height and metacarpal bone length in the sample of nine people
Metacarpal bone length (mm)    Height (cm)
45                             171
51                             178
39                             157
41                             163
48                             172
49                             183
46                             173
43                             175
47                             173
Figure 1. Illustration of metacarpal bones. (‘Metacarpal bones’, 2023)
References
‘Metacarpal bones’ (2023) Wikipedia. Available at: https://en.wikipedia.org/wiki/Metacarpal_bones
(Accessed: 18 June 2023).
Essay (Part B): (maximum 300 words)
Write a report on the above Simple Linear Regression study. You must include all relevant plots,
tables and figures, each with an appropriate caption. The emphasis in PART B is on
good practice in performing simple linear regression in Data Science as well as on communicating
results and findings effectively. The report should include the following sections.
o Aim (of the study)
o Introduction
o Exploratory Data Analysis (EDA)
e.g. scatterplot (and its interpretation), correlation coefficient and its interpretation
− From the above EDA, is it justified to perform the linear regression procedure?
− If yes, perform the linear regression procedure below
o Performing the Linear regression
− If appropriate, perform the linear regression procedure and provide the results
(Excel outputs)
− Show scatterplot with the linear regression line
− Write down the linear regression equation
o Interpretation of the linear regression equation
− For every mm increase in metacarpal bone length, what is the change in predicted
height?
− Give example calculation: For a person with metacarpal bone length 42 mm, what is
the predicted height?
o Conclusion: what is the take-home message?
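
The calculations behind the sections listed above can be sketched in a few lines of Python. This is only an illustrative cross-check under stated assumptions, not a substitute for the Excel outputs the report asks for: it assumes the values in Table 1 are paired in the order listed (the i-th bone length with the i-th height), uses the ordinary least-squares formulas, and all variable and function names are hypothetical.

```python
import math

# Data from Table 1. Assumption: values are paired in the order listed.
length_mm = [45, 51, 39, 41, 48, 49, 46, 43, 47]
height_cm = [171, 178, 157, 163, 172, 183, 173, 175, 173]

n = len(length_mm)
mean_x = sum(length_mm) / n
mean_y = sum(height_cm) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - mean_x) ** 2 for x in length_mm)
s_yy = sum((y - mean_y) ** 2 for y in height_cm)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(length_mm, height_cm))

# Pearson correlation coefficient (EDA step).
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares estimates for: height = intercept + slope * length.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict_height(bone_length_mm):
    """Predicted height (cm) for a given metacarpal bone length (mm)."""
    return intercept + slope * bone_length_mm


predicted_42 = predict_height(42)

print(f"r = {r:.3f}")
print(f"height = {intercept:.3f} + {slope:.3f} * length")
print(f"predicted height at 42 mm = {predicted_42:.2f} cm")
```

The slope is the change in predicted height (cm) for every 1 mm increase in bone length. In Excel, the built-in functions CORREL, SLOPE and INTERCEPT should return the same three quantities, which gives a simple way to check the spreadsheet output.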

*** end ***
