Date: 2023-12-21
BUSI1125 Softwares and Tools for Data Analytics
INDIVIDUAL ASSIGNMENT
Autumn 2023/24
This individual assignment carries 100% of the total marks of this module.
Students are required to download 2 different datasets, and analyse each dataset using a
randomly assigned data analytics software.
Dataset 1 (poverty): Eradicating extreme poverty for all people everywhere by 2030 is a
pivotal goal of the 2030 Agenda for Sustainable Development. It has been recognised that
ending poverty must go hand-in-hand with strategies that build economic growth and address
a range of social needs including education, health, social protection, and job opportunities,
while tackling climate change and environmental protection. As a data analyst, your objective
is to conduct an exploratory analysis to better understand the relationships/associations
between the level of income (outcome) and the selected socio-economic factors (features).
Dataset 1, extracted from the World Bank Development Indicators, includes the following
variables for 151 countries.
Variable Name        Description
country              Name of the country
region               Region of the country
comp_edu             Compulsory education, duration (years)
female_labour        Ratio of female to male labour force participation rate (%)
agri_value_added     Agriculture, forestry, and fishing, value added (% of GDP)
political_stability  Political Stability and Absence of Violence/Terrorism: Estimated index
income_group         Income group classification by the World Bank based on gross national income (GNI) per capita (High income, Upper-middle income, Lower-middle income, Low income)
Dataset 1 is available on the module Moodle page or can be downloaded directly from:
https://raw.githubusercontent.com/mmchit/poverty/main/poverty.csv
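For Dataset 1 the outcome (income_group) is categorical, so one possible first look at an association is to compare a numeric feature across the income groups. The sketch below is only an illustration under stated assumptions: it uses the Python standard library, the helper name median_by_group is hypothetical, and the rows are invented placeholders rather than values from poverty.csv.

```python
import statistics
from collections import defaultdict

# Hypothetical rows that reuse Dataset 1 column names. The numbers are
# invented for illustration and are NOT values from poverty.csv.
rows = [
    {"income_group": "High income", "agri_value_added": 1.0},
    {"income_group": "High income", "agri_value_added": 3.0},
    {"income_group": "Low income", "agri_value_added": 20.0},
    {"income_group": "Low income", "agri_value_added": 30.0},
    {"income_group": "Low income", "agri_value_added": 40.0},
]


def median_by_group(records, group_key, value_key):
    """Return the median of value_key within each level of group_key."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[group_key]].append(record[value_key])
    return {level: statistics.median(values) for level, values in grouped.items()}


medians = median_by_group(rows, "income_group", "agri_value_added")
print(medians)
```

A grouped median is used here because it is less sensitive to extreme countries than a mean; a box plot per income group would be the natural visual counterpart.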
Dataset 2 (wage): One of the other UN Sustainable Development Goals is about promoting
inclusive and sustainable economic growth, employment and decent work for all (Decent work
and Economic Growth). Decent work means opportunities for everyone to get work that is
productive and delivers a fair income, security in the workplace and social protection for
families, better prospects for personal development and social integration. As a data analyst,
your objective is to conduct an exploratory analysis to better understand the
relationships/associations between the individual’s wage (outcome) and the selected
demographic factors (features).
Dataset 2, extracted from The United States National Longitudinal Surveys, includes the
following variables for 935 individuals.
Variable Name  Description
wage           Average weekly earnings (in US$)
hours          Average weekly working hours
exper          Years of working experience
age            Age in years
marital        Marital status (Married, Single)
gender         Gender (Male, Female)
education      Level of education (High School, College, Graduate, Post-Graduate)
Dataset 2 is available on the module Moodle page or can be downloaded directly from:
https://raw.githubusercontent.com/mmchit/wage/main/wage.csv
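For Dataset 2 the outcome (wage) is numeric, so the association with a numeric feature such as exper could be summarised with a Pearson correlation coefficient. The following is a minimal sketch, not a prescribed method: pearson_r is a hypothetical helper written with the standard library, and the two short lists are invented values, not observations from wage.csv.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented illustrative values (NOT taken from wage.csv).
exper = [1, 2, 3, 4, 5]
wage = [300, 350, 400, 450, 500]
r = pearson_r(exper, wage)
print(round(r, 3))
```

A coefficient only captures linear association, so it would normally be read alongside a scatter plot of the same two variables.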
Assignment requirements
Students are required to import each dataset and analyse it with the assigned software (R or
Python). For descriptive and exploratory analytics and interpretations, students are required
to:
1. check data quality issues (missing values, data entry errors, inconsistencies, etc.),
perform necessary data cleansing, and briefly explain your data cleaning strategy.
2. identify the type of each variable and provide appropriate summary statistics (all measures of
location and dispersion, and frequencies) for each variable, with appropriate
visualisations and interpretations.
3. identify the objectives of analytics based on the given dataset and scenario and identify
the relevant/appropriate relationships/associations between the outcome and feature
variables, conduct exploratory analysis with appropriate visualisations, and present
and interpret the analyses (based on DIKW pyramid).
4. write up a data analytics report with clear and effective communication.
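Requirements 1 and 2 above can be sketched in code. The example below is an illustration only, written with the Python standard library so that it runs without extra packages; the helper names (load_rows, count_missing, category_levels, numeric_summary) are hypothetical, and the sample rows are invented to show typical problems (a missing value, inconsistent labels, an impossible entry) rather than copied from either dataset.

```python
import csv
import io
import statistics
from collections import Counter

# Invented sample that reuses Dataset 2 column names. These rows are NOT
# from wage.csv; they exist only to demonstrate the checks.
SAMPLE_CSV = """wage,hours,gender
500,40,Male
600,45,male
,40,Female
700,-5,Female
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries (one per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def count_missing(rows):
    """Count empty cells per column (columns with none are omitted)."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1
    return dict(missing)


def category_levels(rows, column):
    """Frequency of each raw label, which exposes inconsistent spellings."""
    return Counter(row[column] for row in rows if row[column].strip())


def numeric_summary(rows, column):
    """Measures of location and dispersion for the non-empty values."""
    values = [float(row[column]) for row in rows if row[column].strip()]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


rows = load_rows(SAMPLE_CSV)
print(count_missing(rows))              # which columns have empty cells
print(category_levels(rows, "gender"))  # 'Male' and 'male' appear separately
print(numeric_summary(rows, "hours"))   # a negative minimum flags an entry error
```

Each check only reports a problem; how to resolve it (for example, harmonising labels or excluding impossible values) is the cleaning strategy the report is expected to explain.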
The 1500-word assignment should include the following two sub-sections.
• Section 1: Report of descriptive and exploratory analytics of Dataset 1 using the
assigned software with appropriate visualisations, and interpretations (around 750
words)
• Section 2: Report of descriptive and exploratory analytics of Dataset 2 using the
assigned software with appropriate visualisations, and interpretations (around 750
words)
Students are also required to submit R-scripts and Jupyter Notebook files via the Moodle
submission box.
Deadline Date for Submission of Coursework
Your coursework needs to be submitted electronically via the Module Moodle page. See the
Student Services website and the programme handbook for further details of this process.
The deadline for coursework submission is 3:30pm on Wednesday, 27th of December
2023. Late submission will attract a mark deduction penalty unless an extension has been
approved by Student Services. Please familiarise yourself with the extenuating circumstances
policy and process for submitting a claim.
Five marks will be deducted for each working day (or part thereof) if coursework is submitted
after the official deadline without an extension having been obtained. Except in exceptional
circumstances, late submission penalties will apply automatically unless a claim for
extenuating circumstances is made before the assessment deadline.
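The late-submission rule above is simple arithmetic and can be illustrated as follows. This is a sketch under assumptions: the function name late_deduction is hypothetical, the number of working days late is supplied by the caller (working out working days from calendar dates, weekends, and closure days is not attempted), and approved extensions are outside its scope.

```python
import math

MARKS_PER_WORKING_DAY = 5  # stated in the brief


def late_deduction(working_days_late):
    """Marks deducted for a late submission without an extension.

    A part of a working day counts as a full day ("or part thereof"),
    so the lateness is rounded up to the next whole day.
    """
    if working_days_late <= 0:
        return 0
    return MARKS_PER_WORKING_DAY * math.ceil(working_days_late)


print(late_deduction(0.25))  # submitted a few hours late
print(late_deduction(1.5))   # submitted during the second working day
```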
Coursework Submission Requirements:
• The maximum word count for the assignment is 1500 words and must be adhered to.
The penalty for exceeding this limit is a 5-mark deduction for exceeding it by up to 300
words, a 10-mark deduction for exceeding it by 301 to 500 words, and a 15-mark
deduction for exceeding it by 501 words or more.
• The actual word count of the assignment must be stated by the student on the first
page (cover sheet) of the assignment.
• The overall word count does include citations and quotations.
• The overall word count does not include the references or bibliography at the
end of the coursework.
• The word count does not include figures and tables with numeric values, or the titles
of figures and tables. Any statement, interpretation, or explanation presented in
a figure or in tabular form will be included in the overall word count.
• Appendices (mostly supporting materials that are not directly related to the assignment
and will not be considered in marking) are not included in the overall word count.
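The word-count penalty bands listed above can be expressed as a small function. This is an illustrative sketch only: word_count_penalty is a hypothetical name, and it assumes that an excess of exactly 501 words falls in the highest band, which the brief does not state explicitly.

```python
WORD_LIMIT = 1500  # maximum word count stated in the brief


def word_count_penalty(actual_words, limit=WORD_LIMIT):
    """Marks deducted for exceeding the word limit.

    Bands follow the brief: up to 300 words over costs 5 marks,
    301 to 500 over costs 10 marks, anything beyond that costs 15.
    """
    excess = actual_words - limit
    if excess <= 0:
        return 0
    if excess <= 300:
        return 5
    if excess <= 500:
        return 10
    return 15  # assumption: an excess of exactly 501 is treated as this band


print(word_count_penalty(1650))  # 150 words over the limit
print(word_count_penalty(2100))  # 600 words over the limit
```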
Students should prepare and submit their coursework assessments via Moodle in
the following format:
Font: Verdana 11 point
Spacing: 1.5 spaced
Margins: Normal (2.5 cm)
Referencing: Harvard citation style
Plagiarism will not be tolerated. Please consult the Business School Undergraduate Student
Handbook for more guidelines on how to present and submit your essays. It is the strong
advice of the Business School that you should avoid plagiarism by engaging in ethical and
professional academic practice.
In accordance with the University’s Quality Manual, in normal circumstances, marked
coursework and associated feedback will be returned to you within 15 working days of the
published submission deadline. Therefore, students submitting work before the published
deadline should not have an expectation that early submission will result in earlier return of
work. Where coursework will not be returned within 15 working days for good reason (for
example in circumstances where a student has been granted an extension, illness of module
convenor, or lengthy pieces of coursework), students will be informed of the timescale for the
return of the coursework and associated feedback.
Additional circumstances where coursework may not be returned within 15 working days for
good reason can include the University closure dates. Therefore, where this applies, you will
be informed in advance of the date coursework feedback will be provided to you.
Assignment Marking Rubrics
Columns: Section, Weight, Trait, and six performance levels.
Mark allocated in Turnitin per level: Unsatisfactory 25; Adequate 45; Satisfactory 55; Good 65; Excellent 80; Outstanding 100.
Section: Data quality and cleansing (20%)
Trait (weight 10%): Identification of the data quality issues (missing values, data entry errors, inconsistencies).
• Unsatisfactory: Inadequate or weak identification of the data quality issues.
• Adequate: Adequate identification of the data quality issues but some limitations.
• Satisfactory: Satisfactory identification of data quality issues.
• Good: Good identification of the data quality issues with appropriate explanations.
• Excellent: Excellent identification of the data quality issues with clear explanations.
• Outstanding: Outstanding analysis, identification, and explanation of the data quality issues.
Trait (weight 10%): Cleansing of dataset.
• Unsatisfactory: Inadequate or weak data cleansing; incomplete/inappropriate data cleansing strategies.
• Adequate: Adequate data cleansing; but limited discussion of appropriate data cleansing strategies.
• Satisfactory: Satisfactory data cleansing with discussion of appropriate data cleansing strategies.
• Good: Good data cleansing with discussion of appropriate and clear data cleansing strategies.
• Excellent: Excellent data cleansing and discussion of useful and relevant data cleansing strategies including relevant data transformation.
• Outstanding: Outstanding and perfect data cleansing supported by clear discussion of appropriate and relevant data cleansing strategies including meaningful data transformation.
Section: Visualisations and presentations of summary statistics (35%)
Trait (weight 5%): Identifies types of variables.
• Unsatisfactory: Inadequate or weak identification of types of variables in the datasets.
• Adequate: Adequate identification of types of variables in the datasets.
• Satisfactory: Satisfactory identification of types of variables in the datasets.
• Good: Good identification of types of variables in the datasets with some meaningful discussion based on the assignment requirements.
• Excellent: Excellent identification of types of variables in the datasets with excellent understanding of the outcome and features based on the assignment requirements.
• Outstanding: Outstanding identification of types of variables in the different datasets with outstanding comprehension of the outcome and features based on the assignment requirements.
Trait (weight 15%): Provides summary statistics.
• Unsatisfactory: Inadequate or inappropriate summary statistics; important dimensions of summary statistics are missing.
• Adequate: Adequate summary statistics that summarise most information provided in the data using appropriate methods.
• Satisfactory: Satisfactory summary statistics that summarise information provided in the data, including measures of location and dispersion or frequencies.
• Good: Good summary statistics that summarise information provided in the data, including meaningful discussion.
• Excellent: Excellent summary statistics that summarise information provided in the data, including meaningful and appropriate discussion and interpretation.
• Outstanding: Outstanding summary statistics that summarise information provided in the data, including clear and meaningful discussion and interpretation.
Trait (weight 15%): Provides visualisations and interpretations of summary statistics.
• Unsatisfactory: Inadequate or weak visualisation of data; does nothing to assist in summarising the data.
• Adequate: Adequate visualisation of summary statistics that communicates pertinent information.
• Satisfactory: Satisfactory visualisation of summary statistics that clearly communicates pertinent information.
• Good: Good visualisation of summary statistics that clearly communicates pertinent information.
• Excellent: Excellent visualisation of summary statistics that extract the relevant information and efficiently communicates and interprets pertinent information of data.
• Outstanding: Outstanding visualisation of summary statistics that extract the relevant and insightful information and efficiently communicates and interprets pertinent information of data.
Section: Identify the relationship/association between the variables, conduct exploratory analysis with appropriate visualisations, and present and interpret the analyses (40%)
Trait (weight 10%): Conduct exploratory analysis.
• Unsatisfactory: Inadequate or weak exploratory analysis; few or no patterns and/or insights extracted.
• Adequate: Adequate exploratory analysis that discovers some meaningful patterns in the data.
• Satisfactory: Satisfactory exploratory analysis that discovers patterns and extracts some insights from data.
• Good: Good exploratory analysis that discovers patterns and extracts meaningful insights from data.
• Excellent: Excellent exploratory analysis that discovers relevant patterns and extracts meaningful insights as per assignment requirements.
• Outstanding: Outstanding exploratory analysis that discovers all relevant patterns and extracts useful and important insights and knowledge as per assignment requirements.
Trait (weight 15%): Visualisations of exploratory analyses.
• Unsatisfactory: Inadequate, inappropriate, or weak visualisations.
• Adequate: Adequate visualisations, which convey some insights from data.
• Satisfactory: Satisfactory and relevant visualisations, which convey some insights from data.
• Good: Good and appropriate visualisations, which convey most insights from data.
• Excellent: Excellent visualisations, which convey important insights from data.
• Outstanding: Outstanding visualisations, which convey major insights from data clearly and accurately.
Trait (weight 15%): Interprets exploratory analysis.
• Unsatisfactory: Inadequate or weak interpretation of exploratory analysis.
• Adequate: Adequate interpretation of exploratory analysis that helps extract some knowledge from information.
• Satisfactory: Satisfactory interpretation of exploratory analysis that helps extract useful knowledge from information.
• Good: Good interpretation of exploratory analysis that helps extract useful knowledge from information as per assignment requirements.
• Excellent: Excellent interpretation of exploratory analysis that helps extract useful and appropriate knowledge and information from data as per assignment requirements.
• Outstanding: Outstanding interpretation of exploratory analysis that helps extract useful knowledge and information from data and transform them as wisdom as per assignment requirements.
Section: Presentation and communication (5%)
Trait (weight 5%): Quality of presentation and communication.
• Unsatisfactory: Inadequate presentation and/or unclear communication.
• Adequate: Adequate presentation but communication is not very clear.
• Satisfactory: Satisfactory presentation and communication is relatively clear.
• Good: Good presentation with relatively clear communication that covers major insights.
• Excellent: Excellent presentation with clear communication that covers all insights as per requirements.
• Outstanding: Excellent presentation with clear communication and storytelling that covers all insights as per requirements.
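The rubric gives a weight for each trait and a Turnitin mark for each level. One plausible reading, which is an assumption because the brief does not state the formula, is that the overall mark is the weighted average of the level marks. The sketch below illustrates that reading; overall_mark and the shortened trait labels used as dictionary keys are hypothetical.

```python
# Trait weights (percent) as listed in the rubric; they sum to 100.
WEIGHTS = {
    "Identification of data quality issues": 10,
    "Cleansing of dataset": 10,
    "Identifies types of variables": 5,
    "Provides summary statistics": 15,
    "Visualisations of summary statistics": 15,
    "Conducts exploratory analysis": 10,
    "Visualisations of exploratory analyses": 15,
    "Interprets exploratory analysis": 15,
    "Presentation and communication": 5,
}

# Mark allocated in Turnitin for each level, as listed in the rubric.
LEVEL_MARKS = {
    "Unsatisfactory": 25,
    "Adequate": 45,
    "Satisfactory": 55,
    "Good": 65,
    "Excellent": 80,
    "Outstanding": 100,
}


def overall_mark(levels):
    """Weighted average of level marks (ASSUMED aggregation rule).

    `levels` maps every trait in WEIGHTS to one of the LEVEL_MARKS keys.
    """
    if set(levels) != set(WEIGHTS):
        raise ValueError("a level is required for every rubric trait")
    total = sum(WEIGHTS[trait] * LEVEL_MARKS[levels[trait]] for trait in WEIGHTS)
    return total / 100


# Example: 'Good' on every trait except an 'Excellent' presentation.
example = {trait: "Good" for trait in WEIGHTS}
example["Presentation and communication"] = "Excellent"
print(overall_mark(example))
```

Any late-submission or word-count deductions described earlier in the brief would be applied separately and are not modelled here.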