BSAN2204 - Project Report (A1)
Briefing Notes
Dr. Thomas Magor
Background
This first project report will provide an introduction to the Million Song Dataset (MSD) and
demonstrate your ability to use the basic methods of business analytics introduced in the first half of the
semester (data visualisation, descriptive statistics and a basic linear regression model).
To complete the task, you will need to use R to generate basic summary information about the dataset,
visualisations and a regression model. The report should neatly organise these outputs and provide some
basic interpretations of them. The report should not describe or include any of the R code used (only the output).
The Million Song Dataset (MSD)
The Million Song Dataset (MSD) includes metadata on 1 million songs, including their musical features,
titles, year of release as well as listeners' ratings for each song. To help contextualise your project, you are
encouraged to work with a specific client/business in mind. The specific context you choose is not itself
assessable (there is no right or wrong context per se). What is more important is that you demonstrate an
understanding of the data, and an ability to use the methods of business analytics taught in class with this
data.
Suggested structure for the Project Report (A1)
The Project Report (A1) should introduce the Million Song Dataset (MSD) and provide descriptive
visualisations of the data, descriptive statistics, plus a linear regression model that predicts at least ONE of the
following variables from the MSD (song hotness, artist hotness OR artist familiarity). You should organise
the first Project Report (A1) under the following section headings:
1. Introduction to the Million Song Dataset (MSD)
2. Visualisations
3. Descriptive statistics
4. Regression model
Introduction to the Million Song Dataset (MSD)
The dataset you will use is a subset drawn from a publicly available dataset. You should do some basic
background research on the Internet about the full MSD, compare it to the subset provided on BlackBoard,
and include this information in this section (be sure to cite your sources appropriately!). You should also use
this section to contextualise your project. This part does not need to be overly detailed, but it should give
the reader at least a basic idea of what kind of audience you are writing for.
Visualisations
Teaching Week 3 covers how to visualise data using basic graphs and univariate/bivariate displays. We do
not provide a prescriptive list of “must have” visualisations as this would lead to all the project reports being
cookie cutter copies! Instead, it is up to you to select a carefully curated range of visualisations which you
think best summarise the contents of the dataset and support the analyses you are running.
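
As a rough illustration of the univariate/bivariate displays mentioned above, the following is a minimal sketch in base R. It runs on simulated data, and the column names (year, artist_familiarity, song_hotness) are assumptions made for the example; they are not necessarily the names used in the subset on BlackBoard. The R code is shown here for guidance only; the report itself should contain only the resulting output.

```r
# Simulated stand-in for the MSD subset. Column names are hypothetical
# and may not match the file provided on BlackBoard.
set.seed(1)
n <- 200
songs <- data.frame(
  year = sample(1960:2010, n, replace = TRUE),
  artist_familiarity = runif(n, 0, 1)
)
# Simulated output variable, clamped to the range [0, 1].
songs$song_hotness <- pmin(pmax(
  0.2 + 0.5 * songs$artist_familiarity + rnorm(n, sd = 0.1), 0), 1)

# Univariate display: distribution of the output variable.
hist(songs$song_hotness,
     main = "Distribution of song hotness",
     xlab = "Song hotness")

# Bivariate display: output variable against one candidate input.
plot(songs$artist_familiarity, songs$song_hotness,
     main = "Song hotness vs artist familiarity",
     xlab = "Artist familiarity", ylab = "Song hotness")
```

Which displays you include is still your choice; these two are only examples of the kind of graph that could support a later regression on the same variables.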
Descriptive statistics
Teaching Week 4 covers basic descriptive statistics. Again, there is no prescribed list of compulsory outputs
here, but suffice it to say you should include measures of central tendency for your chosen output
variable as well as basic statistics (counts, proportions) that describe the dataset more generally.
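
One possible way to produce these figures in base R is sketched below, again on simulated data. The columns decade and song_hotness are hypothetical names chosen for the example, not names confirmed for the provided subset.

```r
# Simulated stand-in for the MSD subset. Column names are hypothetical.
set.seed(1)
n <- 200
songs <- data.frame(
  decade = sample(c("1980s", "1990s", "2000s"), n, replace = TRUE),
  song_hotness = runif(n, 0, 1)
)

# Measures of central tendency (plus spread) for the output variable.
# na.rm = TRUE skips missing values, which real data may contain.
central <- c(
  mean   = mean(songs$song_hotness, na.rm = TRUE),
  median = median(songs$song_hotness, na.rm = TRUE),
  sd     = sd(songs$song_hotness, na.rm = TRUE)
)

# Counts and proportions that describe the dataset more generally.
decade_counts <- table(songs$decade)
decade_props  <- prop.table(decade_counts)

print(round(central, 3))
print(decade_counts)
print(round(decade_props, 3))
```

The printed values would then be transferred into a formatted table in the report rather than pasted from the console.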
Regression model
Teaching Week 5 covers linear regression. In this section you must describe (and justify) the specification of a
linear regression model that uses the MSD. This entails listing the input variables (there should be multiple!)
and ONE output variable. The use of a model diagram or equation that summarises your specifications is
encouraged.
Include summarised output from a linear regression model you estimate in a neatly formatted table (do not
simply copy and paste from the R console!). Interpret the model coefficients and consider their possible
implications within your chosen contextualisation.
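
To make the specification step concrete, here is a minimal sketch of one possible model, assuming song hotness as the output variable and three illustrative inputs: song_hotness = b0 + b1 * artist_familiarity + b2 * artist_hotness + b3 * duration + error. The data are simulated, and the column names (including duration) are assumptions for the example rather than the names in the provided subset. The sketch also shows one way to collect the coefficients into a small table that can be reformatted for the report.

```r
# Simulated stand-in for the MSD subset. Column names are hypothetical.
set.seed(1)
n <- 200
songs <- data.frame(
  artist_familiarity = runif(n),
  artist_hotness = runif(n),
  duration = runif(n, 120, 360)
)
songs$song_hotness <- 0.1 + 0.4 * songs$artist_familiarity +
  0.3 * songs$artist_hotness + rnorm(n, sd = 0.1)

# Linear regression with multiple inputs and ONE output variable.
fit <- lm(song_hotness ~ artist_familiarity + artist_hotness + duration,
          data = songs)

# Collect the coefficient summary into a data frame with plain column
# names, ready to be formatted as a table in the written report.
coef_table <- as.data.frame(coef(summary(fit)))
names(coef_table) <- c("Estimate", "Std. error", "t value", "p value")
coef_table <- round(coef_table, 3)

print(coef_table)
cat("R-squared:", round(summary(fit)$r.squared, 3), "\n")
```

Each estimate is read as the expected change in the output variable for a one-unit increase in that input, holding the other inputs constant.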
Submission Guidelines
The first Project Report is due in Week 7 (after the midsemester break). See BlackBoard/the eCP for exact
submission date and time.
The first Project Report is to be submitted as a written report in PDF format. The total length of the report
should not exceed 3000 words. You may use any software to prepare (write) this report. The report must
not include any raw R output or R script.
In Week 6 (before the midsemester break), we will run an analysis workshop in which we focus on
troubleshooting applications of the previous weeks' class content to the MSD.
Marking Criteria
The first Project Report will be marked against a marking rubric which includes banded performance criteria
for each section of the report, plus a criterion relating to the overall professionalism of your report. The
rubric is available on BlackBoard.
The professionalism criterion covers both your use of appropriate language and your use of document
formatting and adherence to the submission guidelines.