FIT1006 Assignment 1
Date: 2024-03-23
Monash University Faculty of Information Technology 1st Semester 2024
FIT1006 Business Information Analysis
Assignment 1: Data Collection and Preliminary Data Analysis
This assignment is worth 18% of your final mark (subject to the hurdles described in the
FIT1006 handbook entry, FIT1006 Moodle preview [or Unit Guide] and links therein).
Among other things (see below), note the need to hit the `Submit’ button (and the possible
requirement of an interview).
Due Date: Thursday 28th March 2024, 11:55 pm
Method of submission: Your submission should consist of 1 file:
1. A text-based .pdf file named as: FamilyName-StudentId-1stSem2024FIT1006Asst1.pdf
The file must be uploaded on the FIT1006 Moodle site by the due date and time.
The text-based .pdf file will undergo a similarity check by Turnitin at the time you submit
to Moodle. If you have any relevant output from Microsoft Excel and/or from SYSTAT then
make sure to include that in the appropriate place(s) in your .pdf file. Please read submission
instructions here and elsewhere carefully regarding the use of Moodle.
Total available marks: 9 + 5 + 16 + 16 + 11 + 16 + 16 + 11 = 100 marks.
Note 1: Please recall the support available at
https://www.monash.edu/student-academic-success, the Academic Integrity rules and the
`Welcome to FIT1006’ post in Ed Discussion.
This is an individual assignment.
In submitting this assignment, you acknowledge both that you are familiar with the relevant
policies, rules and regulations regarding Academic Integrity (including, e.g., doing your own
work, not sharing your work, not using ChatGPT in particular, not using generative AI at all)
and also that you are familiar with the consequences of being deemed to be in contravention
of these policies.
Note 2: And a reminder not to post even part of a proposed partial solution to a forum or
other public location. This includes when you are seeking clarification of a question.
If you seek clarification on an Assignment question then – bearing in mind the above – word
your question very carefully and/or (if necessary) send private e-mail. If you are seeking to
understand a concept better, then try to word your question so that it is a long way removed
from the Assignment. You are reminded that Monash University takes academic integrity
very seriously.
Note 3: As previously advised, it is your responsibility to be familiar with the special
consideration policies and the process for applying, as well as with academic integrity.
Note 4: As a general rule, don’t just give a number or an answer like `Yes’ or `No’ without
clear and sufficient explanation; otherwise, you risk being awarded 0 marks for the relevant
exercise. Make it easy for the person/people marking your work to follow your reasoning.
Re-iterating a point above, for each and every question, sub-question and exercise, clearly
explain your answer and clearly show any working.
Note 5: All of your submitted work should be in machine readable form, and none of your
submitted work should be hand-written.
Note 6: If you wish for your work to be marked and not to accrue (possibly considerable)
late penalties, then make sure to upload the correct files (and not to leave your files
as Draft). You then need to check that you have all files uploaded and that you are
ready to hit `Submit’. Once you hit `Submit’, you give consent for us to begin marking your
work. If you hit `Submit’ without all files uploaded then you will probably be deemed not to
have followed the instructions from the Notes above. If you leave your work as Draft and
have not hit `Submit’ then we have not received it, and it can accrue late penalties once the
deadline passes. In short, make sure to hit ‘Submit’ at the appropriate time to make sure that
your work is submitted. Late penalties will be as per Monash University Faculty of IT and
Monash University policies (see, e.g.,
https://publicpolicydms.monash.edu/Monash/documents/1935752 and, e.g., sec. 1.11). It is
expected that any work submitted at least 10 calendar days after the deadline will
automatically be given a mark of 0.
Note 7: Save your work regularly.
Some Questions and Answers – further to the above
What help am I entitled to have with this assignment?
Academic integrity is an important concern. As such, you must write your work yourself,
without collaborating with other students nor anyone else – nor using generative AI (e.g.,
ChatGPT). This includes doing your own reading of any references.
Are there any other matters that relate to academic integrity?
Yes. You must be honest in reporting the results.
Introduction
All statistical analyses are based on some form of data which has been collected via some
kind of specified process. The quality and relevance of the data which is collected influence
the outcome of any statistical analysis. A well planned and executed data collection exercise
is thus a key part of ensuring that any data analysis is appropriate.
In this assignment, you will gain some initial experience with data collection. This should
partly help you understand the quality of datasets that you encounter in the future.
Ethics is an important part of data collection. Where surveys are done and where data is
otherwise collected, there is often an extensive ethics process in place before data can be
collected. Here, for this Assignment, your data collection should simply require accessing –
and copying from – documents, and should not require any redaction that we are aware of.
This assignment is worth 18% of the overall mark for FIT1006.
Data can come from a variety of sources.
The paper by J M Betts, D L Dowe, D Guimarans, D D Harabor, H Kumarage et al. (2019),
https://link.springer.com/chapter/10.1007/978-3-030-30048-7_43, concerns public transport.
The paper by K Faksova et al. (2024), “COVID-19 vaccines and adverse events of special
interest: A multinational Global Vaccine Data Network (GVDN) cohort study of 99 million
vaccinated individuals”, https://pubmed.ncbi.nlm.nih.gov/38350768, in the journal Vaccine,
has been discussed in "World's largest study in COVID vaccine side-effects - Health with Dr
Norman Swan" (broadcast Mon 26 Feb 2024 at 8:15am) on ABC Australia radio in the
Health Report:
https://www.ABC.net.au/listen/programs/radionational-breakfast/health-with-dr-norman-swan-covid-vaccine-side-effects-study-/103510350
A venue for publishing scientific data is https://www.nature.com/sdata/research-articles .
Throughout this Assignment, recall all notes and instructions.
----
Qu 1 (3 + 3 + 3 = 9 marks)
Go to https://scholar.google.com and search on exactly one of the following authors:
Elizabeth H Blackburn
Elizabeth L Scott
Eliyathamby A Selvanathan
but make sure that this author has at least 10 papers.
Alternatively, from https://scholar.google.com, search on articles about Blackstone’s ratio
by using Blackstone’s ratio as your search term. But, again, make sure that your search
returns at least 10 results.
Make sure to document all your work so that the marker or other reader can follow.
(a) For each of the first 10 articles that you see (whether by this author or by using this
search term), record the number of citations – e.g., if it says ``Cited by 77’’ then
record the number 77.
(b) Arrange these 10 numbers in order from lowest to highest.
(c) Calculate the median, explaining and showing your working.
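As a rough illustration of parts (b) and (c), the sketch below sorts ten citation counts and takes the median as the mean of the two middle values. The numbers are hypothetical and are not taken from any real search; the variable names are likewise assumptions made for this example.

```python
from statistics import median

# Hypothetical "Cited by" counts for the first ten results (illustrative values only).
citations = [77, 12, 305, 48, 9, 150, 77, 23, 61, 4]

ordered = sorted(citations)      # part (b): lowest to highest
n = len(ordered)

# With an even number of values, the median is the mean of the two middle values.
lower_mid = ordered[n // 2 - 1]  # 5th value
upper_mid = ordered[n // 2]      # 6th value
manual_median = (lower_mid + upper_mid) / 2

# Cross-check the hand calculation against the standard library.
assert manual_median == median(citations)

print(ordered)
print(manual_median)
```

Writing out the two middle values explicitly mirrors the working that a marker would expect to see, rather than only quoting the final number.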
----
The data we now analyse is from the probabilistic footy-tipping competition
https://probabilistic-footy.monash.edu/~footy for the years 2014-2023. We will not look at the
predictions and results of primary school students nor secondary school students (i.e., we do
not consider the predictions and results of entrants marked with a green S).
The data that you will look at will depend upon the last digit of your StudentId.
Last digit of StudentId   Year      Last digit of StudentId   Year
0                         2020      5                         2015
1                         2021      6                         2016
2                         2022      7                         2017
3                         2023      8                         2018
4                         2014      9                         2019
The year is determined by the table above. For the relevant year, so as to avoid primary and
secondary school students, go to the section marked (at left) `Without Students’.
Now consider the 3rd last and 2nd last digits of your StudentId.
These will give you a number from 1 to 100.
Go to the probabilistic competition, go to Round 22, and look at the person who is in that
position after Round 22. So, as one example, if the 3rd last and 2nd last digits of your StudentId
are 7 5 then find the person who is 75th after Round 22. As another example, if the 3rd last
and 2nd last digits of your StudentId are 9 9 then find the person who is 99th after Round 22.
As another example, if the 3rd last and 2nd last digits of your StudentId are 00 then find the
person who is 100th after Round 22. If your number is greater than 50 and your entrant has
many scores of 0, then choose the person 50 places closer to the top – e.g., if the 3rd last and
2nd last digits of your StudentId are 7 5 but the person who is 75th after round 22 has many
scores of 0 then instead consider the person who is 25th. If your entrant still has many scores
of 0 then move 1 position closer to the top (e.g., from 25th to 24th) until this problem no
longer persists. If the problem still persists then re-read the instructions. If you believe that
the problem still persists then approach the teaching team at the first viable opportunity.
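The selection rules above can be sketched as follows, assuming the StudentId is held as a string of digits. The function names and the example StudentIds are hypothetical, and judging whether an entrant has "many scores of 0" is left to the reader; the fallback function only performs the 50-place shift.

```python
def year_for(student_id: str) -> int:
    """Map the last digit of the StudentId to a year: 0-3 give 2020-2023, 4-9 give 2014-2019."""
    last = int(student_id[-1])
    return 2020 + last if last <= 3 else 2010 + last


def ladder_position(student_id: str) -> int:
    """Use the 3rd last and 2nd last digits as a position from 1 to 100 (00 means 100)."""
    two_digits = int(student_id[-3:-1])
    return 100 if two_digits == 0 else two_digits


def fallback_position(position: int) -> int:
    """If the entrant has many scores of 0 and the position exceeds 50, move 50 places up."""
    return position - 50 if position > 50 else position


# Hypothetical StudentId ending in 7, 5, 6: year 2016, 75th after Round 22.
print(year_for("12345756"), ladder_position("12345756"))
# If that entrant had many scores of 0, the 25th entrant would be considered instead.
print(fallback_position(75))
```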
Qu 2 (3 x 1 + 2 x 1 = 3 + 2 = 5 marks)
Remember to be in the section `Without Students’. State your StudentId, the year, and this
person’s name (e.g., Richard L. Farmer) and competition name (e.g., Dairyf).
If this person (call them Player1) is a primary school student or secondary school student
(marked by a green S), then you have gone to the wrong section – you should be in the
section `Without Students’.
Choose both this person (Player1) and the person immediately above them after round 22.
For the person immediately above them (call this person Player2), also state this person’s
name and their competition name.
Alternatively, if Player1 is at the top and so there is no-one above Player1 then choose
Player2 to be the player immediately below Player1.
Giving a correct and well-documented answer to this question is very important. If you make
a mistake here and/or do not document your answer here well then it could have a substantial
adverse effect on your marks to later questions.
----
Qu 3 (1 + 1 + 2 + 8 x 0.5 + 1 + 0.5 + 0.5 + 3 + 3 = 16 marks)
For Player1, record their results for ``This Round’’ for each round from round 1 to the end of
the season (i.e., the last round played in the year).
(a) Write the numbers in the order that they occurred chronologically.
(b) Write the numbers in order from lowest to highest.
(c) Show the numbers in a stem-and-leaf plot.
Calculate each of the following, showing all your working:
(d) minimum value,
(e) maximum value,
(f) mean (equivalently, arithmetic mean),
(g) median,
(h) mode,
(i) 1st quartile,
(j) 3rd quartile,
(k) interquartile range.
(l) With calculations, show any and all outliers.
(m) With calculations, show the sample standard deviation.
(n) Calculate also the 5% trimmed mean, showing all working.
(o) Give a histogram of the results pertaining to Player1.
(p) Give a boxplot of the results pertaining to Player1.
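One way to check hand calculations for parts (d) to (n) is sketched below using only the Python standard library. Several choices here are assumptions rather than the unit's prescribed method: the quartiles use the "inclusive" interpolation of statistics.quantiles, outliers are flagged with the common 1.5 x IQR fences, and the trimmed mean drops the stated proportion from each end, rounding the count down. The scores are hypothetical, not real competition data. The histogram and boxplot are not covered.

```python
from statistics import mean, median, multimode, quantiles, stdev


def summarise(values):
    """Return summary statistics for a list of round scores."""
    ordered = sorted(values)
    # Quartile convention is an assumption; other conventions give slightly different values.
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": mean(ordered),
        "median": median(ordered),
        "modes": multimode(ordered),   # list, since there may be ties
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "outliers": [x for x in ordered if x < low_fence or x > high_fence],
        "sample_sd": stdev(ordered),   # divides by n - 1
    }


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the values from EACH end (count rounded down)."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))
    return mean(ordered[k:len(ordered) - k])


# Hypothetical scores for 21 rounds, in chronological order.
scores = [5, 8, 2, -9, 6, 11, 4, 7, 5, 9, 1, 12, 6, 3, 8, 10, 4, 7, 2, 9, 5]
summary = summarise(scores)
for name, value in summary.items():
    print(name, value)
print("5% trimmed mean", trimmed_mean(scores, 0.05))
```

Because quartile and trimmed-mean conventions differ between textbooks and software (Excel and SYSTAT included), the values printed here may not match other tools exactly; the written working should state which convention was used.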
----
Qu 4 (1 + 1 + 2 + 8 x 0.5 + 1 + 0.5 + 0.5 + 3 + 3 = 16 marks)
Do the exercises in Qu 3, but do them for Player2 (whereas Qu 3 was for Player1).
----
Qu 5 (11 marks)
Comment on the data collection process for Qus 2, 3 and 4.
Using at least your answers to Qu 3 and Qu 4, compare the distributions of the scores of
Player1 and Player2.
Interpret the results.
Marks will be given here and throughout for clear explanation based on your documentation.
----
Qu 6 (1 + 1 + 2 + 8 x 0.5 + 1 + 0.5 + 0.5 + 3 + 3 = 16 marks)
For Player1, for each of the first 22 rounds (i.e., rounds 1 to 22), record the probability that
they gave to the team Melbourne in the game which Melbourne played.
If they chose Melbourne, then this number should be in the range [0.5, 0.999].
If they chose Melbourne’s opponent, then this number should be in the range [0.001, 0.5].
More specifically, if Player1 chose a probability of p for Melbourne’s opponent, then Player1
has implicitly chosen a probability of 1-p for Melbourne. As an example, if Player1 chose a
probability of 0.62 for Melbourne’s opponent then Player1 has chosen a probability of 0.38
for Melbourne.
If Player1 entered no tip for the game involving Melbourne, then the probability will be
regarded as 0.5.
If Melbourne did not play in a certain round, then document this but also record the
probability as 0.5.
Alternatively, if Player1 has chosen a probability of 0.5 for Melbourne on many occasions
then instead record the probability that Player1 gave to Geelong.
Alternatively, if Player1 has chosen a probability of 0.5 for Melbourne on many occasions
and if Player1 has also chosen a probability of 0.5 for Geelong on many occasions, then
consider the probability that Player1 gave to Richmond. From the teams Melbourne,
Geelong and Richmond, try to find a team where Player1 did not choose 0.5 very often. Then
stay with that team – and then consider the probability that Player1 gave to that team for
each of round 1, round 2, …, round 21, round 22.
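The recording rules above can be sketched as a small function. The way a tip is represented here (the name of the tipped team plus the probability given to it, or None when there was no tip or the team did not play) is an assumption made for illustration, as are the function name and the opponent used in the example.

```python
from typing import Optional


def probability_for_team(team: str,
                         tipped_team: Optional[str],
                         tip_probability: Optional[float]) -> float:
    """Return the probability that a player implicitly gave to `team` in one round.

    tipped_team / tip_probability are None when the player entered no tip
    or when `team` did not play in that round; both cases are recorded as 0.5.
    """
    if tipped_team is None or tip_probability is None:
        return 0.5
    if tipped_team == team:
        return tip_probability
    # The player tipped the opponent with probability p, so the team gets 1 - p.
    # Rounding to 3 decimals matches the 0.001 granularity of the stated ranges.
    return round(1 - tip_probability, 3)


# Example from the text: 0.62 for the opponent means 0.38 for Melbourne.
print(probability_for_team("Melbourne", "Carlton", 0.62))
# No tip entered, or Melbourne did not play.
print(probability_for_team("Melbourne", None, None))
```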
(a) Write the numbers in the order that they occurred chronologically.
(b) Write the numbers in order from lowest to highest.
(c) Show the numbers in a stem-and-leaf plot.
Calculate each of the following, showing all your working:
(d) minimum value,
(e) maximum value,
(f) mean (equivalently, arithmetic mean),
(g) median,
(h) mode,
(i) 1st quartile,
(j) 3rd quartile,
(k) interquartile range.
(l) With calculations, show any and all outliers.
(m) With calculations, show the sample standard deviation.
(n) Calculate also the 10% trimmed mean, showing all working.
(o) Give a histogram of the results pertaining to Player1.
(p) Give a boxplot of the results pertaining to Player1.
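For part (c), one possible layout of a stem-and-leaf plot for probabilities is sketched below. It assumes each probability is rounded to two decimal places, with the tenths digit as the stem and the hundredths digit as the leaf; this is only one reasonable convention, and the probabilities shown are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(probabilities):
    """Return the lines of a stem-and-leaf plot (stem = tenths, leaf = hundredths)."""
    rows = defaultdict(list)
    for p in sorted(probabilities):
        hundredths = round(p * 100)          # e.g. 0.38 -> 38
        stem, leaf = divmod(hundredths, 10)  # 38 -> stem 3, leaf 8
        rows[stem].append(str(leaf))
    lines = []
    # Include empty stems so that gaps in the distribution stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(rows.get(stem, []))
        lines.append(f"{stem / 10:.1f} | {leaves}".rstrip())
    return lines


# Hypothetical probabilities given to one team over six rounds.
probs = [0.5, 0.62, 0.38, 0.7, 0.55, 0.65]
plot = stem_and_leaf(probs)
print("Key: 0.3 | 8 means 0.38")
print("\n".join(plot))
```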
----
Qu 7 (1 + 1 + 2 + 8 x 0.5 + 1 + 0.5 + 0.5 + 3 + 3 = 16 marks)
Do the exercises in Qu 6, but do them for Player2 (whereas Qu 6 was for Player1).
Whichever team you used in Qu 6 for Player1 (Melbourne or Geelong or Richmond), use the
same team for Player2 here in Qu 7.
----
Qu 8 (11 marks)
Comment on the data collection process for Qus 6 and 7.
Using at least your answers to Qu 6 and Qu 7, compare the distributions of the probabilities
chosen by Player1 and Player2.
Interpret the results.
Marks will be given here and throughout for clear explanation based on your documentation.
Please recall and carefully re-read all notes and instructions at the start of the Assignment and
throughout the Assignment.
END OF FIT1006 ASSIGNMENT 1 (Monash University, 1st semester 2024)