COMP5310 Project Stage 2A
Summarise and Analyse the Data
Due: 11:59pm on 6th of April 2023 (Week 7)
Value: 10% of the unit
This stage is usually done with the same group members as you worked
with for Stage 1. However, if someone is currently in a group that is
not in their timetabled lab, they will need to move groups to one in
their timetabled lab. If this applies to you, please urgently email
Nazanin.borhan@sydney.edu.au to arrange moving to a different group.
DISPUTE RESOLUTION
If, during the course of the assignment work, there is a dispute among
group members that you can't resolve, or that will impact your group's
capacity to complete the task well, you need to inform the unit
coordinator, Nazanin.borhan@sydney.edu.au. Make sure that your email
includes your group number and tutorial session, and is explicit about
the difficulty. Also, make sure this email is copied to your tutor and
all the members of the group (including anyone you are complaining
about). We need to know about problems in time to help fix them and deal
with non-performance promptly (don't wait until a few days before the
work is due to complain that someone is not delivering on their tasks).
If necessary, the unit coordinator will split a group, and leave anyone
who didn't participate effectively in a group by themselves (they will
need to achieve all the outcomes on their own). This option is only
available up until Monday March 27th, which is the last day with time to
resolve the issue before the due date. For any group issues that arise
after this time, you will need to try to resolve the problem on your
own, and you will continue to be treated as a single group. If someone
doesn't provide the material required for the report, or their material
is not of the agreed standard, you should still have the report show
what that person did. Their section of the report may be empty if they
don't produce anything, or it may have material but not enough. In such
cases, leave a "Note to marker" on the front page of the report, which
describes the circumstances. That way, we can consider how best to apply
the marking scheme. Note that it is not expected or sensible for other
members to do the work that someone failed to deliver.
TASKS
There are TWO individual tasks and ONE group task. The tasks should be
addressed in a report, identifying which group member answered which
sub-task.
INDIVIDUAL TASKS:
1. [4 marks] Each group member should answer ONE of these two sub-tasks
using a different statistical technique. At least one person from the
group must answer each sub-task, but more than one person can answer the
same sub-task using a different statistical technique:
a. Identify a statistical technique that might be appropriate for
summarisation and analysis of your dataset. For that technique:
o Name and describe the technique.
o Outline the assumptions that are required for the technique to be valid.
o Describe to what extent the assumptions are true for your dataset.
o Justify your choice of technique in the context of the business question.
b. Identify a statistical technique that is clearly not appropriate for
summarisation and analysis of your dataset. For that technique:
o Name and describe the technique.
o Outline the assumptions that are required for the technique to be valid.
o Describe what assumptions are violated in your dataset.
o Justify why this technique is not appropriate for your dataset.
o Propose whether the data can be transformed in a way that makes the
assumptions true and justify whether this is appropriate or not in the
context of your business question.
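As one illustration of checking an assumption and a candidate transformation, the sketch below compares the sample skewness of an attribute before and after a log transform. It is a minimal sketch only: the `amounts` values are made up, the helper name `sample_skewness` is hypothetical, and low skewness is just one informal indicator of approximate symmetry, not a formal normality test.

```python
import math
import statistics


def sample_skewness(values):
    """Return the (biased) moment-based skewness of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical right-skewed attribute (illustrative values, not real data).
amounts = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]

raw_skew = sample_skewness(amounts)
log_skew = sample_skewness([math.log(v) for v in amounts])

print(f"skewness before log transform: {raw_skew:.2f}")
print(f"skewness after log transform:  {log_skew:.2f}")
```

For this made-up sample the log transform reduces the skewness but does not remove it. Whether a transformed scale is still meaningful for the business question is the part that needs justification in the report.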
NOTE: When justifying your conclusions, consider for example whether the
technique requires too many assumptions that are only partially true, or
might make your conclusions too unreliable to apply in your business
context. Also consider the cost of making a Type I error, and the cost
of a Type II error in your business context.
2. [2 marks] Each individual should create one chart that visualises
some aspect of the dataset that informs your understanding of the data
and research question. Describe what conclusions you draw from the
chart, and what questions it raises that you could answer in Stage 2B.
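Returning to the NOTE under Individual Task 1, one way to reason about Type I and Type II errors is to weigh each error probability by its business cost. The sketch below is only an illustration under stated assumptions: the function name `expected_error_cost`, the prior probability of a real effect, and every cost and probability value are hypothetical.

```python
def expected_error_cost(alpha, power, prior_effect, cost_type_1, cost_type_2):
    """Expected cost per decision from false positives and false negatives.

    alpha        -- probability of a Type I error when there is no real effect
    power        -- probability of detecting an effect that exists (1 - beta)
    prior_effect -- assumed probability that a real effect exists
    """
    beta = 1.0 - power  # probability of a Type II error
    type_1 = (1.0 - prior_effect) * alpha * cost_type_1
    type_2 = prior_effect * beta * cost_type_2
    return type_1 + type_2


# All numbers below are hypothetical, for illustration only.
strict = expected_error_cost(alpha=0.01, power=0.60, prior_effect=0.5,
                             cost_type_1=1000, cost_type_2=5000)
lenient = expected_error_cost(alpha=0.10, power=0.90, prior_effect=0.5,
                              cost_type_1=1000, cost_type_2=5000)
print(f"strict threshold:  expected cost {strict:.0f}")
print(f"lenient threshold: expected cost {lenient:.0f}")
```

With these made-up numbers a missed effect is five times as costly as a false alarm, so the lenient threshold has the lower expected cost (about 300 versus about 1005). The conclusion reverses if false alarms are the expensive error, which is why the costs need to be argued from the business context.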
GROUP TASK:
1. [4 marks] Answer the following questions as a group:
a. Describe any exploratory analysis you have undertaken to refine your
understanding of the data and research question, the strengths and
limitations of the exploratory analysis you undertook compared to at
least one alternative, and justification for the analysis you undertook.
b. Propose an approach (a particular classifier model, hypothesis test,
etc.) that you might take to solving your research question in Stage 2B,
and any limitations or strengths of the approach compared to at least
one other approach, and justify your choice of approach.
c. Outline, at a high level, how you will validate the approach, the
strengths and limitations of the validation techniques you chose
compared to at least one alternative method, and justify your choice of
validation techniques.
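For question c, two validation techniques that are often compared are a single hold-out split and k-fold cross-validation. The sketch below shows how the row indices are partitioned in each case. It is a minimal, library-free illustration: the function names `holdout_split` and `k_fold_splits` are hypothetical, and in practice a group would more likely use a library implementation.

```python
import random


def holdout_split(n_rows, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) for a single shuffled split."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_rows, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


train, test = holdout_split(10)
print("hold-out sizes:", len(train), len(test))
for train, test in k_fold_splits(10, k=5):
    print("fold test rows:", sorted(test))
```

The hold-out split evaluates on one subset only, so the estimate depends on which rows land in it. K-fold uses every row for testing exactly once at the cost of fitting the model k times.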
WHAT TO SUBMIT
There are TWO deliverables in this stage of the project, and both should
be submitted by ONE PERSON on behalf of the whole group.
1. A written report on your work, as a PDF document. There is a maximum
length for the report of 1500 words for groups of 2 and 2000 words for
groups of 3. The report should have a front page that gives the group
name and lists the members involved (giving their SID and unikey, not
their name). The body of the report should then include a section for
each group member (the section should state the SID/unikey of the group
member who did the work reported in it), answering the questions from
the sub-task they selected, and finally a section where the group
provides the answers to the group questions.
2. The code and dataset that you used to produce the analysis and charts
in your report. This should be submitted as a single zip or tar.gz file
which contains a subfolder for each group member.
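One possible way to package the second deliverable is sketched below using Python's standard `zipfile` module. The function names, the `submission` folder, and the archive name in the usage comment are hypothetical; the sketch assumes the group has already placed each member's code and data in a subfolder of one root directory.

```python
import zipfile
from pathlib import Path


def build_submission(root, archive_path):
    """Zip every file under `root`, keeping each member's subfolder."""
    root = Path(root)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives under root.
            if path.is_file() and path.resolve() != archive_path.resolve():
                archive.write(path, path.relative_to(root).as_posix())
    return archive_path


def member_folders(archive_path):
    """Return the top-level folder names found in the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
    return sorted({name.split("/")[0] for name in names if "/" in name})


# Hypothetical usage:
# build_submission("submission", "group_stage2a.zip")
# print(member_folders("group_stage2a.zip"))
```

Listing the top-level folders before uploading is a quick check that the archive contains one subfolder per group member.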
MARKING
The marker's evaluation will be made principally based on your report;
the submitted code and data may be considered as evidence to check or
clarify statements made in the report.
Note: you will not be penalized in marks if you explore a reasonable question about the domain,
by looking at appropriate relationships between some aspects, and then conclude that there is
no clear relationship revealed.
Individual Task 1:
[Flawed]: States the name of the technique and answers, with valid justifications, one bullet point in their sub-task.
[Pass]: States the name of the technique and answers, with valid justifications, two bullet points in their sub-task.
[Distinction]: States the name of the technique and answers, with valid justifications, three bullet points in their sub-task.
[Full marks]: States the name of the technique and answers, with valid justifications, all four of the bullet points in their sub-task.
Individual Task 2:
[Flawed]: A chart of some data attribute.
[Pass]: A chart of some data attribute, correctly documented encoding between data attributes and visual attributes in each chart.
[Distinction]: A chart of some data attribute, and correctly documented encoding and other decisions (such as style of chart, scale, etc.), and sensible justification of the choice of encoding in view of the effectiveness of different visual attributes.
[Full marks]: A chart of some data attribute, and correctly documented encoding and other decisions (such as style of chart, scale, etc.), and sensible justification of the choice of encoding in view of the effectiveness of different visual attributes, as well as sensible conclusions from the chart/statement of the questions it raises for Project Stage 2B.
Group Task:
[Flawed]: An answer to ALL the group questions.
[Pass]: A well-reasoned answer to ALL the group questions, including a discussion of strengths and limitations.
[Distinction]: A well-reasoned answer to ALL the group questions, including a discussion of strengths and limitations in comparison to an alternative for each question respectively.
[Full marks]: A well-reasoned answer to ALL the group questions, including a discussion of strengths and limitations in comparison to an alternative, and a justification of your choice for each question respectively.
Penalties
10% of the overall mark will be deducted if your report is
unnecessarily longwinded and does not address the marking criteria
within the word limits.
Late Work
As announced in the unit outline, late work (without approved special
consideration or other arrangements) suffers a penalty of 5% of the
maximum marks, for each calendar day after the due date. No late work
will be accepted more than 10 calendar days after the due date.
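The late penalty arithmetic can be sketched as follows. This is an illustration only, assuming that "maximum marks" means the 10 marks available in this stage (4 + 2 + 4) and that lateness is counted in whole calendar days; the function name `late_penalty` is hypothetical, and the unit outline remains the authority.

```python
def late_penalty(max_marks, days_late, rate=0.05, cutoff_days=10):
    """Marks deducted for lateness, or None if the work is not accepted."""
    if days_late <= 0:
        return 0.0
    if days_late > cutoff_days:
        return None  # more than 10 calendar days late: not accepted
    return rate * max_marks * days_late


# Stage 2A is marked out of 10 (4 + 2 + 4 marks).
for days in (0, 1, 3, 10, 11):
    print(days, late_penalty(10, days))
```

Under these assumptions, a report submitted 3 calendar days late loses 1.5 of the 10 marks before any marking against the rubric is considered.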