Data Science DTSC71-200 Semester 233
Assignment 3 – Power BI Dashboard (20% grade)
Due: 8/12/2023
Assignment description
Your task in this assignment is to use Power BI and R to produce a dashboard of the three classifiers
(Decision Tree, Logistic Regression, and KNN) you have already completed in Assignments 1 and 2, plus a
K-means clustering model. The dashboard should allow the user to enter new data and classify it with each
of the four models. The answer should be one of the classes (<=50K or >50K) and, optionally, the
probability of the classification (if the model makes a probability available after the prediction), or, for
the clustering model, an indication of which cluster the new sample belongs to.
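As a minimal sketch of what the prediction element needs to return, here is a logistic regression in base R that outputs both the class label and its probability. The toy training data, the column names (age, hours), and the 0.5 cut-off are illustrative assumptions, not part of the assignment data:

```r
# Sketch: a logistic regression returning both the class label and the
# probability, as the dashboard's prediction element should.
# The data, column names, and 0.5 threshold are illustrative assumptions.
train <- data.frame(
  age    = c(25, 55, 45, 52, 29, 61, 36, 50),
  hours  = c(20, 45, 50, 60, 35, 40, 55, 30),
  income = factor(c("<=50K", "<=50K", ">50K", ">50K",
                    "<=50K", ">50K", ">50K", "<=50K"))
)

# Fit the model; with a two-level factor response, glm models P(second level),
# i.e. P(income == ">50K") here.
fit <- glm(income ~ age + hours, data = train, family = binomial)

new_sample <- data.frame(age = 40, hours = 50)  # user input from the dashboard
p <- predict(fit, newdata = new_sample, type = "response")  # P(">50K")
label <- ifelse(p >= 0.5, ">50K", "<=50K")

cat(sprintf("Prediction: %s (probability %.2f)\n", label, p))
```

In the dashboard, `new_sample` would be built from the slicer or text-input values rather than hard-coded.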
Deliverables: One Power BI file (.pbix) for the dashboard, plus supplementary files if needed (e.g., R
scripts, data, etc.).
Minimum content and functionality for the dashboard:
- Four models, each on its own page, and
- An extra page summarising the models' performance on the test data.
o The performance metrics should be calculated from the test data and loaded into the
dashboard (not manually typed into a table).
o These could include: accuracy, precision, recall, confusion matrices, and ROC curves.
- A prediction element where users can enter their data and receive a prediction back.
- For the clustering model, locate the new sample in the cluster space and indicate which group
it would belong to.
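As a sketch of how the summary-page metrics could be computed from the test data (rather than typed in manually), the snippet below derives accuracy, precision, and recall from a confusion matrix in base R. The `truth` and `pred` vectors are illustrative stand-ins for a model's test-set labels and predictions:

```r
# Sketch: computing summary metrics from the test set so they can be loaded
# into the dashboard. `truth` and `pred` are illustrative stand-ins for the
# real test labels and a model's predictions on them.
truth <- factor(c("<=50K", ">50K", ">50K", "<=50K", ">50K", "<=50K"))
pred  <- factor(c("<=50K", ">50K", "<=50K", "<=50K", ">50K", ">50K"),
                levels = levels(truth))

cm <- table(Predicted = pred, Actual = truth)  # confusion matrix

accuracy  <- sum(diag(cm)) / sum(cm)
precision <- cm[">50K", ">50K"] / sum(cm[">50K", ])  # of those predicted >50K
recall    <- cm[">50K", ">50K"] / sum(cm[, ">50K"])  # of those actually >50K

# Returning a data frame lets Power BI pick the metrics up as a table.
metrics <- data.frame(Model = "Logistic Regression",
                      Accuracy = accuracy,
                      Precision = precision,
                      Recall = recall)
print(metrics)
```

One row per model in `metrics` gives the summary page a single table covering all four models.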
Tips
- Show visualisations of the models (e.g., tree plotted on the side).
- The dashboard should have a different page for each model.
- The dashboard should allow users to make predictions by entering data (either typed or via
elements such as sliders).
- Data may need transformation (e.g., if the KNN or K-means were trained with normalised data).
- Use the example in the workshop to get started.
- Loading pre-trained models is preferable to building them inside the Power BI dashboard (more
details during the workshop), although in some cases the models do need to be trained
dynamically in the dashboard.
Rubric

Presentation and Design (weight: 40%)

High Distinction (>=85%)
1. Outstanding design and tidy elements.
2. Tidy and relevant labelling on all pages.
3. Multiple pages for the models, with an outstanding summary.
4. A complete explanation of how to use the model is included in the dashboard (on each page where
models are presented).

Distinction (75-84%)
1. Excellent design, tidy elements (no overlapping).
2. Labelling on all pages.
3. Multiple pages for the models, and a summary.
4. A brief explanation of how to use the model is included in the dashboard (on each page where
models are presented).

Credit (65-74%)
1. Good design with tidy elements (no overlapping).
2. Some labelling on all pages.
3. More than one model on one page, or the summary is mixed with the models.
4. An explanation is included but does not reflect how the dashboard is used.

Pass (50-64%)
1. A few overlaps; the design is not very clear for the user.
2. Some labelling on most pages, some missing.
3. More than one model on one page, or no summary.
4. An explanation is included but not on all the dashboard pages.

Fail (<50%)
1. Elements overlap; confusing design.
2. No labels to help users.
3. Not all models are presented, and no summary is presented.
4. No explanation is included.

Technical aspects (weight: 60%)

High Distinction (>=85%)
1. User inputs are appropriately transformed before predictions are computed (e.g., KNN may need
scaling or normalisation).
2. Reported performance in the summary is based on the test set, available within the dashboard.
3. Predictions for individual samples are clearly shown to belong to a class for all models.
4. All four models (Decision Tree, Logistic Regression, KNN, and K-means) are included in the
dashboard, complete with predictions.
5. Multiple metrics are included in the summary for all models.
6. The code is organised, with appropriate comments highlighting the purpose of each chunk.

Distinction (75-84%)
1. Some user inputs are transformed before predictions are computed.
2. Reported performance in the summary is based on the test set, with some entered manually.
3. Predictions for individual samples are clearly shown to belong to a class for most models.
4. All four models are included in the dashboard, one incomplete.
5. Multiple metrics are partially included in the summary.
6. The code is organised and commented.

Credit (65-74%)
1. Some user inputs are transformed, but some lead to incorrect predictions.
2. All reported performance is entered manually in the summary.
3. Predictions are not clearly indicated, or only the probability is shown, for all models.
4. All four models are included in the dashboard, one incomplete.
5. A single metric for each model is included in the summary.
6. The code is mostly organised, with some chunks commented.

Pass (50-64%)
1. User inputs are not transformed, giving incorrect predictions.
2. Only some models have reported performance.
3. Predictions are not clearly indicated, or do not work with all models.
4. All four models are included in the dashboard, but all are incomplete.
5. Not all models have metrics in the summary.
6. The code has some unclear chunks, with few comments.

Fail (<50%)
1. Users cannot input new data.
2. No summary or performance report.
3. No predictions are made, or entering different data does not change the prediction in the
dashboard.
4. No models are implemented in the dashboard.
5. Missing metrics, or no summary presented.
6. Code not well organised; no comments.