Date: 2023-11-06

INFS5720
Business Analytics Methods
Team Assignment
Term 3, 2023
This team assignment covers the lectures and tutorials from Week 1 to Week 7 (Week 6 is the mid-term break).
It accounts for 30% of the final grade for INFS5720 Business Analytics Methods. The
deadline is 10 Nov 2023, 1500hrs. Do not wait until the last minute. Late submissions caused by
Internet problems will still incur a penalty.
UNSW has a standard late submission penalty of:
▪ 5% per day
▪ capped at five days (120 hours) from the assessment deadline, after which a student
cannot submit an assessment
▪ no permitted variation
One person in each team shall submit two files to the Moodle folder under Left menu > Assessments
Hub > Team Assignment Submission:
- .pptx file (PLUS .pdf file if you want to ensure the layout stays the same)
- .ipynb file
Details of pptx:
▪ File name: “Your team name Team Assignment.pptx”
▪ Name of presenter for each slide must be shown on that slide
▪ Page 1 (cover page) should contain Subject Code “INFS5720”, Title “Team Assignment”,
Members’ names and zIDs
▪ Page 2 should follow the format below. This page will not exist yet when you record
the video; add it after recording.
YouTube Video Link: https://www.youtube.com/...
Name | Contribution | Time in video
Joe | e.g., Logistic Regression part, more description… | e.g., 5m30s – 7m10s; 9m30s – 10m50s (if Joe shows up twice)
…
▪ The main content starts from Page 3
Details of ipynb file:
- File name: “Your team name Team Assignment.ipynb”
- Show each code cell’s output (e.g., accuracy score, decision tree image) for the teaching
team’s easy reference, i.e., do NOT clear the cells’ outputs
- Write observations and explanations in markdown cells to make it a report-style file
Details of video:
- Upload to YouTube as an UNLISTED video, so that it is only accessible via the link
- Place the link on Page 2 of the pptx file
- Name of presenter for each slide must be shown on that slide
- DO NOT READ FROM A SCRIPT
- Max 15 minutes
- Each member must present a part
- The easiest way to record is to use Zoom or MS Teams
- The presenter’s face shall be on the right side of the slide, to avoid blocking the slide,
as shown in the sample below
Data sets and sample code
You can choose ONE (1) of the provided data sets with sample code. The authors usually
cover a few models (LogisticRegression, DecisionTreeClassifier, etc.) with each model’s
default parameters. They also make only a brief comparison of the models’ accuracy or
SSR. Moreover, most of them did not look at the models from a business point of view.
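The two comparison metrics mentioned above can be computed as in the minimal sketch below. It assumes that "SSR" means the sum of squared residuals, and all the numbers are made-up toy values, not taken from any of the provided data sets.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Classification: accuracy is the share of correct predictions.
y_true_cls = np.array([1, 0, 1, 1, 0])  # toy labels (hypothetical)
y_pred_cls = np.array([1, 0, 0, 1, 0])  # toy predictions (hypothetical)
accuracy = accuracy_score(y_true_cls, y_pred_cls)  # 4 of 5 correct

# Regression: SSR is taken here as the sum of squared residuals (assumption).
y_true_reg = np.array([3.0, 5.0, 7.0])  # toy targets (hypothetical)
y_pred_reg = np.array([2.5, 5.0, 8.0])  # toy predictions (hypothetical)
ssr = float(np.sum((y_true_reg - y_pred_reg) ** 2))  # 0.25 + 0.0 + 1.0

print(f"accuracy = {accuracy:.2f}, SSR = {ssr:.2f}")
```

A single number like this says little about business usefulness, which is why the brief asks you to go beyond it.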
Data set 1: Credit agency - company bankruptcy prediction
Data file:
https://www.kaggle.com/datasets/fedesoriano/company-bankruptcy-prediction
Sample code by SANJOY MONDAL:
https://www.kaggle.com/code/sanjoymondal0/company-bankruptcy-prediction-acc-97
Data set 2: Retail bank - predicting churn for bank customers
Data file:
https://www.kaggle.com/datasets/adammaus/predicting-churn-for-bank-customers
Sample code by AHMET CAN KARAOĞLAN:
https://www.kaggle.com/code/ahmetcankaraolan/churn-prediction-using-machine-learning
Data set 3: Equity trading - predicting stock closing price
Data file: no data file. Price data of any Stock Ticker can be read from Yahoo directly.
Sample code by AKSHAY SHARMA:
https://www.kaggle.com/code/akshaysharma001/predicting-stock-closing-price-99-accuracy/notebook
Data set 4: Diamond trading - diamond price prediction
Data file: https://www.kaggle.com/datasets/shivam2503/diamonds
Sample code by SURAJ JHA:
https://www.kaggle.com/code/surajjha101/regression-models-diamond-price-prediction/notebook
Download the authors’ .ipynb code file by clicking the three dots and then “Download code”.
Make sure the .ipynb file is in the same folder as the data files.
Hypothetical context
You are a group of new analysts in an Analytics team that supports business operations with
Business Analytics methods. The sample code is from a smart colleague who just left your
team. The team manager, Gloria, passed you the data set and the code written by that person.
Gloria expects all new hires to enhance the work together, i.e., improve the models in
accuracy, in business interpretability, or in their ability to generalize to future data. Things
you can try include (but are not limited to):
- experimenting with different parameter settings, such as applying regularization to
LogisticRegression, or limiting the max tree depth of a Decision Tree
- experimenting with different methodologies, such as a train-validation-test split or k-fold
cross-validation
- feature scaling
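The ideas in the list above can be combined as in the following minimal sketch. It is only an illustration under stated assumptions: the data is a synthetic stand-in produced by `make_classification`, the parameter grids are arbitrary, and the `results` dictionary is a hypothetical helper. It is not the sample authors' method.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the chosen data set (assumption: binary target).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

results = {}  # mean 5-fold cross-validated accuracy per setting

# Logistic regression with its default L2 penalty; smaller C means stronger
# regularization. Scaling sits inside the pipeline so that it is fitted on
# the training folds only.
for C in (0.01, 0.1, 1.0, 10.0):
    model = make_pipeline(StandardScaler(), LogisticRegression(C=C, max_iter=1000))
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    results[f"logreg_C={C}"] = scores.mean()

# Decision tree with different depth limits (None means no limit).
for depth in (2, 4, 8, None):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    scores = cross_val_score(tree, X, y, cv=5, scoring="accuracy")
    results[f"tree_depth={depth}"] = scores.mean()

# Print the settings from best to worst mean accuracy.
for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

On the real data set, replace `X` and `y` with the features and target loaded from the data file, and consider whether accuracy is the right score for the business problem.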
Presentation
Imagine that it is a LIVE presentation instead of a video recording. Two people will sit in:
- Gloria, Head of Analytics team, who has technical knowledge of Business Analytics
- John, a stakeholder from the business team, who will take your model and findings
to his regular business meetings
Hence, in the presentation, you need to:
- showcase technical ability to Gloria by discussing the technical details of how you
improved the models (only very important code shall be shown on slides)
- make a substantial part of the presentation useful for John, e.g.
o generate business insights or implications from the models
o use business knowledge to convince John that your model reflects the reality
and is useful for their day-to-day operations
Note: Data visualization and plotting are NOT the focus of this course. But if you want to use
some plots to showcase your models’ good performance, feel free to do so.
Mandatory models
Below are the mandatory models for each data set. If some mandatory models are not
implemented in the code given, you need to implement them on your own. Once all
mandatory models are working, you can try to improve each model. It is possible that one
model cannot be improved much even after you have tried every approach you can think of.
In that case, present to Gloria what you have tried and show her that the performance
cannot get much better.
Refer to the official documentation of sklearn and statsmodels for model parameters and
examples.
Data sets 1 & 2: classification
- Logistic Regression
- Decision Tree
- Neural network
Data sets 3 & 4: regression
- Linear Regression
- Decision Tree
- Neural network
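As a starting point, the three mandatory regression models could be set up as in the sketch below, using a train-validation-test split. The synthetic data from `make_regression`, the 60/20/20 split, the tree depth, the network size, and the names `models`, `val_r2`, `best_name`, and `test_r2` are all assumptions chosen for illustration. For Data sets 1 & 2, the classification counterparts (LogisticRegression, DecisionTreeClassifier, MLPClassifier) would follow the same pattern.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a regression data set (assumption).
X, y = make_regression(n_samples=600, n_features=8, noise=10.0, random_state=0)

# 60% train, 20% validation, 20% test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    # Neural networks are sensitive to feature scale, so scale the inputs first.
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    ),
}

# Fit on the training set and compare the models on the validation set.
val_r2 = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_r2[name] = r2_score(y_val, model.predict(X_val))

# Touch the test set only once, for the model chosen on validation data.
best_name = max(val_r2, key=val_r2.get)
test_r2 = r2_score(y_test, models[best_name].predict(X_test))

print(val_r2)
print(f"best on validation: {best_name}, test R^2 = {test_r2:.3f}")
```

Keeping the test set out of model selection is what lets you argue to Gloria that the reported performance should generalize to future data.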
Beyond the mandatory models
You can include any additional models, to impress Gloria and John. You are NOT restricted
to sklearn and statsmodels.
The Jupyter Notebook .ipynb file will be submitted to Gloria for her review. Hence,
include all observations and explanations in markdown cells to make it a report-style file.
Closing notes
This team assignment is rather open-ended. Your team can decide on the approach to improving
the models and build its own story to support the claim that the models have been improved.
---- END OF TEAM ASSIGNMENT ----