Date: 2023-09-07

This document describes the background context and tasks for your first MXN600 Project. This project is
intended to give you experience in the application of Generalised Linear Models (GLMs) to answer realistic
questions that arise in industry. For further information see the assessment marking criteria on Canvas.
Scenario
You are a data analyst working for a global retail company specialising in luxury goods. The company employs
over 10,000 sales staff across a number of stores internationally in Asia, Europe, South America, and the
United States. Business is very good; however, the company is looking to improve its sales of big ticket
luxury items (defined as single item sales exceeding $10,000.00) with a view to expanding its business further
into this space. Due to record profits in the previous financial year, your CEO would like to re-invest a
substantial portion of this profit toward this expansion plan. However, there are some differences of opinion
among the board of directors on how the money should be invested.
Some believe that the older stores are not attracting the right clientele for big ticket sales. They blame the
outdated style and display layouts of these stores. Therefore, one proposal is to upgrade all of the older stores
to represent the more modern look and feel of their newer stores.
An alternative proposal has also been developed based on the anecdotes from store managers that the
experience and training level of the sales team staff contributes more to big ticket sales than the style or layout
of the store. The proposal is to redirect the funding into specialist training for staff on big ticket sales.
Your CEO has requested you perform some analysis on big ticket sales data to help inform the board’s final
funding decision. Specifically, your CEO would like to know:
1. Of the four major style/layouts across your stores, which one would you recommend become the
international standard for your company, based solely on big ticket item sales performance in the past
year?
2. Does the available data support the assertion that the staff experience level with big ticket sales is more
important than store layout?
Within your company, there is also interest in exploring the use of generative AI tools to support their data
analytics team. As a result, you have been given permission to use any generative AI tool that you wish to
complete your task. However, you are not permitted to share any of the project files directly with these tools
due to privacy concerns.
See the Academic Integrity section for more details on using generative AI in this project.
The Data
You have obtained a CSV file called salesdata.csv (see Canvas). It contains counts of big ticket sales and
sales staff hours worked aggregated by the experience level of the sales staff and the store style/layout. The
data are for the last financial year of trading.
Specifically the variables are:
Sales - count of big ticket sales in the group
Layout - the store style/layout for the group (Layout 1 is the newest style/layout option and 4 is the
oldest).
Hours - total hours worked by the sales staff in this group
Experience - the experience level of the sales staff in this group
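As a starting point, the variable list above can be checked against the file before any modelling. The following is a minimal sketch only: `load_sales` is a hypothetical helper name, the toy rows are made up so that the example runs without the real file, and the coding of Experience as text labels is an assumption (check the actual file on Canvas).

```r
# Hypothetical helper (not part of the project files): read the sales data,
# check that the four documented columns are present, and treat Layout as a
# categorical variable rather than a number.
load_sales <- function(path) {
  d <- read.csv(path, stringsAsFactors = FALSE)
  expected <- c("Sales", "Layout", "Hours", "Experience")
  missing_cols <- setdiff(expected, names(d))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  d$Layout <- factor(d$Layout)
  d
}

# Toy rows, invented purely for illustration. They are NOT the project data,
# and the Experience labels are an assumption about how that column is coded.
toy <- data.frame(
  Sales      = c(3, 5, 2, 7),
  Layout     = c(1, 2, 3, 4),
  Hours      = c(1200, 1500, 900, 2000),
  Experience = c("Low", "High", "Low", "High")
)
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)

sales <- load_sales(toy_path)
str(sales)

# Counts come from groups with different total hours, so a crude sales-per-hour
# rate is a more comparable exploratory summary than the raw counts.
sales$Rate <- sales$Sales / sales$Hours
```

With the real data, the call would be `load_sales("salesdata.csv")`, assuming the file sits in the same folder as the Rmarkdown document.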
Tasks
Task 1: Statistical analysis (70 marks total)
Conduct a regression analysis using a generalised linear model for the sales counts. Motivate this analysis
using your CEO’s queries. Draw conclusions that clearly address the queries. Document and develop your
analysis in a single Rmarkdown document. The audience of this document is another data analyst, so you
should clearly outline the questions being addressed, the methods applied and the conclusions drawn, with
every step/decision being justified.
Base all of your conclusions on a single fitted generalised linear model. Validate the assumptions of the model
you have found, including a fixed or estimated overdispersion parameter based on the mean-variance
relationship of your observations. Appropriately assess the fit of your model.
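For reference, the terms "overdispersion parameter" and "mean-variance relationship" can be written out as follows. This is one common formulation for count data with differing exposure, given here as an illustration rather than as the required model; the symbols $y_i$ (Sales), $h_i$ (Hours), $x_i$ (covariates), $n$ (number of groups) and $p$ (number of estimated coefficients) are notation assumed for this sketch and do not appear in the brief.

```latex
% Assumed notation: y_i = Sales, h_i = Hours, x_i = covariate vector for group i
\begin{align*}
\log \mu_i &= \log h_i + x_i^\top \beta, \qquad \mu_i = \mathbb{E}[y_i], \\
\operatorname{Var}(y_i) &= \phi \, V(\mu_i), \qquad V(\mu) = \mu \ \text{for the Poisson family}, \\
\hat{\phi} &= \frac{1}{n - p} \sum_{i=1}^{n} \frac{(y_i - \hat{\mu}_i)^2}{V(\hat{\mu}_i)}.
\end{align*}
```

Here $\log h_i$ enters as an offset, so the coefficients describe sales per hour worked. A fixed $\phi = 1$ corresponds to the standard Poisson model, while an estimated $\hat{\phi}$ well above 1 suggests overdispersion relative to that model.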
Task 2: Summary on a Page (SOAP) (30 marks total)
Produce a 1 page summary for your CEO that addresses their two queries directly. This must include at least
one plot. Utilise graphics to make your points clear wherever possible. Some considerations:
Nominate the methods used but do not describe them in detail.
Base your assertions and recommendations on evidence from your analysis.
Do not present the effect of a covariate without communicating the uncertainty around that effect. State
confidence intervals and show confidence bounds on plots.
Be concise. Dot points are appropriate.
This is not the work, it is the communication of your work in Task 1 for a non-expert. You could also think
of this like an advertisement of your work. In the real world, people are unlikely to look at the work if the
advertisement isn’t clear and engaging.
I would encourage you to use Rmarkdown for this document; however, docx, html and pdf are also acceptable.
Submission
Submission of this assessment will be electronic via Canvas. Please note that this assessment item is
due at 11.59pm Friday, Week 8, 2023. The standard 48-hour extension applies to this assessment.
Submission Format
Please submit a single compressed .zip file. Keeping your submission neat and tidy will assist in grading.
Create a README.txt file if you need to communicate specific usage instructions. Ideally your repository will
contain only:
1. Your analysis in Rmarkdown form, e.g. Sales_Analysis_2023.Rmd
2. The knitted version of the Rmarkdown file, e.g. Sales_Analysis_2023.html or Sales_Analysis_2023.pdf
3. Your SOAP, e.g. sales_SOAP_2023.docx or sales_SOAP_2023.pdf
4. The data file, salesdata.csv, which you got from Canvas.
5. Your README.txt file. (Optional)
Note: The Rmarkdown file should be reproducible and complete. That is, your grader should be able to knit
your *.Rmd file to reproduce the knitted document (either *.html or *.pdf ) that you provided. This means
that your final submission code must not include any generative AI API calls, even if you used them in the
development of your final solution (See Academic Integrity section for more on usage of generative AI).
Academic Integrity
The tasks in this project are to be completed individually. You must not share your working, solutions, or
code with your peers (this includes via the Slack channel). Your submission must be your own work. You are
not permitted to copy, summarise, or paraphrase the work of others in your submission. Please see QUT’s
Academic Integrity Guidelines on Canvas.
Use of Generative AI
In this project, you are authorised to make use of generative AI technologies (such as ChatGPT) if you wish
to. However, you are completely responsible for the correctness and quality of the code, the analysis, and any
conclusions you draw.
You must not upload the data file salesdata.csv, this project description document
MXN600-Minor-Project-Description-2023.pdf, or any verbatim portion of either to a generative AI tool.
In addition, if you choose to use generative AI, then there are two extra documents you must submit with the
main submission files described above (see Submission Format). These extra files are:
1. A completed and signed declaration form (available on Canvas) that indicates the AI system used, and
for which parts of the project it was used. The form includes a declaration you must sign to ensure you
understand that you are responsible for the correctness and quality of the work produced by the AI.
2. A document that captures the prompts and output dialogue between you and the AI. If you are using API
calls to interact with the Generative AI system, please include the code you used (but please remove
your API Keys etc). Also discuss how you needed to modify the AI output to make it useful.
Note: If you have access to and decide to use a paid version of a generative AI tool, you are responsible for
any costs incurred.
Some things to consider in relation to your use of AI:
1. Your choice to use generative AI is not considered when assessing your grades. Your grades will be
determined based on the correctness and quality of the work alone.
2. The choice to use generative AI should be based on how it supports your learning.
3. Be aware of the limitations of generative AI tools: they do not have a concept of truth in their responses,
they can be biased, and they can introduce errors. Please proofread and edit text responses and carefully test
any code responses from these tools.
4. There are many unresolved ethical and legal questions still unfolding with regard to this technology. As
data scientists/analysts, we should be aware of these issues. To name a few:
i. the carbon-footprint of training/running these systems is very high;
ii. low-paid human workers are used to evaluate/label outputs for supervised model training;
iii. there are substantial challenges of implementing safety guardrails or alignment;
iv. there are many questions around the use of copyrighted material without
consent/attribution/commission to the author;
v. there are also a number of privacy concerns around the use of user data in some of these
systems.