ACF5320 – Semester 1, 2025 – Assignment 2

ASSESSMENT TASK: Assignment 2
WEIGHTING: 30%
COMPLETION: Individual
GENERATIVE AI: Generative AI tools can be used in this assessment task. In this assessment, you can use generative artificial intelligence (AI) to generate the specified content in relation to the assessment task. This material must be acknowledged and recorded in your declaration of AI use.
DUE DATE: 11:55pm, Wednesday, 9 April 2025

OVERVIEW

In this assignment, you are tasked with conducting regression analysis on multiple datasets provided in Excel format. The assignment is structured around four key cases, each requiring you to apply regression techniques to predict outcomes based on various independent variables. This exercise aims to assess your proficiency in predictive modelling, data analysis, and the interpretation of results within a business analytics context.

• In the Decision case, using the "Decision.xlsx" dataset, you will analyse the impact of experience on decision-making quality among auditors, examining how it correlates with intelligence, thinking styles, and personality traits.
• The Haircut case requires you to explore the "Haircut.xlsx" database to determine the factors that significantly influence a company's revenue, employing regression analysis to identify these key predictors.
• For the Audit scenario, with the "Audit.xlsx" dataset, you are to investigate the relationship between audit delay and various descriptive variables, focusing on developing a regression model that can accurately predict delay durations.
• The Prescription Cost Analysis involves the "Prescription.xlsx" dataset, where you will model and predict drug costs based on a set of independent variables, enhancing your model's accuracy through iterative refinement.

Your submission should demonstrate a thorough understanding of regression analysis as applied to predictive analytics.
This includes not only the technical execution of statistical tests but also the ability to interpret and communicate the significance of your findings clearly and concisely. Through this assignment, you will showcase your capability to use Excel for predictive modelling and to derive actionable insights from complex datasets.

OBJECTIVES

• Understand and apply regression analysis techniques.
• Analyse relationships between dependent and independent variables.
• Interpret and evaluate regression model outputs.
• Develop predictive models based on the analysis.
• Communicate analytical findings effectively.

SUBMISSION REQUIREMENTS

Type your responses in an MS Word document and submit it to Moodle. Cut and paste any relevant output from Excel into your Word document. You do not need to clean the data; do not delete any data.

Case 1: Decision (10 marks)

Using the “Decision.xlsx” dataset, analyse differences between experienced and inexperienced participants.

(1.1) Do experienced and inexperienced auditors differ in the quality of their decisions (i.e., the Decision variable)? Cut and paste relevant statistics from Excel and explain the statistics. (4 marks)

(1.2) Do experienced and inexperienced participants differ on any of the intelligence, thinking style, or personality trait variables? Identify the variables that differ. Cut and paste the relevant statistics from Excel and explain them (only for the variables that differ). (4 marks)

(1.3) Without using the language of statistics, what do you conclude about experienced versus inexperienced auditors? (2 marks)

Decision data description

Participants consist of auditors and students. Auditors are considered experienced and students are considered inexperienced.

Variable: Definition
ID: Participant identification number.
Decision: Higher values indicate better performance on a task requiring professional judgment.
WPT: Number of questions correctly answered on the Wonderlic Personnel Test, an IQ test. Higher scores indicate higher IQs.
FFM_agree: Response to the measures of the agreeableness factor in the Five Factor Model.
FFM_cons: Response to the measures of the conscientiousness factor in the Five Factor Model.
FFM_ES: Response to the measures of the emotional stability factor in the Five Factor Model.
FFM_extra: Response to the measures of the extraversion factor in the Five Factor Model.
FFM_open: Response to the measures of the openness factor in the Five Factor Model.
Exp dummy: 0 = inexperienced, 1 = experienced

Case 2: Haircut (5 marks)

Use the “Haircut.xlsx” database to run regression models that explain the factors that significantly influence revenue at this company.

(2.1) Report and interpret your best model’s technical details. Cut and paste the relevant statistics from Excel and explain the statistics. (2 marks)

(2.2) Do you believe that your model is effective for explaining changes in revenue? Explain and justify your response. (2 marks)

(2.3) Explain in plain language the meaning of your findings. (1 mark)

Haircut data description

You have been provided an Excel file that contains 4 data items. Each row represents the data for one haircut at a business that operates in two countries. The business does not take appointments. Customers walk in and wait for a haircut.
Variable: Definition
Wait_time: The number of minutes the customer waited for the haircut
Chair_time: The number of minutes needed to complete the haircut
Revenue: Revenue generated from the haircut
Labour_cost: Cost of labour for the haircut
Country: Dummy variable for country 1 and country 2

Case 3: Prescription Cost Analysis (15 marks)

Assume that you are working for a government agency that is trying to determine the main causes of different drug costs for different patients. You have data (“Prescription.xlsx”) from six months of drug prescriptions. You need to model and predict drug costs. The appendix shows descriptions of the data.

(4.1) Assume that we are using this model: (3 marks)

GrossDrugCost = B0 + B1 * RiskScore + ε

i. Interpret the coefficient and the p-value for the RiskScore variable. Provide a practical explanation of the RiskScore variable for senior management. (1 mark)
ii. Explain what R-squared means statistically and provide a practical explanation of it for senior management. (1 mark)
iii. A coworker wants to know what the predicted gross drug costs would be for a new member. The new member is a 73-year-old man whom the government classifies as frail, with a risk score of 510. Using the model above, what would you predict the gross drug costs to be? (1 mark)

(4.2) Assume we are using this model: (8 marks)

GrossDrugCost = B0 + B1 * RiskScore + B2 * Age + B3 * Gender + ε

iv. Provide a statistical interpretation of the coefficient and p-value for the gender variable. Provide a practical explanation of the information for senior management. (1 mark)
v. Provide a statistical interpretation of the coefficient and p-value for the age variable. Provide a practical explanation of the information for senior management. (1 mark)
vi. Provide a statistical interpretation of this model’s intercept. Provide a practical explanation of the information for senior management. (1 mark)
vii. Compare the adjusted R-squared values of Models 1 and 2. Are they the same or different? Why? What can you conclude from the difference (if any) in the adjusted R-squared values? (2 marks)
viii. Senior management wants to know the expected gross drug costs of the average customer. That is, for the median values of RiskScore, Age and Gender, what would you expect the average gross drug costs to be? (2 marks)
ix. A coworker wants to know what the predicted gross drug costs would be for a new member. The new member is a 73-year-old whom the government classifies as frail, with a risk score of 510. Using the model above, what would you predict the gross drug costs to be if the member were a man, and if the member were a woman? (1 mark)

(4.3) Create a better model (4 marks)

x. Develop a better regression model to predict gross drug costs. (2 marks)
xi. What did you learn from this model that previous models did not tell you? (2 marks)

Variables: Definition
RecordID: Primary key from the database; a unique number for each row
MemberID: A unique ID for each different member
Month: The month to which the data pertains, listed in numeric format as 1 for January, 2 for February, etc.
GrossDrugCost: The total amount of drug costs incurred by a member during the corresponding month
NLISDummy: A dummy variable that takes the value of 1 if the member is listed as non-low income by the government and 0 otherwise
LISCHOSERDummy: A dummy variable that takes the value of 1 if the member chose a specific plan and 0 if the member automatically was assigned a plan, i.e., members automatically are assigned (thus, LISCHOSERDummy
RiskScore: A score assigned by the government based on previous government data indicating how sick someone is; higher scores indicate members are sicker
SpecialtyDummy: A dummy variable that takes the value of 1 if the member utilizes specialty drugs and 0 otherwise
AdjudicationDays: The number of non-holiday workdays in a month
Age
Gender: A dummy variable that takes the value of 1 if the member is female and 0 if the member is male
FrailtyDummy: A dummy variable that takes the value of 1 if the government indicates the member is frail and 0 if the government indicates the member is not frail
HospiceDummy: A dummy variable that takes the value of 1 if the member is receiving hospice care and 0 if they are not
InstitutionDummy: A dummy variable that takes the value of 1 if the member is receiving institutionalized long-term care (e.g., hospital, nursing facility) and 0 if they are not
ESRDDummy: A dummy variable that takes the value of 1 if the member is receiving care for end-stage renal disease (i.e., end-stage kidney disease) and 0 if they are not

SUBMISSION DOCUMENT

An MS Word file with the answers to all assignment questions, supported by screenshots of Excel output (where relevant). The submitted file should contain the student’s name, surname, and student ID.
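
The Case 3 prediction questions ask for a value obtained by substituting a member's characteristics into a fitted regression equation of the form GrossDrugCost = B0 + B1 * RiskScore. As a rough illustration of that arithmetic only, the sketch below fits a one-variable least-squares line and then predicts for a new observation. It is written in Python purely for illustration (the assignment itself is to be completed in Excel), the function names are hypothetical, and the numbers are made-up toy values that are not taken from "Prescription.xlsx".

```python
# Minimal sketch of simple linear regression: fit y = b0 + b1 * x, then predict.
# ASSUMPTION: the data below are invented toy values for illustration only;
# they are NOT drawn from Prescription.xlsx and imply nothing about its results.


def fit_simple_ols(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # The fitted line passes through the point (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict(b0, b1, x_new):
    """Substitute a new x value into the fitted equation."""
    return b0 + b1 * x_new


# Toy data lying exactly on the line y = 50 + 1.0 * x.
risk_score = [100, 200, 300, 400, 500]
gross_drug_cost = [150.0, 250.0, 350.0, 450.0, 550.0]

b0, b1 = fit_simple_ols(risk_score, gross_drug_cost)
predicted = predict(b0, b1, 510)
print(f"intercept={b0:.2f}, slope={b1:.2f}, prediction at 510={predicted:.2f}")
```

With more than one independent variable the same substitution applies: each coefficient is multiplied by the member's value for that variable, and the products are added to the intercept.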