python代写-MGMTMSA 408|学霸联盟

python代写-MGMTMSA 408

时间：2021-05-03

MGMTMSA 408 – Operations Analytics
Homework 2 – Assortment Optimization and Newsvendor Problem (Question Sheet)
Due May 8, 2021 at 12:00pm PST
1 Designing a Sushi Menu
A high-end sushi restaurant is trying to decide what types of sushi to offer to its customers.
The restaurant surveys K = 500 of its customers to obtain ratings of n = 100 different types
of sushi. The ratings of the 500 customers for the 100 types of sushis are stored in the file
sushi utilities mat.csv (rows are sushis, columns are customers). In addition, the restaurant
also has additional information on the sushis: the file sushi info.csv contains the name of each
of the 100 sushi types (column 1), a category code for each sushi (column 2) and a price for each
sushi (column 3). You may also find it helpful to consult the file sushi description.txt which
includes the name and a short description of each sushi.
We will assume that customers choose according to a first-choice model of choice, and that the
utility of each option is the rating provided in sushi utilities mat.csv. Namely, each customer
considers the available options, evaluates the utility of each option, and then selects the option
that provides the highest utility. We will assume that the utility of the no-purchase option of each
customer type is 3. So for example, suppose we offer sushi #2, #7 and #12. Let’s fix customer
#7. Customer #7 evaluates the utility of each of those options:
u7,2 = 3.985 (utility of sushi #2 (= anago / sea eel))
u7,7 = 3.976 (utility of sushi #7 (= ikura / salmon roe))
u7,12 = 3.530 (utility of sushi #12 (= hotategai / scallop))
u7,0 = 3 (utility of no-purchase / outside option)
If these are the options, customer #7 will choose sushi #2 (sea eel) because this provides the highest
utility.
In our model of the customer choice behavior, we assume each customer selects his/her highest
utility option. We also assume that each customer has an equal weight/probability (1/K). There-
fore, if we offer the set S ⊆ {1, . . . , n}, then the predicted revenue is the average of the revenue
from the choice of each of the 500 customers: mathematically, this is
R(S) =
1
K
K∑
k=1
[∑
i∈S
ri · I{i is the first choice of customer k in S ∪ {0}}
]
where I{A} is the indicator function for the event A (it is 1 if A is true, and 0 if A is false) and the
first choice is that choice which provides the highest value of uk,i, i.e., it is arg maxi∈S∪{0} uk,i.
1
Part 1: Understanding the data
Before we start, let’s understand some patterns in the data. Load the file sushi utilities mat.csv.
a) What are the five most preferred sushis for customer 1? Hint : use the command argsort from
numpy on a particular row of the utility matrix.
b) What are the five least preferred sushis for customer 2?
c) For each customer, compute the rank of each of the 100 sushis according to their utilities. Which
sushis are the top five, i.e., the five with the best average rank over the 500 customers? What
do you notice about the sushi (specifically, the type of fish)?
Hint : Consider the following code snippet:
import numpy as np
temp = np.array([4.8, 2.3, 5.3, 3.9, 1.4, 5.1])
# temp contains the (hypothetical) utilities of 6 products
# Apply argsort once:
temp2 = np.argsort(temp) # What does temp2 contain?
temp3 = np.argsort(temp2) # What does temp3 contain?
d) Which sushi has the worst average rank over the 500 customers?
e) Which sushi is the most controversial, as measured by the standard deviation of its rank over
the 500 customers?
Part 2: Common-sense solutions
Load the data into Python. Create a function that computes the expected per-customer revenue
of an assortment of sushi items S, where each customer is picking the option that gives them the
highest utility. As a check, if we offer the set of sushis S = {1, 2, 3, 4, 5}, then the per-customer
revenue should be 14.2396. (Note: for some customers, it will be the case that all of the sushis
will have a utility uk,i lower than the no-purchase utility uk,0; these customers will never choose
any sushi we offer them. Customers #1, 2 and 5 are example of this. Despite these customers not
choosing anything, please continue to include them in the K of 500 for your revenue calculations.)
a) Suppose that we simply offer all sushi products, that is, we set S = {1, 2, . . . , n}. What is the
expected revenue in this case?
b) Suppose that we offer the ten highest revenue sushis, that is, we set S = {i1, . . . , i10}, where
ri1 ≥ ri2 ≥ · · · ≥ rin . Which are the ten highest revenue sushis? What is the expected revenue
in this case?
c) Suppose that for every customer k, we determine his/her most preferred sushi, i∗k = arg max1≤i≤n uk,i.
We then offer all of the most preferred sushis, that is, we set S = {i∗1, i∗2, . . . , i∗K}. (Note that
some sushis may be the top choice of more than one customer.)
What is the expected revenue in this case? What do you notice about this value?
d) Explain why the solution in (a) may be suboptimal.
e) Explain why the solution in (b) may be suboptimal.
Page 2
Part 3: An integer optimization model
Let’s see how to formulate the problem of selecting an optimal set of sushis as an integer optimization
problem.
Consider the following integer optimization model:
maximize
x,y
1
K
K∑
k=1
n∑
i=1
riyk,i (1a)
subject to
n∑
i=0
yk,i = 1, ∀ k ∈ {1, . . . , K}, (1b)
n∑
j=0
uk,jyk,j ≥ uk,ixi + uk,0(1− xi), ∀ k ∈ {1, . . . , K}, i ∈ {1, . . . , n}, (1c)
n∑
j=0
uk,jyk,j ≥ uk,0, ∀ k ∈ {1, . . . , K}, (1d)
yk,i ≤ xi, ∀ k ∈ {1, . . . , K}, i ∈ {1, . . . , n}, (1e)
xi ∈ {0, 1}, ∀ i ∈ {1, . . . , n}, (1f)
yk,i ∈ {0, 1}, ∀ k ∈ {1, . . . , K}, i ∈ {1, . . . , n}, (1g)
a) Explain how constraints (1b) – (1e) correctly model the preferences of each customer.
b) Implement the linear optimization relaxation of the above problem in Python. The relaxation is
obtained by relaxing x and y to be continuous decision variables as opposed to binary decision
variables – that is, we replace constraints (1f) and (1g) with:
0 ≤ xi ≤ 1, ∀ i ∈ {1, . . . , n}, (2)
0 ≤ yk,i ≤ 1, ∀ k ∈ {1, . . . , K}, i ∈ {1, . . . , n}. (3)
What is the optimal objective value of the relaxation?
c) A manager claims that it should be possible to obtain an assortment with an expected per-
customer revenue of $32. Explain why this is impossible based on your answer to (b).
d) Now, implement the integer version of the above problem (i.e., the binary constraints (1f) and
(1g) are enforced) in Python. Solve the problem. What is the expected per-customer revenue of
the optimal assortment? How much does this improve over the assortments from Part 2?
e) What is the optimal set of sushis the restaurant should offer?
Page 3
2 Hospital Staffing
A large hospital faces uncertainty in the number of daily emergency department (ED) cases. Each
ED case that occurs requires a single nurse. Ideally, the hospital would be able to handle such an
ED case with a “regular” nurse, who is compensated at $2160 per day. However, if the number of
ED cases exceeds the number of regular nurses scheduled, the hospital can call in agency nurses,
who are employed outside of the hospital, who are each compensated at $5400 per day.
The hospital needs to decide how many regular nurses to staff for each day, so as to minimize
its daily expected cost. In order to do so, a data set (nurse.csv) is available spanning 300 days
and with the following variables:
Variable Description
Day Number between 1 and 300 indicating the day
DailyED Number of ED cases on the given day
IsWeekend 0/1 variable to indicate whether the day is a weekend
IsHoliday 0/1 variable to indicate whether the day is a public holiday
DailyEDLag1 Number of ED cases one day ago
DailyEDLag2 Number of ED cases two days ago
TotalPriorSurgeries Total number of surgeries over the preceding week
Part 1: Formulating a Basic Newsvendor Model
Consider the problem of deciding a nurse staffing level without incorporating the contextual infor-
mation.
a) As a warmup, suppose that on a given day, there are 12 ED cases, and we have staffed 18 regular
nurses. What would be the staffing cost incurred by the hospital?
b) Now, suppose that there are 8 ED cases, and we have staffed 5 regular nurses. What would be
the staffing cost incurred by the hospital in this case?
c) Formulate the problem of deciding the daily regular nurse staffing level so as to minimize daily
expected cost as a cost-based newsvendor problem. What does the “order quantity” Q cor-
respond to? What does the “demand” D correspond to? What are the overage cost co and
underage cost cu?
Part 2: Solving the Basic Newsvendor Model
For this part of the problem, we will use the data to determine a context-free nurse staffing level,
i.e., a staffing level that does not incorporate the contextual information. Split the data into a
training set and a test set, so that the training set consists of the first 200 days of data, and the
test set consists of the last 100 days of data.
a) Based on the overage and underage costs from Part 1, what quantile (i.e., a probability between
0 and 1) of the empirical distribution of DailyED should the optimal staffing level correspond
to?
b) Based on the empirical distribution of DailyED in the training set, determine the optimal staffing
level. You may round your answer to the nearest integer, if necessary.
Page 4
c) Based on the training set, what is the average cost, where the average is taken over the 200 days
in the training set, that this staffing level would incur?
d) Based on the test set, what is the average cost, where the average is taken over the 100 days in
the test set, that this staffing level would incur?
Part 3: Solving the Contextual Newsvendor Model
Using the same training and test set as Part 2, we will now develop a contextual newsvendor model.
Using the training set, build a regression tree using the sklearn.tree.DecisionTreeRegressor
function to predict DailyED. Be sure to omit the variable Day from your independent variables. Set
the maximum depth parameter (max depth) of your tree to 2.
a) What variables does your tree split on?
b) For each leaf of the tree, calculate the optimal staffing level (rounded to the nearest integer)
for the corresponding conditional distribution of DailyED given the training set. What is the
optimal staffing level for each leaf of your tree?
c) Based on the training set, what is the average cost, where the average is taken over the 200 days
in the training set, that this staffing rule would incur?
d) Based on the test set, what is the average cost, where the average is taken over the 100 days in
the test set, that this staffing rule would incur?
Page 5

学霸联盟