PM410 Customer Analytics ghostwriting | 学霸联盟

Date: 2023-04-15

4/11/23, 5:38 PM 410 Customer Analytics
https://internal.anderson.ucla.edu/faculty/anand.bodapati/410/product-optimization-exercise.htm 1/9
Product Optimization Exercise:
Because this HW is challenging, there are two policy changes as listed below. These apply
to this HW only and not to future homeworks unless declared by me explicitly for those
future homeworks.
You can code in either R or Python. However, I strongly encourage you to do it in
Python because Python is the language favored in most jobs for MSBA students and
it is important to me that you get as much job-oriented preparation as possible in
my course. To incentivize you to work in Python, I will give you bonus points added
to your class participation score if you do this HW in Python rather than R.
HW collaboration is allowed to a limited extent as described here: You may discuss
this HW, solution approaches and Python/R function usage with your classmates.
Joint on-screen viewing of code and results can be done to the extent it helps a
student who is stuck to get unstuck just enough to proceed with the HW on his/her
own. However, there is to be no sharing/transmission of program code or any
written submission because that detracts from learning. Such sharing is a violation
of UCLA's Code of Conduct.
You can use generative AI tools (like ChatGPT, Bing Chat, Bard, GitHub Copilot). If
you do use these, then please be sure to submit details on what you did via Optional
Task 4 so that you get extra credit.
Now to the problem statement for this HW: You are producing beverage mugs and are
trying to identify the best price-feature-vector. Assume the following attributes and
attribute levels:
Price: $30, $10, $5
Time Insulated: 0.5 hrs, 1 hrs, 3 hrs
Capacity: 12 oz, 20 oz, 32 oz
Cleanability: Difficult (7 min), Fair (5 min), Easy (2 min)
Containment: Slosh resistant, Spill resistant, Leak resistant
Brand: A, B, C
Assume the following as the "proposed market scenario", i.e., the scenario with the current
competitors and our proposed candidate.
Incumbents
1: $30, 3 hrs, 20 oz, Clean Easy, Leak Resistant, Brand A
2: $10, 1 hrs, 20 oz, Clean Fair, Spill Resistant, Brand B
Our proposed candidate
3: $30, 1 hrs, 20 oz, Clean Easy, Leak Resistant, Brand C
Assume the following cost structure:
Time Insulated: 0.5 hrs costs $0.5, 1 hrs costs $1, 3 hrs costs $3
Capacity: 12 oz costs $1.00, 20 oz costs $2.6, 32 oz costs $2.8
Cleanability: Difficult (7 min) costs $1, Fair (5 min) costs $2.2, Easy (2 min) costs $3.0
Containment: Slosh resistant costs $0.5, Spill resistant costs $0.8, Leak resistant costs
$1
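As a quick illustration of how this cost table is applied, the minimal Python sketch below looks up the component costs of the proposed candidate (Product 3) and takes the margin to be price minus cost. The dictionary layout and variable names are illustrative assumptions, not part of the course materials.

```python
# Cost of each attribute level, transcribed from the cost structure above.
# Price and Brand do not contribute directly to manufacturing cost.
COST = {
    "time_insulated": {"0.5 hrs": 0.5, "1 hrs": 1.0, "3 hrs": 3.0},
    "capacity": {"12 oz": 1.0, "20 oz": 2.6, "32 oz": 2.8},
    "cleanability": {"Difficult": 1.0, "Fair": 2.2, "Easy": 3.0},
    "containment": {"Slosh resistant": 0.5, "Spill resistant": 0.8, "Leak resistant": 1.0},
}

# Our proposed candidate (Product 3) from the proposed market scenario.
candidate = {
    "time_insulated": "1 hrs",
    "capacity": "20 oz",
    "cleanability": "Easy",
    "containment": "Leak resistant",
}
price = 30.0

# Cost is the sum of the component costs; margin is price minus cost.
cost = sum(COST[attribute][level] for attribute, level in candidate.items())
margin = price - cost
print(round(cost, 2), round(margin, 2))  # 7.6 22.4
```

Expected profit per person then follows as margin times share, once the share has been computed from the preference data.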
You are given data on the preference parameters of 311 consumers in this file:
mugs-preference-parameters-full.xlsx. The CSV version of this file is here:
mugs-preference-parameters-full.csv

Question 1: Using the compensatory rule with logit adjustment: Compute and report
our candidate's share, cost, margin and expected profit per person under the "proposed
market scenario" given above. This question is a strict subset of the next question. If you
have done the next question successfully, then all you need to do for this question is
report the numbers for Product Candidate 45 which corresponds to our proposed
candidate in the "proposed market scenario" given above. Even though this question is a
strict subset of the next question, I am asking for it separately. This is because not all
students may be able to do the next question successfully as it is a larger and more
general computation. Hint to check your answer: The number you should get for
expected profit per person is between 4 and 4.5.
Question 2: Discrete Optimization: Consider each of the three levels for each of the five
attributes and enumerate all the possibilities in lexical order with Price as the leftmost
attribute changing slowest and taking levels sequentially $30, $10, $5, then Time
Insulated as the left-second-most attribute changing second-slowest and taking values
sequentially 0.5 hrs, 1 hrs, 3 hrs and so on. You will have 243 product candidates. (The
lexical order produces indices as shown in this file. Please make sure you list your
products in that exact same order. The lexical order of products is obtained by looping
as shown in the following R code, which closely resembles the Python equivalent:
mugs-products-lexical-order-loop.R.) Again using the compensatory rule, compute and report
the following four columns of numbers for each candidate: the share, the cost, the margin
and the expected profit per person (all under the current competition incumbents). The
table that you submit needs to have all 243 rows. Hint to check your answer: product
candidate 230 has a negative expected profit per customer of between -1.75 and -1.85,
and product candidate 106 has an expected profit per customer of between 0.7 and 0.8.
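The lexical ordering described in Question 2 can be sketched in Python with itertools.product, which varies its rightmost argument fastest and its leftmost argument slowest. This is only a sketch based on the description above (the referenced R file is not reproduced here), so the resulting indices should still be checked against the index file provided. The list names are illustrative.

```python
import itertools

# Attribute levels in the order given in the problem statement.
PRICE = [30, 10, 5]
TIME_INSULATED = [0.5, 1, 3]
CAPACITY = [12, 20, 32]
CLEANABILITY = ["Difficult", "Fair", "Easy"]
CONTAINMENT = ["Slosh resistant", "Spill resistant", "Leak resistant"]

# Price is leftmost and changes slowest; Containment is rightmost and changes fastest.
candidates = list(
    itertools.product(PRICE, TIME_INSULATED, CAPACITY, CLEANABILITY, CONTAINMENT)
)

print(len(candidates))  # 243
# Candidate numbers are 1-based, so Product Candidate 45 is at list position 44.
print(candidates[44])   # (30, 1, 20, 'Easy', 'Leak resistant')
```

Under this ordering, Candidate 45 matches our proposed candidate in the "proposed market scenario", which agrees with the statement in Question 1.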
Question 3: Using the 243-row table you produced in the previous step: Identify the
product with the highest expected profit per person. For this optimal product, list the
values of the five attributes and its share, cost, margin and expected profit per person.
Question 4. According to the pure algorithmic analytical approach, the best product
is the one with the highest expected profit per person (abbreviated here as "EPPP").
However, depending on the objectives of the company, your product manager boss may
not agree that this is the best product. He/she may argue that you need to consider, as the
primary criterion, one or more of the remaining four metrics: Share, Cost, Margin or
Revenue.
1. What would be the business rationale to launch the product with the highest
market share (instead of the product with the highest EPPP)? Respond in 1-5
sentences. In the 243-row table, identify the product with the highest market
share and give the values of the five attributes. To answer this question it may
help to plot EPPP versus share for the 243 products, as shown on
Product-Price Optimization Slide 19.
2. What would be the business rationale to launch the product with the highest
margin (instead of the product with the highest EPPP)? Respond in 1-5
sentences. In the 243-row table, identify the product(s) with the highest
margin and give the values of the five attributes.
3. What would be the business rationale to launch the product with the lowest
cost (instead of the product with the highest EPPP)? Respond in 1-5
sentences. In the 243-row table, identify the product(s) with the lowest cost
and give the values of the five attributes.
4. What would be the business rationale to launch the product with the highest
revenue per person in the market (instead of the product with the highest
EPPP)? Respond in 1-5 sentences. In the 243-row table, identify the product(s)
with the highest revenue per person in the market and give the values of the
five attributes. Note that revenue per person in the market can be computed as
share times price.
The above questions involve compensatory analysis with logit adjustment, for which you
should use a scaling constant value of c=0.0139.
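To make the competing criteria in Question 4 concrete, the sketch below ranks a tiny toy table by each metric. The three rows are made-up numbers chosen purely for illustration; they are not results from the mugs data. Margin is taken as price minus cost, EPPP as share times margin, and revenue per person as share times price.

```python
# Toy rows with invented numbers (NOT computed from the mugs data).
rows = [
    {"candidate": 1, "price": 30.0, "cost": 8.0, "share": 0.10},
    {"candidate": 2, "price": 10.0, "cost": 4.0, "share": 0.40},
    {"candidate": 3, "price": 5.0, "cost": 3.0, "share": 0.60},
]

for r in rows:
    r["margin"] = r["price"] - r["cost"]
    r["eppp"] = r["share"] * r["margin"]      # expected profit per person
    r["revenue"] = r["share"] * r["price"]    # revenue per person in the market

# Best candidate under each criterion (lowest is best for cost, highest otherwise).
best = {
    "eppp": max(rows, key=lambda r: r["eppp"])["candidate"],
    "share": max(rows, key=lambda r: r["share"])["candidate"],
    "margin": max(rows, key=lambda r: r["margin"])["candidate"],
    "cost": min(rows, key=lambda r: r["cost"])["candidate"],
    "revenue": max(rows, key=lambda r: r["revenue"])["candidate"],
}
print(best)
```

Even in this toy table the criteria disagree, which is the situation the four sub-questions ask you to reason about.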
Hints and explanatory notes for Q1 and Q2
Q1 and Q2 involve following steps analogous to those we followed in class in
Product-Price Optimization slides 2 through 18. The Excel files for the in-class exercise are
here: mugs-preference-parameters-limited.xlsx and mugs-analysis-limited.xlsx. The
first file gives the starting point with the preference data, the second file gives the ending
point after the analysis has been completed.
Obviously, to do this HW, one first needs a thorough understanding of the file
"mugs-analysis-limited.xlsx" and how it executes the calculations described on slides 2
through 17. This is something that each student will need to do on his/her own and this
will require fair competence in Excel. Here are some comments that may help you
understand the structure of the "mugs-analysis-limited.xlsx" spreadsheet. Remember
that to calculate the profit of a candidate product we need four critical inputs: (1) A
description of the candidate product and the incumbent products, (2) the preference
parameters, (3) the cost structure, and the dollar values for each price level, and (4) the
"c" value for the logit probability formula. These four inputs appear in pale green color in
cell blocks B2:D5, A20:Q30, P2:P10 and M1 respectively. These are the only inputs to the
model. Every other cell in the spreadsheet is a computation that depends on these four
cell blocks. Let us walk through these cell blocks now:
1. Cell block B2:D5. This gives the level number for each of the four attributes for each
of the three products. The level number indicates which level a certain attribute
takes. The levels are given on Slide 3. The level number for each level is just the
numerical position that level appears in on Slide 3. Therefore, for Price, level=1
means $30, level=2 means $10, level=3 means $5. By similar logic, for Capacity,
level=1 means 12oz, level=2 means 20oz, level=3 means 32oz. The similar logic
applies for Cleanability and Brand as well. Consider the numbers in cells B2:B5 for
Prod1. For Product 1, we see from Slide 3 price=$30, this corresponds to level 1,
and so we enter 1 in cell B2. Continuing with Product 1, it has Capacity=20 oz,
Cleanability=Easy and Brand=A, which correspond to level numbers 2, 3 and 1
respectively and these are the numbers we enter into cells B3, B4 and B5
respectively. Similarly, we enter the level numbers for the attributes for the other
two products.
2. Cell block A20:Q30. This just gives the importances for the various attributes and
the preference measures for each attribute level for each attribute. These are the
raw data given to us.
3. Cell block P2:P10. In cells P2:P4 we enter the dollar value each price level number
corresponds to. In cells P5:P10, we enter the cost contribution for each attribute
level in every attribute other than price or brand (neither of these two contribute
directly to manufacturing cost).
4. Cell block M1. Here we enter the "c" value, which we computed on Slide 14.
Given the above four sets of inputs, the spreadsheet does the following computations
in sequence, closely following the steps in the lecture slides.
(A) Compute the product of importance and preference level for each attribute level.
These are done in cells S20:AD30.
(B) The utility for each product is just the sum of quantities computed in (A) above,
except that we just pick the factor that applies to the particular attribute level that a
certain product has. This is accomplished by doing a matrix multiplication, where we
multiply the matrix obtained in step (A) above by the dummy variable matrix
describing the level taken by each attribute for each product. The dummy variable
matrix is given in cell block H2:J13. Note that this dummy variable matrix is created
by use of a simple IF function based on the attribute levels given in input (1) of the
four inputs. Using the matrix multiplication, the utilities appear in AF21:AH30.
(C) Given the utilities, we multiply by "c" and exponentiate. This is done in cells
AI21:AK30.
(D) We compute the sum of the above exponentials, and this is done in cells
AO21:AO30.
(E) We use the sum to divide each exponential and get the purchase probability.
This is done in cells AP21:AR30.
(F) We average these choice probabilities to give the market share estimates for
each product and this is in cells AP19:AR19.
(G) Now we have to compute the margin of the candidate product. For this we need
to pull the price from the candidate product description and this is done in cell R3.
Then we compute the cost, which is done by looking at the candidate product
description and adding the component costs, which is in cell T3. The margin (price
minus cost) is in cell V3.
(H) The final number we need is Expected Profit per person, which is margin times
share and is computed in cell X3.
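Steps (A) through (H) can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the spreadsheet's exact layout: the array shapes, the function name evaluate, and the synthetic importances and preferences (importances summing to 100, preferences on a 1 to 7 scale) are all assumptions made for illustration. With the real preference file the arrays would be loaded from the data instead, and the synthetic numbers here will not reproduce the hints.

```python
import numpy as np

C = 0.0139  # logit scaling constant given in the assignment

PRICES = np.array([30.0, 10.0, 5.0])  # dollar value of price levels 1..3
# Cost per level; rows are Time Insulated, Capacity, Cleanability, Containment.
COSTS = np.array([
    [0.5, 1.0, 3.0],
    [1.0, 2.6, 2.8],
    [1.0, 2.2, 3.0],
    [0.5, 0.8, 1.0],
])

def evaluate(candidate, incumbents, importances, prefs, c=C):
    """Return (share, cost, margin, expected profit per person) for the candidate.

    Products are tuples of 0-based level indices in the order
    (price, time, capacity, cleanability, containment, brand).
    Assumed shapes: importances (n_consumers, 6), prefs (n_consumers, 6, 3).
    """
    products = np.array(list(incumbents) + [candidate])      # (n_products, 6)
    attr = np.arange(products.shape[1])
    part_worths = importances[:, :, None] * prefs             # step (A)
    utilities = part_worths[:, attr, products].sum(axis=2)    # step (B)
    expu = np.exp(c * utilities)                              # step (C)
    probs = expu / expu.sum(axis=1, keepdims=True)            # steps (D), (E)
    share = probs[:, -1].mean()                               # step (F), candidate is last
    price = PRICES[candidate[0]]
    cost = COSTS[np.arange(4), np.asarray(candidate[1:5])].sum()  # step (G)
    margin = price - cost
    return share, cost, margin, share * margin                # step (H)

# Synthetic stand-in data (assumption): 311 consumers, 6 attributes, 3 levels each.
rng = np.random.default_rng(0)
importances = rng.dirichlet(np.ones(6), size=311) * 100
prefs = rng.integers(1, 8, size=(311, 6, 3)).astype(float)

incumbents = [(0, 2, 1, 2, 2, 0), (1, 1, 1, 1, 1, 1)]  # Products 1 and 2
candidate = (0, 1, 1, 2, 2, 2)                          # Product 3
share, cost, margin, eppp = evaluate(candidate, incumbents, importances, prefs)
print(share, cost, margin, eppp)
```

Indexing part_worths with the level indices plays the same role as the dummy-variable matrix multiplication in the spreadsheet: it picks, for each product, the importance-times-preference term of the level that product actually has.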
Once you understand this sequence of steps, it should be moderately easy to write R
code or python code to execute the tasks required for Q1 and Q2.
Optional Tasks for Extra Credit toward Class Participation
Optional Task 1: Suppose you want to determine the elimination-by-aspects (EBA)
choice of a consumer choosing from among P products, each having A attributes. This is
a non-compensatory choice prediction model. You are given the following data
structures: a P-by-A matrix containing the consumer's rating or performance of each
product on each attribute, a vector of length A containing the consumer's importance of
each attribute, a vector of length A giving the consumer's cutoff for each attribute (we
are considering the general case where it is possible for the consumer to have different
cutoffs for different attributes). Write a function called "apply_eba" that takes these three
data structures as input arguments and produces the elimination-by-aspects choice. So
your function should be defined like the following:

def apply_eba(ratings_matrix, importances_1d_array, cutoffs_1d_array):
Your function should return an integer from the set {1,2,...,P} corresponding to the
product predicted to be purchased by EBA.
In EBA, if the rating of a product on a certain attribute is greater than or equal to the
cutoff for that attribute, then that product is NOT eliminated. If the rating is strictly less
than the cutoff then the product is eliminated.
It is important to note that in the EBA algorithm, one may run into situations where there
are ties or null sets. Your code needs to handle these as follows. When picking the next-
most important attribute, if there is more than one attribute with the highest level of
importance, then pick one of the attributes randomly with equal probability, and proceed
to eliminate products on the basis of that attribute. When eliminating products that fall
below the performance rating cutoff on a certain attribute, if none of the remaining
products meet the cutoff then pick one of those remaining products randomly with equal
probability and take the resultant product to be the final choice of that consumer.
I will give you more points if you write this function without using explicit loops like "for",
"while", "repeat" or recursion. For even more points: Write the function without using
any "if" statement and also without using explicit loops. You can however use implicit
loops like in the "apply" family of functions in R or "map" function of python. Writing the
function will initially seem very complicated. You can write the code using explicit loops
like "for", "while", "repeat" or recursion if you don't want the extra points (beyond the
points you will already get for doing this task), but actually it is easier to write it without
the explicit loops or "if" statements, by using array operations. My own solution program
to this task has only 6 lines of code and does not use explicit loops or the "if" statement.
I mention this to let you know that the code is actually quite simple to write if you think
carefully about EBA and how to structure it as array operations.
Because of the tie-breaking, there is randomness potentially involved so that each run of
the above apply_eba function may give a somewhat different output for the product
chosen. Therefore, in real-life applications, we usually average over multiple runs.
Specifically, we need to run the apply_eba function created above a large number of
times (like 2000) and then aggregate over all runs. For example, if the consumer is seen
to buy product P, Q, R, S, T, respectively, 620, 320, 400, 310, 350 times out of the 2000
runs, then we predict that the consumer's probability of buying P, Q, R, S, T to be
respectively 0.31, 0.16, 0.20, 0.155, 0.175.
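The aggregation over runs amounts to tallying how often each product is chosen and dividing by the number of runs. A minimal sketch follows, assuming the choices are stored as 1-based product numbers. The helper name choice_probabilities is hypothetical, and the choices array below is built directly from the counts in the example rather than from actual apply_eba runs.

```python
import numpy as np

def choice_probabilities(choices, n_products):
    """Convert a sequence of 1-based chosen product numbers into choice probabilities."""
    choices = np.asarray(choices)
    # bincount tallies 0..n_products; drop position 0 because products are 1-based.
    counts = np.bincount(choices, minlength=n_products + 1)[1:]
    return counts / choices.size

# Reconstruct the example: products P, Q, R, S, T (numbered 1..5) chosen
# 620, 320, 400, 310, 350 times out of 2000 runs.
example_counts = np.array([620, 320, 400, 310, 350])
choices = np.repeat(np.arange(1, 6), example_counts)
probs = choice_probabilities(choices, n_products=5)
print(probs)
```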
Demonstrate that your code works correctly on the following two test cases:
Test Case 1: Here P=3 and A=4.
The P-by-A ratings matrix is
1, 7, 7, 7
3, 7, 2, 5
7, 1, 7, 1
The vector of length A containing the importance of each attribute is
[9, 55, 12, 24]
Finally, the vector of length A giving the cutoff for each attribute is
[2, 2, 2, 2]
This test case corresponds to Product-Price Optimization Slides 6 and 7. We already know
from our class discussion that the correct answer is 2.
Test Case 2: Here P=5 and A=7.
The P-by-A ratings matrix is
1, 1, 2, 3, 4, 5, 5
3, 4, 2, 7, 3, 5, 3
2, 4, 3, 4, 6, 4, 3
4, 1, 4, 3, 5, 6, 6
3, 1, 6, 6, 4, 6, 4
The vector of length A containing the importance of each attribute is
[17, 9, 9, 13, 22, 26, 4]
Finally, the vector of length A giving the cutoff for each attribute is
[1.5, 1.5, 3.5, 2.5, 1.5, 2.5, 2.5]
This test case is more challenging because there is tiebreaking and randomness
involved. Hint to verify your answer. When you run your function 2000 times, you
should find that the second product is chosen approximately 25 percent of the time, so
that we predict that the consumer's probability of buying the second product to be
approximately 0.25.
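The elimination-by-aspects rules described for Optional Task 1 can be sketched as below. This is one possible reading, not the instructor's solution: it uses an explicit loop and "if" statements for readability (so it would not earn the extra points for a loop-free version), the optional rng argument is an addition for reproducibility, and picking randomly when several products survive every attribute is an assumption, since the text does not cover that case.

```python
import numpy as np

def apply_eba(ratings_matrix, importances_1d_array, cutoffs_1d_array, rng=None):
    """Return the 1-based index of the product chosen by elimination-by-aspects."""
    rng = np.random.default_rng() if rng is None else rng
    ratings = np.asarray(ratings_matrix, dtype=float)
    importances = np.asarray(importances_1d_array, dtype=float)
    cutoffs = np.asarray(cutoffs_1d_array, dtype=float)

    # Random tie-breaking: shuffle the attributes, then stable-sort by
    # descending importance so equally important attributes stay shuffled.
    order = rng.permutation(importances.size)
    order = order[np.argsort(-importances[order], kind="stable")]

    alive = np.ones(ratings.shape[0], dtype=bool)
    for a in order:
        passed = alive & (ratings[:, a] >= cutoffs[a])
        if not passed.any():
            break  # null set: choose randomly among the products still alive
        alive = passed
        if alive.sum() == 1:
            break
    # Assumption: if several products survive all attributes, pick one at random.
    return int(rng.choice(np.flatnonzero(alive))) + 1

# Test Case 1
ratings1 = [[1, 7, 7, 7], [3, 7, 2, 5], [7, 1, 7, 1]]
print(apply_eba(ratings1, [9, 55, 12, 24], [2, 2, 2, 2]))  # 2

# Test Case 2: estimate choice probabilities over 2000 runs.
ratings2 = [
    [1, 1, 2, 3, 4, 5, 5],
    [3, 4, 2, 7, 3, 5, 3],
    [2, 4, 3, 4, 6, 4, 3],
    [4, 1, 4, 3, 5, 6, 6],
    [3, 1, 6, 6, 4, 6, 4],
]
importances2 = [17, 9, 9, 13, 22, 26, 4]
cutoffs2 = [1.5, 1.5, 3.5, 2.5, 1.5, 2.5, 2.5]
rng = np.random.default_rng(410)
runs = np.array([apply_eba(ratings2, importances2, cutoffs2, rng) for _ in range(2000)])
share_second = float(np.mean(runs == 2))
print(share_second)  # should land near 0.25, per the hint above
```

In Test Case 2 the randomness comes from the two attributes that share an importance of 9: which of them is processed first decides which pair of products survives, and the following attribute then eliminates everything, forcing a random pick within that pair.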
Optional Task 2 on Continuous Optimization: Instead of considering just the three
discrete levels for each of the five attributes, consider all possible intermediate levels.
You do not need to consider levels outside of the range of levels given (e.g., you do not
need to consider prices greater than $30 or less than $5). Treat "Containment" as a
continuous attribute with Slosh Resistant corresponding to level 0, Spill Resistant
corresponding to level 0.5 and Leak Resistant corresponding to level 1. With this setup,
now you will have what is essentially an infinite number of candidate products. For each
such intermediate level not matching the discrete levels provided, compute the
preference level by interpolating between the known levels. Similarly, compute the cost
by interpolating between the known levels. For example, for Time Insulated = 0.75
hours, take the preference level for Consumer 1 to be 2 (interpolating halfway between 1
and 3 because 0.75 is halfway between 0.5 and 1). By similar logic, for Time Insulated =
0.75 hours, the cost would be $0.75. Your task: Use the compensatory model
(interpolation makes programming complicated in the non-compensatory case, and so we
will not consider the non-compensatory model for this question). Identify the product
with the highest value of expected profit per person. For this optimal product, list the
values of the five attributes and its share, price, margin and expected profit per person.
Note that this is a continuous optimization problem and you will have to use an
optimization routine. In R you will need to use something like the "optim" function which
handles bounds and does not need a gradient function to be provided. Warning: you have
to be concerned about local optima and these are to be addressed using multiple starting
values or a stochastic search method. If this task is too complicated for you, you can
consider a simpler version of this task where you are considering a large number
(example d=19) of discrete intermediate values and handle the optimization by discrete
enumeration. The number of candidates will then be (2*d+3)^5 = 115856201 for even a
small number like d=19. A smart way to proceed is to do sequential grid search, starting
with a low value of "d" like d=4 and then examining further discrete values in narrower
range of discrete values, focusing sequentially only near the grid values that are optimal.
In any case, this will involve a lot of computation. In R you can speed up computation by
parallel computing using the "multicore" package or the "parallel" package.
This question involves compensatory analysis with logit adjustment, for which you should
use a scaling constant value of c=0.0139.
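The sequential grid search idea mentioned above can be sketched as follows. The function name, its parameters, and the toy objective are assumptions made for illustration; in the actual task the objective would be the expected profit per person as a function of the five continuous attribute values. Like any grid refinement, this is a heuristic and can settle on a local optimum, so it does not remove the need for multiple starting regions.

```python
import itertools
import numpy as np

def sequential_grid_search(objective, bounds, points_per_axis=5, rounds=8):
    """Maximize objective over a box by repeatedly refining a coarse grid."""
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    cur_lo, cur_hi = lo.copy(), hi.copy()
    best_x, best_val = None, -np.inf
    for _ in range(rounds):
        axes = [np.linspace(l, h, points_per_axis) for l, h in zip(cur_lo, cur_hi)]
        for point in itertools.product(*axes):
            val = objective(np.array(point))
            if val > best_val:
                best_x, best_val = np.array(point), val
        # Narrow the search box to one grid step on each side of the best point,
        # never leaving the original bounds.
        step = (cur_hi - cur_lo) / (points_per_axis - 1)
        cur_lo = np.maximum(lo, best_x - step)
        cur_hi = np.minimum(hi, best_x + step)
    return best_x, best_val

# Toy two-variable objective with a known maximum at (29.2, 0.7); a stand-in only.
def toy_objective(x):
    return -(x[0] - 29.2) ** 2 - (x[1] - 0.7) ** 2

best_x, best_val = sequential_grid_search(toy_objective, [(5.0, 30.0), (0.0, 1.0)])
print(best_x, best_val)
```

With five attributes and five points per axis, each round evaluates 5^5 = 3125 candidates, far fewer than the 115856201 needed for a single fine grid with d=19.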
About interpolation: The exact linear-interpolation value for any input value within the
range of the data is determined by the standard linear interpolation formula, described
on this Wikipedia page. You do not have to write code from scratch because it is already
implemented in R and standard python libraries. In fact, interpolation is so widely used
that there are at least three separate interpolation functions in python:
scipy.interpolate.interp1d, numpy.interp and pandas.DataFrame.interpolate. You can use
whichever you want. The advantage of scipy.interpolate.interp1d over the other two is that it
returns a lambda/function object so it does not repeatedly process the same data and so
should be faster, but its disadvantage is that it is more RAM intensive. For R, interpolation
is built in to the base package, you do not have to load any library. The equivalent of
scipy.interpolate.interp1d in R is "approxfun" which also returns a lambda/function object. The
equivalent of numpy.interp in R is "approx".
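As a small worked example of the interpolation described in Optional Task 2, the sketch below uses numpy.interp on the Time Insulated and Containment cost tables. The preference values 1 and 3 for Consumer 1 are taken from the example in the task text; variable names are illustrative.

```python
import numpy as np

# Time Insulated: known levels (hours) and their costs from the cost structure.
time_levels = [0.5, 1.0, 3.0]
time_costs = [0.5, 1.0, 3.0]
cost_at_075 = float(np.interp(0.75, time_levels, time_costs))
print(cost_at_075)  # 0.75

# Consumer 1's preference at 0.5 hrs and 1 hrs is 1 and 3 in the task's example,
# so the preference at 0.75 hrs is halfway between them.
pref_at_075 = float(np.interp(0.75, [0.5, 1.0], [1.0, 3.0]))
print(pref_at_075)  # 2.0

# Containment treated as continuous: Slosh=0, Spill=0.5, Leak=1.
contain_levels = [0.0, 0.5, 1.0]
contain_costs = [0.5, 0.8, 1.0]
cost_contain = float(np.interp(0.25, contain_levels, contain_costs))
print(round(cost_contain, 2))  # 0.65
```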
Hint on checking your answer: The optimal product has a price between $29.01 and
$29.35 and expected profit per customer between $5.275 and $5.285.
Optional Task 3 for the 2-Product-Line decision: Instead of optimizing for just one
product, one can look for the best PAIR of products to launch simultaneously. We will
allow for the possibility that the second product in the pair is the empty product or NULL
product, so effectively the firm is offering just a single product rather than two distinct
products. For simplicity we will consider the case where we are choosing from just
among the original 243 products. The total number of pairs of products where both
products are non-NULL is 243*(243-1)/2 = 29403. The number of pairs where exactly one of
the products is non-NULL is of course just 243 and these you have already evaluated in
Mandatory Question 2 above. Your goal in this Optional Task is to identify the candidate
pair of products X and Y with the highest total expected profit per person (adding the
expected profit per person for product X and for product Y) using the compensatory rule
with logit adjustment. For this optimal pair of products, list for each product in the pair
its values of the five attributes and its share, price, margin and expected profit per
person. Important note: The best pair is not necessarily the two products in Mandatory Question 2 above
with the two highest values of expected profit per person. To identify the best pair you
need to (i) enumerate each of the 29403 pairs, (ii) for each such pair, labeling the
products as 3 and 4, compute the probabilities of buying products 1, 2, 3 and 4, where
product 1 and 2 are from Brands A and B as before but now there are TWO products, 3
and 4, from Brand C, (iii) for each pair compute the total of expected profit per person
over both 3 and 4 in the pair, (iv) pick the pair with the highest total expected profit per
person.
What to submit: List out the 4 product pairs with the highest total expected profit per
person in decreasing order of total expected profit per person. You can identify each pair
by the candidate index values in the lexical order (the same order used in Question 2).
For each of these four pairs, give the total expected profit per person.
Hint to check your answer: The product pair with the fifth-highest total expected profit
consists of Candidate 36 and Candidate 72, and its total expected profit per person is
between 7.493 and 7.496.
A clever trick to accelerate the process is to skip steps (ii) and (iii) above for a large
fraction of pairs based on the results from Mandatory Question 2. This is because the
profit contribution of a candidate in the three product case is an upper bound for the four
product case. So we can drop from consideration any pair of candidates whose sum in
Mandatory Question 2 is less than the value for the best product of Mandatory Question
2. In fact, we can improve upon this trick by raising this threshold progressively, to the
best pair total found so far, as we iterate through the pairs.
This question involves compensatory analysis with logit adjustment, for which you should
use a scaling constant value of c=0.0139.
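The pruning trick for Optional Task 3 can be sketched as below. The function best_pair and its evaluate_pair argument are hypothetical names: evaluate_pair stands for whatever routine computes the total expected profit per person of a pair in the four-product market. The sketch relies on the bound stated above, namely that each candidate's single-product expected profit is an upper bound on its contribution within a pair. The toy numbers and the toy pair function are invented for illustration only.

```python
import itertools
import numpy as np

def best_pair(single_eppp, evaluate_pair):
    """Search all pairs, skipping those whose upper bound cannot beat the best so far.

    single_eppp: expected profit per person of each candidate alone (Question 2).
    evaluate_pair(i, j): total expected profit per person of the pair, 1-based indices.
    """
    single_eppp = np.asarray(single_eppp, dtype=float)
    # Start from the best single product, i.e. a pair whose second product is NULL.
    best_total = float(single_eppp.max())
    best = (int(single_eppp.argmax()) + 1, None)
    n_evaluated = 0
    for i, j in itertools.combinations(range(single_eppp.size), 2):
        if single_eppp[i] + single_eppp[j] <= best_total:
            continue  # upper bound cannot beat the current best: skip steps (ii), (iii)
        n_evaluated += 1
        total = evaluate_pair(i + 1, j + 1)
        if total > best_total:
            best_total, best = total, (i + 1, j + 1)  # threshold rises progressively
    return best, best_total, n_evaluated

# Toy illustration with invented numbers; the pair function is a stand-in that
# respects the upper bound (pair total never exceeds the sum of single values).
toy_eppp = [4.0, 3.5, 1.0, 0.5]

def toy_pair_total(i, j):
    return 0.7 * (toy_eppp[i - 1] + toy_eppp[j - 1])

pair, total, n_evaluated = best_pair(toy_eppp, toy_pair_total)
print(pair, total, n_evaluated)

# Number of pairs with both products non-NULL among 243 candidates.
n_pairs = sum(1 for _ in itertools.combinations(range(243), 2))
print(n_pairs)  # 29403
```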
Optional Task 4 on using generative AI tools to produce code: For each of the HW
problems listed below, use a generative AI tool to produce python or R code. The code
should be able to solve the HW problem exactly, it should not be code that solves some
different, though related, problem.
1. Q1.1 and Q1.2 of the Bass Noise-Robust Estimation Exercise
2. Q2 (see above) of the current HW
3. Optional Task 1 (see above) of the current HW
For each of the above HW problems, submit the following in a PDF file: (A) The prompt or
sequence of prompts that you entered into the generative tool, (B) an identification of
which sequence of prompts was entered into which generative tool (like ChatGPT, Bing
Chat, Bard, GitHub Copilot), (C) generative tool's output from each prompt, identifying
which output corresponds to which prompt and which generative tool.
What and Where to Submit
For Q1 through Q4 and also Optional Task 1, submit the following:
1. A PDF file containing your responses. Please note that this file should be PDF only,
not a Word document for example.
2. Your R or python code in the zip file form I specified for the previous HW.
3. A listing of the 243 products from Q2 along with the four metrics (the share, the
price, the margin and the expected profit per person) as a CSV file that is separate
from the PDF file.
Submission links for Q1 through Q4 and Optional Task 1:
Submission Link for the 12:40pm section
Submission Link for the 4:10pm section
If you plan to respond to Optional Task 2 or Optional Task 3, then submit the following:
1. A PDF file containing your responses. Please note that this file should be PDF only,
not a Word document for example.
2. Your R or python code in the zip file form I specified for the previous HW.
Submission link for Optional Tasks 2 and 3:
Submission Link for the 12:40pm section
Submission Link for the 4:10pm section
If you plan to respond to Optional Task 4, then submit the PDF file with the contents
specified in that task.
Submission link for Optional Task 4:
Submission Link for the 12:40pm section
Submission Link for the 4:10pm section