Customer Analytics (Practice) Final Exam – 70 minutes BU.450.760.K5
NO COMMUNICATION WITH OTHERS IS PERMITTED
The Johns Hopkins Carey Business School
Honor Code
The Carey Business School measures success by the way a Carey graduate stands out as an
innovative business leader and exemplary citizen. The Carey community believes that honesty,
integrity, and community responsibility are qualities inherent in an exemplary citizen. The objective
of the Carey Business School Honor Code is to create an environment of trust among all members
of the academic community while the qualities associated with success are developed in students.
The Honor Code requires that each student act with honesty and integrity in all academic and co-
curricular activities and that each student endeavor to hold his or her peers to the same standard.
Upon witnessing an alleged violation of the Honor Code, a student is expected to inform either the
responsible faculty member or the Honor Council of both the alleged violation and the name of the
student accused of committing the alleged violation. Each member of the Carey community, as a
person of integrity, has a personal obligation to adhere to this requirement. It is only by upholding
the Honor Code that members of the entire Carey community can contribute to the School’s ability
to maintain its high standards and its reputation.
Violations of this agreement are viewed as serious matters that are subject to disciplinary sanctions
imposed by the Honor Council of the Carey Business School, which is composed of a fair
representation of part-time and full-time MBA, MS, BS and BBA students and faculty members.
INSTRUCTIONS
• No interpersonal communication.
• To answer questions, make assumptions if necessary.
• Fill in your answers in the “EXAM_ANSWERS.docx” template. Do not exceed the allotted
number of lines.
• Save your work continuously. Make sure you upload the correct file and that the upload is
successful.
• Submit the file with answers via the “Final exam” link in the assignments tab in Blackboard.
This link expires 2 minutes after the due time. If the link has expired, email your submission
to the instructor (jzliu@jhu.edu). Late submissions face a per-minute point penalty.
1. [8 points] Consider the following sample corpus from Yelp. Each row (review) is a
document. Assume the list of stopwords = c(“so”, “or”, “when”, “and”, “the”) and that
non-words include white space, punctuation, numbers, and symbols (e.g., $).
[1] All the food is great here. But the best thing they have is their wings. Their wings are simply fantastic!!
[2] This place is truly a Yinzer's dream!! \"Pittsburgh Dad\" would love this place n'at!!
[3] Wing sauce is like water. Pretty much a lot of butter and some hot sauce (franks red hot maybe).
[4] The whole wings are good size and crispy, but for $1 a wing the sauce could be better.
[5] The fish sandwich is good and is a large portion, sides are decent.
(1) [4 points] After removing non-words and stopwords, what is the TF-IDF score of the
term “good” in doc 4?
(2) [4 points] If we use this corpus to predict restaurants’ survival rate, is there a “wide X”
problem? Why or why not? Please explain.
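The two parts of this question can be checked numerically. The sketch below is only one possible reading: it assumes TF-IDF means the raw term count in the document times log10(N / document frequency), and that tokens are lowercased alphabetic words. Other conventions (natural log, length-normalized term frequency) give different numbers. The helper names `tokenize` and `tf_idf` are hypothetical.

```python
import math
import re

# The five reviews from the question, in order (doc 4 is index 3).
DOCS = [
    "All the food is great here. But the best thing they have is their wings. Their wings are simply fantastic!!",
    "This place is truly a Yinzer's dream!! \"Pittsburgh Dad\" would love this place n'at!!",
    "Wing sauce is like water. Pretty much a lot of butter and some hot sauce (franks red hot maybe).",
    "The whole wings are good size and crispy, but for $1 a wing the sauce could be better.",
    "The fish sandwich is good and is a large portion, sides are decent.",
]
STOPWORDS = {"so", "or", "when", "and", "the"}


def tokenize(text):
    """Lowercase, keep alphabetic words (inner apostrophes allowed), drop stopwords."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return [w for w in words if w not in STOPWORDS]


TOKENS = [tokenize(d) for d in DOCS]


def tf_idf(term, doc_index):
    """Assumed weighting: raw count in the document times log10(N / document frequency)."""
    tf = TOKENS[doc_index].count(term)
    df = sum(term in toks for toks in TOKENS)
    return tf * math.log10(len(TOKENS) / df) if df else 0.0


# Part (1): "good" occurs once in doc 4 and appears in 2 of the 5 documents.
score_good_doc4 = tf_idf("good", 3)
print(round(score_good_doc4, 4))

# Part (2): compare the number of distinct terms (columns of X) with the number of documents (rows).
n_docs = len(TOKENS)
n_terms = len({w for toks in TOKENS for w in toks})
print(n_terms, "distinct terms vs", n_docs, "documents")
```

Under these assumptions the score is log10(5/2), and the vocabulary has far more distinct terms than there are documents, which is the situation a “wide X” problem describes.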
2. [8 points] Suppose that you work for the marketing division of the athletic apparel
company Reebok. You are now discussing the allocation of advertising dollars. In a
meeting, the chart shown below is presented. This chart describes the relationship between
the number of times Facebook users see an ad for Reebok shoes (horizontal axis) and the
probability that users will purchase a pair of Reebok shoes after clicking on the link
(vertical axis). Your colleague presents this figure in a meeting, arguing that it provides
“undisputable evidence that advertising on Facebook pays-off” and that the company
should “probably increase the number of advertising dollars in this platform.” Do you
agree? Explain your argument.
3. [8 points] The blue line in the graph below represents the outcomes of a set of units
affected by a shock (i.e., an “as if” natural experiment) that unfolded in week 60 of the dataset
at hand. To evaluate the impact of this shock on treated units, we would like to implement a
diff-in-diff analysis, which requires us to select a control series. Our data contains two
candidate control series, controls #1 and #2, respectively shown by the red and green lines.
Which of these two controls would you select to implement the diff-in-diff analysis? Justify
your answer.
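One way to make the comparison of candidate controls concrete is to measure how stable the treated-minus-control gap is before the shock, since diff-in-diff relies on parallel pre-shock trends. The sketch below uses made-up series, not the data in the graph, and the helper name `pre_shock_gap_spread` is hypothetical.

```python
from statistics import pstdev


def pre_shock_gap_spread(treated, control, shock_index):
    """Standard deviation of the treated-minus-control gap over the pre-shock periods.

    A value near zero means the two series moved in parallel before the shock.
    """
    gaps = [t - c for t, c in zip(treated[:shock_index], control[:shock_index])]
    return pstdev(gaps)


# Hypothetical series for illustration only; the shock hits at index 4.
treated = [10, 11, 12, 13, 20, 21]
control_a = [8, 9, 10, 11, 12, 13]    # constant gap of 2 before the shock
control_b = [10, 10, 13, 11, 12, 15]  # gap moves around before the shock
shock_index = 4

spread_a = pre_shock_gap_spread(treated, control_a, shock_index)
spread_b = pre_shock_gap_spread(treated, control_b, shock_index)
print(spread_a, spread_b)
```

In this made-up example the first control tracks the treated series more closely before the shock, so it would be the more credible choice. The same check can be applied to the two control series in the dataset.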
4. [6 points] The figure below represents the stylized impact of a shock (i.e., a natural
experiment) on a set of treated units (in red). Blue markers represent the outcomes that are
observed for an (adequate) control.
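What are the implied values for parameters \(\alpha\), \(\beta\), \(\gamma\), \(\delta\) in the diff-in-diff equation below?
\[ Y_{it} = \alpha + \beta \, Treated_i + \gamma \, Post_t + \delta \, (Treated_i \times Post_t) + \varepsilon_{it} \]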
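As a rough guide to how the four parameters are read off a stylized figure, the sketch below assumes the usual specification: intercept, plus a treated-group term, plus a post-shock term, plus their interaction. The function name `did_parameters` and the four input values are hypothetical and are not taken from the figure.

```python
def did_parameters(control_pre, control_post, treated_pre, treated_post):
    """Map four group means to diff-in-diff parameters.

    Assumed specification:
    outcome = intercept + group_gap * Treated + time_shift * Post + effect * Treated * Post
    """
    intercept = control_pre                   # control level before the shock
    group_gap = treated_pre - control_pre     # pre-shock gap between treated and control
    time_shift = control_post - control_pre   # change over time shared by both groups
    effect = (treated_post - treated_pre) - (control_post - control_pre)  # diff-in-diff
    return intercept, group_gap, time_shift, effect


# Hypothetical levels for illustration only.
params = did_parameters(control_pre=10.0, control_post=12.0,
                        treated_pre=15.0, treated_post=20.0)
print(params)  # (10.0, 5.0, 2.0, 3.0)
```

The last value is the treatment effect: the change in the treated group minus the change in the control group.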
5. [10 points] Suppose that you manage marketing campaigns for a subscription-based
business. Your customer base is described by three segments, as shown in the table below.
You are considering a campaign that consists of sending out a one-time gift worth
$100 (e.g., a champagne bottle). It is believed that this gift would reduce the
churn probability of segment 1 customers by 0.05, of segment 2 customers by 0.03, and of
segment 3 customers by 0.01.
• [5 points] Would it make sense to adopt this campaign (all segments receive the
gift)? What would be the impact on the valuation of the firm’s customer
portfolio?
• [5 points] Judging from your results, should the company consider an
alternative targeting policy? What would be the impact on the valuation of the
firm’s customer portfolio in this case?
Notes: (i) Use a discount factor of 0.97. (ii) Work in Excel, but paste your results into the
document containing your answers.
Segment   % of customer base   Churn probability   Net contribution
1         30%                  0.5                 100
2         30%                  0.25                200
3         40%                  0.1                 400
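The spreadsheet calculation can also be sketched in code. The version below rests on assumptions that the question leaves open: customer value is the net contribution received every period starting today, with constant retention over an infinite horizon, so value = contribution / (1 - discount × retention). It also assumes the churn reductions are 0.05, 0.03, and 0.01 for segments 1, 2, and 3 respectively. A different valuation formula (for example, first payment at the end of the period) would change the numbers. The names `clv` and `net_gain` are hypothetical.

```python
DISCOUNT = 0.97    # discount factor given in the notes
GIFT_COST = 100.0  # one-time gift per customer

# segment: (share of customer base, churn probability, net contribution, assumed churn reduction)
SEGMENTS = {
    1: (0.30, 0.50, 100.0, 0.05),
    2: (0.30, 0.25, 200.0, 0.03),
    3: (0.40, 0.10, 400.0, 0.01),
}


def clv(contribution, churn, discount=DISCOUNT):
    """Assumed model: contribution each period from today on, constant retention, infinite horizon."""
    retention = 1.0 - churn
    return contribution / (1.0 - discount * retention)


def net_gain(churn, contribution, reduction):
    """Change in customer value from the gift, net of its cost."""
    return clv(contribution, churn - reduction) - clv(contribution, churn) - GIFT_COST


gains = {s: net_gain(churn, m, red) for s, (_, churn, m, red) in SEGMENTS.items()}

# Impact per average customer in the base: gift to everyone vs. only to segments where it pays off.
blanket = sum(SEGMENTS[s][0] * gains[s] for s in SEGMENTS)
targeted = sum(SEGMENTS[s][0] * g for s, g in gains.items() if g > 0)

for s, g in gains.items():
    print(f"segment {s}: net gain per customer = {g:.2f}")
print(f"blanket campaign: {blanket:.2f} per average customer")
print(f"targeted campaign: {targeted:.2f} per average customer")
```

Under these assumptions the gift pays for itself only in the segment with the highest contribution and lowest churn, so targeting that segment raises portfolio value by more than sending the gift to everyone.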