Date: 2022-06-10

Department of Economics
The University of Melbourne
ECOM30001/ECOM90001: Basic Econometrics
Semester 1, 2022
Solutions: Tutorial 12
This tutorial reviews some concepts using the econometrics software package R. Specifically, the tutorial reviews:
- methods for estimating and interpreting the Linear Probability Model (LPM) in R
- methods for estimating and interpreting the Probit Model in R
- calculating marginal (partial) effects for the Probit Model in R
This tutorial requires one (1) data file:
- tut12_churn.csv
This file can be obtained from the Canvas subject page.
In addition, the R script file tut12.R provides the program code necessary to complete
the tutorial. This R script file uses the following package(s) which need to be installed
prior to running the R script file:
stargazer : for easily producing output tables in R
ggplot2 : for easily producing graphs in R
car : for easily conducting hypothesis tests in R
lmtest : for easily conducting the Ramsey RESET test in R
sandwich : for easily calculating robust (Huber-White) heteroskedasticity-consistent
standard errors in R
margins : for easily calculating marginal effects in R
These can be installed directly in RStudio from the packages tab or by using the
command install.packages() and inserting the name of the package in the brackets.
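As a minimal sketch of that step (the names `pkgs` and `missing_pkgs` are illustrative and are not taken from tut12.R), the following checks which of the listed packages are not yet installed. The install call itself is left commented out so the snippet has no side effects.

```r
# Packages listed in the tutorial
pkgs <- c("stargazer", "ggplot2", "car", "lmtest", "sandwich", "margins")

# Which of them are not yet in the local library?
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
print(missing_pkgs)

# Uncomment to install whatever is missing:
# install.packages(missing_pkgs)
```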
Question 1
Suppose you have been engaged as a consultant for a large telecommunications provider.
Your current role is to investigate the characteristics of customers who decide to switch
to another competitor.
Consider the following econometric model for customer churn in a clearly defined narrow
geographic segment of the market:
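$$
\text{churn}^*_i = \beta_0 + \beta_1\,\text{partner}_i + \beta_2\,\text{dependents}_i + \beta_3\,\text{tenure}_i + \beta_4\,\text{tenure}_i^2 + \beta_5\,\text{monthlycharges}_i + \beta_6\,\text{contract2}_i + \beta_7\,\text{contract3}_i + \beta_8\,\text{paperless}_i + \varepsilon_i \qquad (1)
$$
where $\varepsilon_i \mid X_i \sim N(0, \sigma^2_\varepsilon)$ and: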
churn∗ = latent variable determining whether a customer switched
providers in the last month
partner = 1 if customer has a partner, 0 otherwise
dependents = 1 if customer has dependents, 0 otherwise
tenure = number of months the customer has stayed with the provider
monthlycharges = monthly account charges, in dollars
contract1 = 1 if customer does not have a contract (‘month-to-month’), 0 otherwise
contract2 = 1 if customer has a 12-month contract, 0 otherwise
contract3 = 1 if customer has a 24-month contract, 0 otherwise
paperless = 1 if customer receives bills electronically, 0 otherwise
Note that contract1 is the omitted category.
a) Provide a brief interpretation of the latent variable churn∗
Solution: Suppose that, each month, customers weigh up the advantages and disadvantages
of remaining with the provider. They would trade off the cost of the service
against the quality and reliability of the service. In principle, customers (implicitly)
calculate a score that determines their decision to remain with the provider. Let
this 'customer dissatisfaction' score be denoted by $\text{churn}^*_i$. Customers who perceive
that the benefits of remaining with the provider exceed the disadvantages will choose to
remain with the provider for another month ($\text{churn}^*_i$ sufficiently low). Similarly,
customers who perceive that the benefits of remaining with the provider do not exceed the
disadvantages will choose to switch providers ($\text{churn}^*_i$ sufficiently high). There will
be some implied threshold level at which customers are indifferent between remaining with
the provider and switching providers. For example, there might be a switching
rule as follows:
$$
\text{churn}_i =
\begin{cases}
1 & \text{if } \text{churn}^*_i \geq H \\
0 & \text{if } \text{churn}^*_i < H
\end{cases}
$$
where H represents the threshold 'cut-off' level for switching providers, based upon
a sufficiently high level of the 'customer dissatisfaction' variable $\text{churn}^*_i$.
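The switching rule can be sketched in R as follows. The latent scores below are made-up illustrative values (in practice the latent variable is never observed), and the threshold H = 0 is an assumption chosen to match the normalisation used in part (b).

```r
# Hypothetical latent 'dissatisfaction' scores for five customers (made-up values)
churn_star <- c(-1.2, -0.1, 0.0, 0.4, 2.3)

# Assumed threshold; the data in part (b) are coded with a cut-off of zero
H <- 0

# Observed indicator: 1 if the latent score reaches the threshold, 0 otherwise
churn <- as.integer(churn_star >= H)
print(data.frame(churn_star, churn))
```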
b) The data set tut12_churn contains 7,043 observations on the population of customers
for this large telecommunications provider in a specific geographic segment.
These data contain the following indicator variable:
$$
\text{churn}_i =
\begin{cases}
1 & \text{if } \text{churn}^*_i \geq 0 \\
0 & \text{if } \text{churn}^*_i < 0
\end{cases}
$$
This suggests the following econometric model:
$$
\text{churn}_i = \beta_0 + \beta_1\,\text{partner}_i + \beta_2\,\text{dependents}_i + \beta_3\,\text{tenure}_i + \beta_4\,\text{tenure}_i^2 + \beta_5\,\text{monthlycharges}_i + \beta_6\,\text{contract2}_i + \beta_7\,\text{contract3}_i + \beta_8\,\text{paperless}_i + \varepsilon_i \qquad (2)
$$
Estimate model (2) by Ordinary Least Squares (OLS).
Solution: The estimation results are provided in Figure 1:
i) What is the interpretation of your estimate for β5? What is your interpretation
of your estimate of β8?
Solution: The econometric model (2) is a linear model. Recall also that for
a binary dependent variable:
$$
E[\text{churn}_i \mid X_i] = \Pr[\text{churn}_i = 1 \mid X_i]
$$
so β5 is the effect on $\Pr[\text{churn}_i = 1 \mid X_i]$ of a one dollar increase in monthly
charges, holding all other variables constant. The magnitude of the marginal effect is
measured in percentage points: a one dollar increase in monthly charges, holding all
other variables constant, changes the probability of churn by 100 × β5 percentage points.
The results in column (1) of Figure 1 give the estimate b5 = 0.0032, so a one dollar
increase in monthly charges, holding all other variables constant, raises the
probability of churn by 0.32 percentage points.
The parameter β8 is the coefficient attached to the paperless indicator variable.
Consequently, it has the 'usual' interpretation for a linear model:
$$
\beta_8 = \Pr[\text{churn}_i = 1 \mid \text{paperless}_i = 1, X_i] - \Pr[\text{churn}_i = 1 \mid \text{paperless}_i = 0, X_i]
$$
so β8 is the difference in $\Pr[\text{churn}_i = 1 \mid X_i]$ between customers who receive
electronic bills and customers who do not, holding all other variables constant.
Again the magnitude is in percentage points: holding all other variables constant,
the probability of churn for customers who receive electronic bills differs by
100 × β8 percentage points from that for customers who do not.
The results in column (1) of Figure 1 give the estimate b8 = 0.0717, so,
holding all other variables constant, customers who receive electronic bills
have a probability of churn that is 7.17 percentage points higher than
customers who receive paper bills.
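The conversion from LPM coefficients to percentage points is simple arithmetic. A small sketch, using the two estimates quoted in the solution (the variable names and the ten dollar example are illustrative additions):

```r
b5 <- 0.0032  # LPM estimate on monthlycharges, as quoted in the solution
b8 <- 0.0717  # LPM estimate on paperless, as quoted in the solution

# Change in Pr(churn = 1), in percentage points
pp_per_dollar  <- 100 * b5        # one extra dollar of monthly charges
pp_ten_dollars <- 100 * b5 * 10   # the LPM is linear, so effects scale with the change
pp_paperless   <- 100 * b8        # paperless versus paper bills

print(c(pp_per_dollar, pp_ten_dollars, pp_paperless))
```

Because the model is linear, the implied effect of a ten dollar increase is ten times the one dollar effect, regardless of the customer's other characteristics. This is the key contrast with the Probit marginal effects later in the tutorial.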
ii) At the 5% level, test for the presence of heteroskedasticity using White’s test.
Since there are numerous indicator variables in model (2), use the ‘no cross
products’ form of White’s test. Is there any evidence of heteroskedasticity?
Solution: The OLS estimation results are presented in column 1 of Figure 1.
The null hypothesis is that the errors are homoskedastic against the alternative
hypothesis that the errors are heteroskedastic. The test statistic is calculated
from the auxiliary regression:
$$
\hat{e}_i^2 = \alpha_0 + \alpha_1\,\text{partner}_i + \alpha_2\,\text{dependents}_i + \alpha_3\,\text{tenure}_i + \alpha_4\,\text{tenure}_i^2 + \alpha_5\,\text{monthlycharges}_i + \alpha_6\,\text{contract2}_i + \alpha_7\,\text{contract3}_i + \alpha_8\,\text{paperless}_i + \alpha_9\,\text{tenure}_i^4 + \alpha_{10}\,\text{monthlycharges}_i^2 + \upsilon_i
$$
[Table contents not recoverable from the source text. Column (1) reports conventional OLS standard errors and column (2) reports robust standard errors; N = 7,043.]
Figure 1: OLS Estimates for Model (2)
studentized Breusch-Pagan test
data:  lpm
BP = 1619.8, df = 10, p-value < 0.00000000000000022
Figure 2: Model 2: White Test for Heteroskedasticity
[Histogram of the fitted values. Horizontal axis: "Predicted Probability churn=1" (from -0.200 to 1.200); vertical axis: "Count".]
Figure 3: Predicted Values for OLS Model 2
In the auxiliary regression, $\hat{e}_i^2$ denotes the squared OLS residuals from model (2).
The test statistic is $N R^2$, where N is the sample size and $R^2$ is taken from the
auxiliary regression. Under the null hypothesis it is asymptotically distributed as
$\chi^2(K)$, where the degrees of freedom K equal the number of parameters (excluding the
constant) estimated in the auxiliary regression. Since there are so many
dummy variables in the model, the White test with no cross-product terms is
conducted, so K = 10.
The R output from the bptest command is presented in Figure 2. This
gives a test statistic of $N R^2 = 1619.8$ with a p-value that is zero to four
decimal places. Since p < 0.05, reject H0: the sample is not consistent with the
hypothesis of homoskedastic errors. This should come as no surprise, since the
linear probability model is heteroskedastic by construction and the exact form of
the heteroskedasticity is known:
$\text{Var}(\varepsilon_i \mid X_i) = \Pr(\text{churn}_i = 1 \mid X_i)\,[1 - \Pr(\text{churn}_i = 1 \mid X_i)]$.
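The decision for the White test can be reproduced from the reported numbers alone. A brief sketch in base R, taking the test statistic, degrees of freedom and sample size from the text (variable names are illustrative):

```r
N       <- 7043     # sample size
bp_stat <- 1619.8   # reported N * R^2 from the auxiliary regression
K       <- 10       # regressors in the auxiliary regression (excluding the constant)

crit       <- qchisq(0.95, df = K)                       # 5% critical value
p_value    <- pchisq(bp_stat, df = K, lower.tail = FALSE)
implied_r2 <- bp_stat / N                                # R^2 implied by the statistic
reject     <- bp_stat > crit

print(c(critical_value = crit, implied_R2 = implied_r2))
print(reject)
```

The implied auxiliary-regression $R^2$ is simply the statistic divided by N, which gives a sense of how much of the variation in the squared residuals is explained by the regressors.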
iii) Calculate the predicted values for churn in model (2). How many observations
have predicted values less than zero? How many observations have predicted
values greater than one? Is this a problem? Why?
Hint: Create two indicator variables. The sample means of these dummy variables
give the proportion of observations with negative predicted values
(lpm_preds_neg) or predicted values greater than one (lpm_preds_ones):
tut12$lpm_preds_ones <- as.numeric(tut12$lpm_preds > 1)
print(summary(tut12$lpm_preds_ones))
tut12$lpm_preds_neg <- as.numeric(tut12$lpm_preds < 0)
print(summary(tut12$lpm_preds_neg))
Solution: The predicted values of churn for model (2) have a sample mean of
0.26537, a maximum value of 0.71371 and a minimum value of -0.19688.
The dummy variable lpm_preds_ones shows that the estimated model produces
a predicted probability greater than one for none of the observations in the
sample. The dummy variable lpm_preds_neg shows that the estimated model
produces a predicted probability less than zero for 946 of the 7,043
observations (13.43%). This is a large share: for more than 10% of observations,
the OLS model produces a negative predicted probability. Why is this a problem?
The predicted values for model (2) represent predicted probabilities, and a
negative predicted probability makes no sense under this interpretation.
Moreover, these negative probabilities imply negative estimated error
variances for these observations, so an FGLS transformation would
not be defined for these observations.
# Predicted Values for OLS model
churn$lpm_preds <- lpm$fitted.values
print(summary(churn$lpm_preds))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-0.19688  0.09831  0.26642  0.26537  0.44438  0.71371
#------------------------------------------
# create dummy variables
# dummy variable if predicted values > 1
churn$lpm_preds_ones <- as.numeric(churn$lpm_preds>1)
print(summary(churn$lpm_preds_ones))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0       0       0       0
lpm_preds_ones_tab <- table(churn$lpm_preds_ones)
lpm_preds_ones_tab
   0
7043
# dummy variable if predicted values < 0
churn$lpm_preds_neg <- as.numeric(churn$lpm_preds<0)
print(summary(churn$lpm_preds_neg))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0000  0.0000  0.0000  0.1343  0.0000  1.0000
lpm_preds_ones_neg_tab <- table(churn$lpm_preds_neg)
lpm_preds_ones_neg_tab
   0    1
6097  946
Figure 4: Model 2: Summary Statistics for Predicted Values
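To see why a negative fitted value breaks FGLS for the linear probability model, recall that the implied error variance is $\hat{p}_i (1 - \hat{p}_i)$. The sketch below evaluates it at the smallest fitted value reported in Figure 4. The variable names are illustrative, and the weight $1/\sqrt{\hat{p}_i(1-\hat{p}_i)}$ is the standard weighting for this variance form rather than something taken from tut12.R.

```r
p_hat_min <- -0.19688   # smallest fitted value reported in Figure 4

# Implied LPM error variance p(1 - p): negative when p < 0
implied_var <- p_hat_min * (1 - p_hat_min)

# The FGLS weight needs the square root of that variance, which is undefined here
fgls_weight <- suppressWarnings(1 / sqrt(implied_var))

# Share of observations with a negative fitted value (946 of 7,043)
share_negative <- 946 / 7043

print(c(implied_var = implied_var, fgls_weight = fgls_weight,
        share_negative = share_negative))
```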
iv) Report the estimates for model (2) that use the Huber-White (robust) standard
errors. Compare and contrast the OLS standard errors with these Robust
standard errors.
Solution: The estimation results using the robust variance estimator are
provided in column 2 of Figure 1. Overall, the White standard errors are
similar to the conventional standard errors (which assume homoskedasticity).
In particular, there is no qualitative difference in the p-values for the
two-sided tests that each coefficient equals zero: the same substantive
conclusions regarding statistical significance are obtained with either the
OLS standard errors or the White standard errors. However, since we know
there is heteroskedasticity in the linear probability model, the conventional
OLS standard errors are invalid for statistical inference. The White standard
errors should be used for conducting hypothesis tests.
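For intuition about what a robust variance estimator computes, here is a minimal base-R sketch of the basic White (HC0) "sandwich" formula, $(X'X)^{-1} X' \text{diag}(\hat{e}_i^2) X (X'X)^{-1}$. It runs on simulated data, not the tutorial's churn data, and every name in it is illustrative. The tutorial itself relies on the sandwich package, and the text does not say which HC variant tut12.R requests, so this should be read as an illustration of the idea rather than a reproduction of column 2.

```r
# Simulated data only: these are NOT the tutorial's churn data
set.seed(123)
n <- 1000
x <- runif(n, min = 0, max = 100)
p_true <- 0.1 + 0.006 * x                 # true probabilities stay inside (0, 1)
y <- rbinom(n, size = 1, prob = p_true)

lpm_sim <- lm(y ~ x)                      # linear probability model by OLS

X <- model.matrix(lpm_sim)
e <- residuals(lpm_sim)
bread <- solve(crossprod(X))              # (X'X)^{-1}
meat  <- crossprod(X * e)                 # X' diag(e^2) X
vcov_hc0 <- bread %*% meat %*% bread      # HC0 sandwich estimator

se_ols <- sqrt(diag(vcov(lpm_sim)))       # conventional standard errors
se_hc0 <- sqrt(diag(vcov_hc0))            # heteroskedasticity-robust standard errors
print(round(cbind(se_ols, se_hc0), 6))
```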
v) Based on your estimation results that use the robust variance estimator, test
whether the estimated model should include a quadratic term in tenure, using
a 5% level of significance.
Solution: A test of whether the model should also include a quadratic term
in tenure is a test of the null hypothesis $H_0: \beta_4 = 0$ against the alternative
hypothesis $H_1: \beta_4 \neq 0$ in model (2). The estimation results are reported in Figure
1. The sample test statistic will follow a t-distribution with (N − K − 1) =
7,034 degrees of freedom, with a critical value of $t_c \approx 1.96$. The decision rule
is to reject H0 if $t > t_c$ or $t < -t_c$. The test statistic is calculated as:
$$
t = \frac{b_4 - 0}{se(b_4)} = \frac{0.000099378}{0.000010143} = 9.7976
$$
The p-value for the t-test is p = 0.0000. At the 5% level, the null hypothesis
should be rejected. The data are consistent with a quadratic relationship
between tenure and the probability of churn.
Call:
glm(formula = churn ~ factor(partner) + factor(dependents) +
    tenure + I(tenure^2) + monthlycharges + factor(contract2) +
    factor(contract3) + factor(paperless), family = binomial(link = "probit"),
    data = churn)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.8851  -0.7426  -0.3107   0.7729   3.7753
Coefficients:
                       Estimate  Std. Error z value             Pr(>|z|)
(Intercept)         -0.82878677  0.05797610 -14.295 < 0.0000000000000002 ***
factor(partner)1     0.04305360  0.04413219   0.976             0.329283
factor(dependents)1 -0.17163919  0.04914763  -3.492             0.000479 ***
tenure              -0.04422843  0.00321537 -13.755 < 0.0000000000000002 ***
I(tenure^2)          0.00037937  0.00004731   8.019  0.00000000000000106 ***
monthlycharges       0.01516475  0.00081032  18.715 < 0.0000000000000002 ***
factor(contract2)1  -0.57125411  0.05713456  -9.998 < 0.0000000000000002 ***
factor(contract3)1  -1.17528843  0.08833733 -13.305 < 0.0000000000000002 ***
factor(paperless)1   0.27845126  0.04175776   6.668  0.00000000002588672 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 8150.1  on 7042  degrees of freedom
Residual deviance: 6032.8  on 7034  degrees of freedom
AIC: 6050.8
Number of Fisher Scoring iterations: 6
Figure 5: Probit Estimates for Model (2)
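The t-test on the quadratic term in part (b)(v) can be checked with a few lines of base R. The sketch uses the magnitude of the estimate and the robust standard error quoted in the solution. Since the test is two-sided, the decision depends only on |t|. Variable names are illustrative.

```r
b4    <- 0.000099378   # magnitude of the estimate of beta4 quoted in the solution
se_b4 <- 0.000010143   # robust standard error quoted in the solution
df    <- 7034          # N - K - 1

t_stat  <- (b4 - 0) / se_b4
t_crit  <- qt(0.975, df)                               # two-sided 5% critical value
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
reject  <- abs(t_stat) > t_crit

print(c(t_stat = t_stat, t_crit = t_crit, p_value = p_value))
print(reject)
```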
c) Estimate model (2) as a Probit model.
Solution: The Probit estimation results are provided in Figure 5:
i) Estimate $\hat{p}_i$, the predicted probability that a customer switches providers, for
each individual in the sample. Consider the following decision rule: if $\hat{p}_i \geq 0.5$,
predict that $\widehat{\text{churn}}_i = 1$, otherwise $\widehat{\text{churn}}_i = 0$. What percentage of outcomes
are successfully predicted? How does your answer change if the decision rule
cutoff is set at $\hat{p}_i \geq 0.6$?
true   predicted   frequency   correctly predicted
0      0           4,678       yes
1      0             970       no
0      1             496       no
1      1             899       yes
Table 1: Predicted Probability Threshold $\hat{p}_i \geq 0.5$

true   predicted   frequency   correctly predicted
0      0           4,950       yes
1      0           1,273       no
0      1             224       no
1      1             596       yes
Table 2: Predicted Probability Threshold $\hat{p}_i \geq 0.6$
0.5 Threshold
The R output for a threshold of $\hat{p}_i \geq 0.5$ is presented in Table 1. There are
5,174 observations with a 'true' value of churn = 0. Of these 5,174 observations,
4,678 are correctly predicted as churn = 0 under our rule (their predicted
probability of switching is below 0.5). Consequently, when the threshold is set
at $\hat{p}_i \geq 0.5$, the estimated model correctly predicts 4,678/5,174 = 90.41% of the
observations with churn = 0 (based upon the values of their explanatory
variables). The model does a decent job of predicting the customers who do
not switch providers.
There are 1,869 observations with a 'true' value of churn = 1. Of these 1,869
observations, 899 are correctly predicted as churn = 1. Consequently, when the
threshold is set at $\hat{p}_i \geq 0.5$, the estimated model correctly predicts
899/1,869 = 48.10% of the observations with churn = 1 (based upon the values of
their explanatory variables). The model does not do as good a job of predicting
the customers who switch providers. Overall, (4,678 + 899)/7,043, or
approximately 79.2%, of outcomes are successfully predicted.
0.6 Threshold
The R output for a threshold of $\hat{p}_i \geq 0.6$ is presented in Table 2. There are
5,174 observations with a 'true' value of churn = 0. Of these 5,174 observations,
4,950 are correctly predicted as churn = 0 under our rule (their predicted
probability of switching is below 0.6). Consequently, when the threshold is set
at $\hat{p}_i \geq 0.6$, the estimated model correctly predicts 4,950/5,174 = 95.67% of the
observations with churn = 0 (based upon the values of their explanatory
variables). The model does a decent job of predicting the customers who do
not switch providers.
There are 1,869 observations with a 'true' value of churn = 1. Of these 1,869
observations, 596 are correctly predicted as churn = 1. Consequently, when the
threshold is set at $\hat{p}_i \geq 0.6$, the estimated model correctly predicts
596/1,869 = 31.89% of the observations with churn = 1 (based upon the values of
their explanatory variables). The model does not do as good a job of predicting
the customers who switch providers. Moreover, as expected, raising the
threshold considerably lowers the ability of the model to predict churn = 1.
Overall, (4,950 + 596)/7,043, or approximately 78.7%, of outcomes are
successfully predicted, slightly fewer than the (4,678 + 899)/7,043, or
approximately 79.2%, under the 0.5 threshold (Table 1).
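The prediction rates for both thresholds follow directly from the counts in Table 1 and Table 2. A short sketch that recomputes them (the data frame layout and the helper `rates()` are illustrative, not part of tut12.R):

```r
# Cell counts from Table 1 (threshold 0.5) and Table 2 (threshold 0.6)
confusion <- data.frame(
  threshold = c(0.5, 0.5, 0.5, 0.5, 0.6, 0.6, 0.6, 0.6),
  true      = c(0, 1, 0, 1, 0, 1, 0, 1),
  predicted = c(0, 0, 1, 1, 0, 0, 1, 1),
  frequency = c(4678, 970, 496, 899, 4950, 1273, 224, 596)
)

# Share predicted correctly: overall, among true churn = 0, among true churn = 1
rates <- function(d) {
  correct <- d$true == d$predicted
  c(overall = sum(d$frequency[correct]) / sum(d$frequency),
    churn0  = sum(d$frequency[correct & d$true == 0]) / sum(d$frequency[d$true == 0]),
    churn1  = sum(d$frequency[correct & d$true == 1]) / sum(d$frequency[d$true == 1]))
}

result <- sapply(split(confusion, confusion$threshold), rates)
print(round(100 * result, 2))
```

Raising the threshold trades correct predictions of churn = 1 for correct predictions of churn = 0, so the overall rate moves much less than either group-specific rate.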
ii) Provide an expression for the marginal effect for tenure with the provider.
Using the margins package, calculate the average marginal effect for tenure
with the provider.
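Solution: Using the latent variable model (1), and in particular the assumption
that $\varepsilon_i \mid X_i \sim N(0, \sigma^2)$, the response probability is given by:
$$
\Pr(\text{churn}_i = 1 \mid X_i) = \Phi\left( \frac{\beta_0}{\sigma} + \frac{\beta_1}{\sigma}\,\text{partner}_i + \frac{\beta_2}{\sigma}\,\text{dependents}_i + \frac{\beta_3}{\sigma}\,\text{tenure}_i + \frac{\beta_4}{\sigma}\,\text{tenure}_i^2 + \frac{\beta_5}{\sigma}\,\text{monthlycharges}_i + \frac{\beta_6}{\sigma}\,\text{contract2}_i + \frac{\beta_7}{\sigma}\,\text{contract3}_i + \frac{\beta_8}{\sigma}\,\text{paperless}_i \right)
$$
where $\Phi(\cdot)$ represents the standard normal cumulative distribution function.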
The variable tenure is a continuous variable. First, note that the observed
value of churn can only take two values (zero or one) so:
$$
E[\text{churn}_i \mid X_i] = \{\Pr(\text{churn}_i = 0 \mid X_i) \times 0\} + \{\Pr(\text{churn}_i = 1 \mid X_i) \times 1\} = \Pr(\text{churn}_i = 1 \mid X_i)
$$
This implies that the marginal effect is given by:
$$
\frac{\partial E[\text{churn}_i \mid X_i]}{\partial\,\text{tenure}_i} = \frac{\partial \Pr(\text{churn}_i = 1 \mid X_i)}{\partial\,\text{tenure}_i} = \left( \frac{\beta_3 + 2\,\beta_4\,\text{tenure}_i}{\sigma} \right) \phi\left( \frac{\beta_0}{\sigma} + \frac{\beta_1}{\sigma}\,\text{partner}_i + \frac{\beta_2}{\sigma}\,\text{dependents}_i + \frac{\beta_3}{\sigma}\,\text{tenure}_i + \frac{\beta_4}{\sigma}\,\text{tenure}_i^2 + \frac{\beta_5}{\sigma}\,\text{monthlycharges}_i + \frac{\beta_6}{\sigma}\,\text{contract2}_i + \frac{\beta_7}{\sigma}\,\text{contract3}_i + \frac{\beta_8}{\sigma}\,\text{paperless}_i \right)
$$
where $\phi(\cdot)$ is the probability density function for the standard normal
distribution. For a standard normal random variable Z, the probability density
function is given by:
$$
\phi(Z) = \frac{1}{\sqrt{2\pi}} \exp\left( \frac{-Z^2}{2} \right)
$$
The average marginal effects, calculated using the margins package, are presented
in Figure 6. The average marginal effect for tenure is -0.0064: on average, a
one-month increase in tenure with the provider reduces the probability of switching
providers by 0.64 percentage points.
margins_summary(probit)
         factor     AME     SE        z      p   lower   upper
     contract21 -0.1330 0.0123 -10.8071 0.0000 -0.1571 -0.1088
     contract31 -0.2369 0.0128 -18.5098 0.0000 -0.2620 -0.2118
    dependents1 -0.0412 0.0116  -3.5378 0.0004 -0.0640 -0.0184
 monthlycharges  0.0037 0.0002  20.2977 0.0000  0.0033  0.0040
     paperless1  0.0673 0.0100   6.7526 0.0000  0.0477  0.0868
       partner1  0.0104 0.0107   0.9774 0.3283 -0.0105  0.0313
         tenure -0.0064 0.0003 -19.9063 0.0000 -0.0070 -0.0057
# ame for specific variable: check interactions
summary(margins(probit,variables="tenure"))
 factor     AME     SE        z      p   lower   upper
 tenure -0.0064 0.0003 -19.9063 0.0000 -0.0070 -0.0057
Figure 6: Average Marginal Effects (Probit) for Model (2)
 factor  tenure     AME     SE        z      p   lower   upper
 tenure  0.0000 -0.0138 0.0010 -14.0965 0.0000 -0.0158 -0.0119
 tenure  5.0000 -0.0125 0.0009 -14.2834 0.0000 -0.0142 -0.0108
 tenure 10.0000 -0.0109 0.0007 -15.1520 0.0000 -0.0123 -0.0095
 tenure 15.0000 -0.0093 0.0006 -16.5727 0.0000 -0.0104 -0.0082
 tenure 20.0000 -0.0077 0.0004 -18.1760 0.0000 -0.0085 -0.0068
 tenure 25.0000 -0.0062 0.0003 -18.8662 0.0000 -0.0069 -0.0056
 tenure 30.0000 -0.0049 0.0003 -17.1425 0.0000 -0.0055 -0.0043
 tenure 35.0000 -0.0038 0.0003 -13.2963 0.0000 -0.0043 -0.0032
 tenure 40.0000 -0.0028 0.0003  -9.1463 0.0000 -0.0034 -0.0022
 tenure 45.0000 -0.0019 0.0003  -5.6731 0.0000 -0.0026 -0.0013
 tenure 50.0000 -0.0012 0.0004  -2.9949 0.0027 -0.0019 -0.0004
 tenure 55.0000 -0.0005 0.0005  -1.0005 0.3171 -0.0013  0.0004
 tenure 60.0000  0.0002 0.0005   0.4349 0.6636 -0.0008  0.0013
 tenure 65.0000  0.0009 0.0007   1.4263 0.1538 -0.0003  0.0022
 tenure 70.0000  0.0017 0.0008   2.0821 0.0373  0.0001  0.0033
Figure 7: Average Marginal Effects (Probit) for Model (2)
iii) Using the margins package and ggplot2, calculate and plot the average marginal
effect for tenure for each value for tenure from 0 to 70 months. Provide a brief
discussion of the results.
Solution: As shown above for part (c)(ii), the marginal effect for tenure with
the provider depends upon the level of tenure. The overall average marginal
effect for the full sample is given by AME = -0.0064. An examination of Figure
7 and Figure 8 reveals that the average marginal effect (AME) for tenure is
generally increasing in the level of tenure: it is negative at low tenure,
shrinks towards zero, and eventually turns positive. For tenure = 0 the AME
= -0.0138, for tenure = 10 the AME = -0.0109, for tenure = 30 the AME =
-0.0049, and for tenure = 70 the AME = 0.0017.
At the 5% level, the AME is significantly negative at every reported value of
tenure below 55 months, is not significantly different from zero at 55, 60 and
65 months, and is significantly positive at 70 months (p = 0.0373).
This can also be seen in Figure 8, which provides the 95% confidence intervals
around the estimated marginal effects.
[Plot: "Average Marginal Effect for Tenure, with 95% Confidence Intervals". Horizontal axis: Tenure with Provider (0 to 70 months); vertical axis: Average Marginal Effect (-0.020 to 0.010).]
Figure 8: Average Marginal Effects (Probit) for Model (2)
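The sign pattern in part (c)(iii) follows from the form of the marginal effect. Because $\phi(\cdot)$ is always positive, the sign of the effect at a given tenure is the sign of $b_3 + 2 b_4\,\text{tenure}$. A rough check using the Probit coefficients from Figure 5 (variable names are illustrative):

```r
b3 <- -0.04422843   # Probit coefficient on tenure (Figure 5)
b4 <-  0.00037937   # Probit coefficient on tenure^2 (Figure 5)

tenure_grid    <- seq(0, 70, by = 5)
slope_of_index <- b3 + 2 * b4 * tenure_grid   # derivative of the Probit index w.r.t. tenure

# Tenure at which the derivative of the index changes sign
turning_point <- -b3 / (2 * b4)

print(data.frame(tenure = tenure_grid, slope_of_index))
print(turning_point)
```

The turning point falls between the 55 and 60 month rows, which is where the reported average marginal effect switches from negative to positive.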
iv) Calculate the marginal effect for tenure with the provider for a customer with
a partner, but no dependents, with a tenure of 30 months, with monthly
charges of $70 per month, on a month-to-month contract, and receiving bills
electronically.
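Solution: The marginal effect for tenure is given by:
$$
\frac{\partial E[\text{churn}_i \mid X_i]}{\partial\,\text{tenure}_i} = \frac{\partial \Pr(\text{churn}_i = 1 \mid X_i)}{\partial\,\text{tenure}_i} = \left( \frac{\beta_3 + 2\,\beta_4\,\text{tenure}_i}{\sigma} \right) \phi\left( \frac{\beta_0}{\sigma} + \frac{\beta_1}{\sigma}\,\text{partner}_i + \frac{\beta_2}{\sigma}\,\text{dependents}_i + \frac{\beta_3}{\sigma}\,\text{tenure}_i + \frac{\beta_4}{\sigma}\,\text{tenure}_i^2 + \frac{\beta_5}{\sigma}\,\text{monthlycharges}_i + \frac{\beta_6}{\sigma}\,\text{contract2}_i + \frac{\beta_7}{\sigma}\,\text{contract3}_i + \frac{\beta_8}{\sigma}\,\text{paperless}_i \right)
$$
where $\phi(\cdot)$ is the probability density function for the standard normal
distribution. For a standard normal random variable Z, the probability density
function is given by:
$$
\phi(Z) = \frac{1}{\sqrt{2\pi}} \exp\left( \frac{-Z^2}{2} \right)
$$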
Now evaluate this at partner = 1, dependents = 0, tenure = 30, monthlycharges
= 70, contract2 = 0, contract3 = 0, and paperless = 1:
$$
\frac{\partial E[\text{churn}_i \mid X_i]}{\partial\,\text{tenure}_i} = \frac{\partial \Pr(\text{churn}_i = 1 \mid X_i)}{\partial\,\text{tenure}_i} = \left( \frac{\beta_3 + 2 \times 30\,\beta_4}{\sigma} \right) \phi\left( \frac{\beta_0}{\sigma} + \frac{\beta_1}{\sigma} + \frac{30\,\beta_3}{\sigma} + \frac{900\,\beta_4}{\sigma} + \frac{70\,\beta_5}{\sigma} + \frac{\beta_8}{\sigma} \right)
$$
# short way
> summary(margins(probit,variables="tenure", at=list(partner =1, dependents = 0,
+ tenure = 30, monthlycharges = 70, contract2 = 0,
+ contract3 = 0, paperless = 1)))
factor partner dependents  tenure monthlycharges contract2 contract3 paperless
1.0000     0.0000 30.0000        70.0000    0.0000    0.0000     1.0000
AME     SE        z      p    lower   upper
tenure  -0.0078 0.0004 -17.8354 0.0000 -0.0087 -0.0069
Figure 9: Model 2: Marginal Effect for tenure Evaluated at Selected Characteristics
Recall the Probit parameter estimates are provided in ratio form (scaled by
the error standard deviation σ):
$$
b_j = \widehat{\left( \frac{\beta_j}{\sigma} \right)}, \qquad j = 0, 1, \ldots, 8
$$
Substituting the Probit estimates from Figure 5:
$$
\frac{\partial E[\text{churn}_i \mid X_i]}{\partial\,\text{tenure}_i} = \frac{\partial \widehat{\Pr}(\text{churn}_i = 1 \mid X_i)}{\partial\,\text{tenure}_i} = (b_3 + 60\,b_4)\,\phi\left( b_0 + b_1 + 30\,b_3 + 900\,b_4 + 70\,b_5 + b_8 \right) = -0.02146607\,\phi(-0.431167)
$$
We can calculate $\phi(-0.431167)$ as:
$$
\phi(-0.431167) = \frac{1}{\sqrt{2\pi}} \exp\left( \frac{-(-0.431167)^2}{2} \right) = 0.3635309
$$
The marginal effect for this representative customer is given by (−0.02146607 ×
0.3635309) = −0.0078. An additional month of tenure with the
provider reduces the probability of switching by approximately 0.78 percentage
points, for an individual with the selected characteristics (an individual with a
partner but no dependents, a tenure of 30 months with the provider, a monthly
account of $70, a month-to-month contract, and currently receiving paperless bills).
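The hand calculation in part (c)(iv) can be reproduced, and cross-checked against a numerical derivative, in base R. This sketch types in the Probit coefficients from Figure 5. The coefficient names and the helper `index_at()` are illustrative and are not objects from tut12.R.

```r
# Probit coefficients as reported in Figure 5
b <- c(intercept = -0.82878677, partner = 0.04305360, dependents = -0.17163919,
       tenure = -0.04422843, tenure_sq = 0.00037937, monthlycharges = 0.01516475,
       contract2 = -0.57125411, contract3 = -1.17528843, paperless = 0.27845126)

# Probit index for the representative customer as a function of tenure:
# partner = 1, dependents = 0, monthlycharges = 70, month-to-month, paperless = 1
index_at <- function(tenure) {
  unname(b["intercept"] + b["partner"] * 1 + b["dependents"] * 0 +
           b["tenure"] * tenure + b["tenure_sq"] * tenure^2 +
           b["monthlycharges"] * 70 + b["contract2"] * 0 + b["contract3"] * 0 +
           b["paperless"] * 1)
}

z <- index_at(30)

# Analytic marginal effect: (b3 + 2 * b4 * tenure) * phi(index)
me_analytic <- unname((b["tenure"] + 2 * b["tenure_sq"] * 30) * dnorm(z))

# Central finite difference of Pr(churn = 1) = Phi(index) as a cross-check
h <- 1e-5
me_numeric <- (pnorm(index_at(30 + h)) - pnorm(index_at(30 - h))) / (2 * h)

print(c(index = z, me_analytic = me_analytic, me_numeric = me_numeric))
```

The two marginal effects should agree closely. Small differences from the figures in the text in the last reported digits are expected, because Figure 5 prints rounded coefficients.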
v) Provide an expression for the marginal effect for paperless bills. Using the
margins package, calculate the average marginal effect for paperless bills.
Aside: When Xj is an indicator variable (say Di) it is not appropriate to use
a derivative to measure the marginal effect derived above, since the derivative
measures the effect of an infinitesimal change in the variable Xj on the
conditional mean. For an indicator variable the only changes possible are from 0
to 1 or from 1 to 0, which represent very large changes in Xj. In this case it
is possible to approximate the derivative as a discrete change:
$$
\frac{\partial E[y_i \mid X_i]}{\partial D_i} = \frac{\partial \Pr(y_i = 1 \mid X_i)}{\partial D_i} \approx \frac{\Delta \Pr(y_i = 1 \mid X_i)}{\Delta D_i} = \Delta \Pr(y_i = 1 \mid X_i) = \Pr(y_i = 1 \mid D_i = 1, X_i) - \Pr(y_i = 1 \mid D_i = 0, X_i)
$$
where the step from $\Delta \Pr / \Delta D_i$ to $\Delta \Pr$ follows because Di is an
indicator variable, so the change ($\Delta D_i$) is always one unit.
Solution: We have:
$$
\begin{aligned}
\frac{\partial \Pr(\text{churn}_i = 1 \mid X_i)}{\partial\,\text{paperless}_i}
&\approx \Delta \Pr(\text{churn}_i = 1 \mid X_i) \\
&= \{\Pr(\text{churn}_i = 1 \mid X_i)\}_{\text{paperless}_i = 1} - \{\Pr(\text{churn}_i = 1 \mid X_i)\}_{\text{paperless}_i = 0} \\
&= \Phi\left( \frac{\beta_0 + \beta_1\,\text{partner}_i + \beta_2\,\text{dependents}_i + \beta_3\,\text{tenure}_i + \beta_4\,\text{tenure}_i^2 + \beta_5\,\text{monthlycharges}_i + \beta_6\,\text{contract2}_i + \beta_7\,\text{contract3}_i + \beta_8}{\sigma} \right) \\
&\quad - \Phi\left( \frac{\beta_0 + \beta_1\,\text{partner}_i + \beta_2\,\text{dependents}_i + \beta_3\,\text{tenure}_i + \beta_4\,\text{tenure}_i^2 + \beta_5\,\text{monthlycharges}_i + \beta_6\,\text{contract2}_i + \beta_7\,\text{contract3}_i}{\sigma} \right)
\end{aligned}
$$
where the last equality substitutes $\text{paperless}_i = 1$ in the first term and
$\text{paperless}_i = 0$ in the second term of the index
$(\beta_0 + \cdots + \beta_7\,\text{contract3}_i + \beta_8\,\text{paperless}_i)/\sigma$.
The average marginal effect, calculated using the margins package, is presented
in Figure 6. The average marginal effect for paperless is 0.0673: relative
to customers who only receive paper bills, customers who receive paperless
bills have a probability of switching providers (churn) that is higher by 6.73
percentage points. This is close to the estimated marginal effect for
the linear probability model of 0.0717.
vi) Calculate the marginal effect for paperless bills for a customer with a partner,
but no dependents, with a tenure of 30 months, with monthly charges of $70
per month, and on a month-to-month contract.
# short way
> summary(margins(probit,variables="paperless",
+ at=list(partner =1, dependents = 0,
+ tenure = 30, monthlycharges = 70, contract2 = 0,
+ contract3 = 0)))
     factor partner dependents  tenure monthlycharges contract2 contract3
1.0000     0.0000 30.0000        70.0000    0.0000    0.0000
AME     SE      z      p  lower  upper
paperles 0.0942 0.0138 6.8396 0.0000 0.0672 0.1212
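Now evaluate the marginal effect provided in part (c)(v) at partner = 1,
dependents = 0, tenure = 30, monthlycharges = 70, contract2 = 0 and contract3 = 0,
comparing paperless = 1 with paperless = 0:
$$
\frac{\partial \Pr(\text{churn}_i = 1 \mid X_i)}{\partial\,\text{paperless}_i} \approx \Phi\left( \frac{\beta_0}{\sigma} + \frac{\beta_1}{\sigma} + \frac{30\,\beta_3}{\sigma} + \frac{900\,\beta_4}{\sigma} + \frac{70\,\beta_5}{\sigma} + \frac{\beta_8}{\sigma} \right) - \Phi\left( \frac{\beta_0}{\sigma} + \frac{\beta_1}{\sigma} + \frac{30\,\beta_3}{\sigma} + \frac{900\,\beta_4}{\sigma} + \frac{70\,\beta_5}{\sigma} \right)
$$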
Recall the Probit parameter estimates are provided in ratio form (scaled by
the error standard deviation σ):
$$
b_j = \widehat{\left( \frac{\beta_j}{\sigma} \right)}, \qquad j = 0, 1, \ldots, 8
$$
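Substituting the Probit estimates, the estimated marginal effect is
$$
\begin{aligned}
\frac{\partial \widehat{\Pr}(\text{churn}_i = 1 \mid X_i)}{\partial\,\text{paperless}_i}
&= \Phi\left( b_0 + b_1 + 30\,b_3 + 900\,b_4 + 70\,b_5 + b_8 \right) - \Phi\left( b_0 + b_1 + 30\,b_3 + 900\,b_4 + 70\,b_5 \right) \\
&= \Phi(-0.431167) - \Phi(-0.7096182) \\
&= 0.3331735 - 0.2389705 \\
&= 0.09420303
\end{aligned}
$$
For this customer, receiving paperless bills is associated with a probability of
switching providers that is higher by approximately 9.42 percentage points, which
matches the value of 0.0942 reported by the margins package.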
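The discrete-change calculation for paperless billing can be verified with `pnorm()`, using the two index values quoted in the solution (variable names are illustrative). The gap between the two index values should equal the Probit coefficient on paperless from Figure 5, which gives a quick consistency check.

```r
z_paperless <- -0.431167    # Probit index with paperless = 1, as quoted in the solution
z_paper     <- -0.7096182   # Probit index with paperless = 0, as quoted in the solution
b8          <-  0.27845126  # Probit coefficient on paperless (Figure 5)

# Discrete change in the predicted probability of churn
effect <- pnorm(z_paperless) - pnorm(z_paper)

# Consistency check: the two indices differ only by b8
index_gap <- z_paperless - z_paper

print(c(effect = effect, index_gap = index_gap, b8 = b8))
```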