Date: 2021-06-20
IMBA 214-50
Project 2
Mountain Bike Sales
Instructor:
Sudhir Thakur
By:
Vasundhara Sharma







Introduction:
This analysis studies the relationship between the sales of mountain bikes and several factors that may
influence them. Mountain bikes are purpose-built bikes generally used on mountain trails and unpaved
surfaces. They are also commonly used on paved roads in cities and towns because they are easy to
customize. In many major cities, bicycles are increasingly used in place of automobiles because of
parking problems, traffic and congestion. These bikes are also a popular form of exercise across all age
groups and are commonly found in tourist areas. Many people are also turning to bike riding to reduce
the carbon footprint generated by their automobiles.
Mountain bike sales were selected for analysis because, as per my understanding, bicycle usage is on an
upward trend and population density is one of the major factors affecting bike sales. As part of this
analysis, we will run multiple regression models to determine whether population density, along with
other factors, has an impact on bike sales.

Literature Review:
As part of the study, we analyze the impact on Sales of various factors such as Floor Space,
Advertising related to the bike, Population Density in the area, Competing Stores in and around the
area, Price, and Brand of the bike (whether or not the brand influences its sales). According to the
inc.com article “How the Humble Bicycle Spurred a Modern Lifestyle Industry”, the demand for bikes has
increased over the years due to growth in health-related sporting activities, a rise in disposable
income, accessibility through bike-sharing apps, etc. Urbanization has also played an important part, as
people living in cities find bikes an affordable and convenient alternative when travelling on congested
city roads. City planners and civic corporations also encourage the use of bikes, as it helps to manage
traffic, resolve parking issues and keep pollution levels in cities at bay. The article also states that
the brand of the bike may not be a deciding factor for consumers when purchasing a bike, as they look for
more variety and customization, which may not be available in all brands. The same trend is also
described in the article “How the humble bicycle is making a comeback in US cities”, which reports an
upward trend in cycle usage, especially in city areas, due to its convenience for commuting.
Problem Statement:
The main purpose of running these models is to check the impact on mountain bike sales of various factors
such as Price, the Floor Space that the bikes take, Brand, Competing Stores, Competing Ads, etc. The
analysis also checks specifically whether sales of these bikes increase in areas with high population
density, such as tourist or urban areas. The analysis is based on 30 observations collected from Canvas
(as provided by the instructor).
The null hypothesis in this case is that no relation exists between Sales and the factors stated above;
the alternate hypothesis is that a significant relation exists between Sales and the stated factors.
Methodology:
A data set consisting of 30 observations has been taken from Canvas. This data set will be analyzed in
JMP software by running regression models. Two models will be generated in JMP: the first with 1
dependent variable (Sales) and 5 independent variables (Floor Space, Competing Ads, Population Density,
Competing Stores and Price), and the second with an additional dummy variable (Brand) alongside the same
dependent variable and 5 independent variables.
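The models in this report were fitted in JMP. Purely as an illustration of what fitting a multiple regression with a dummy variable involves, the following is a minimal Python sketch of ordinary least squares via the normal equations. The function names (solve, fit_ols) and the six data rows are hypothetical and made up for this example; they are not the Canvas data set and not JMP's implementation.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] that minimise the sum of squared errors."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up, noise-free illustration data (NOT the Canvas data set):
# each row is (floor_space, brand_dummy) and sales follow an exact linear rule.
rows = [(10, 0), (20, 1), (30, 0), (40, 1), (50, 0), (60, 1)]
sales = [100 + 5 * fs + 40 * brand for fs, brand in rows]

coefficients = fit_ols(rows, sales)
print([round(c, 6) for c in coefficients])
```

Because the made-up sales values follow the rule 100 + 5*(floor space) + 40*(brand) exactly, the fit should recover an intercept near 100, a slope near 5 and a dummy coefficient near 40. The dummy simply enters the model as one more column taking the values 0 and 1.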
Below are the 2 regression models generated in JMP:


Model 1: Simple Regression Model Without Dummies
Number of Observations model is based on: 30
Dependent Variable:
• Sales
Independent Variables:
• Floor Space
• Competing Ads
• Population Density
• Competing Stores
• Price
Equation of the model:
Sales = 1103.2062 + 11.105865*(Floor Space) - 6.409642*(Competing Ads) + 0.0600383*(Pop Density per
sq km) - 0.69955*(Competing Stores) - 0.145987*(Price)
R squared value: 0.808526
Root mean Square Error 112.1223
Intercept Value: 1103.2062
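To make the equation concrete, the following minimal Python sketch evaluates it for one store and shows what a one-unit change in Floor Space does to predicted Sales. The coefficients are the ones reported above; the function name predict_sales, the dictionary keys and the store profile are hypothetical, and the units of each variable are assumed to be those of the Canvas data set, which are not stated here.

```python
# Coefficients of Model 1 as reported in the equation above.
MODEL_1 = {
    "intercept": 1103.2062,
    "floor_space": 11.105865,
    "competing_ads": -6.409642,
    "pop_density": 0.0600383,
    "competing_stores": -0.69955,
    "price": -0.145987,
}


def predict_sales(coefficients, **inputs):
    """Evaluate the fitted equation; variables that are omitted count as zero."""
    return coefficients["intercept"] + sum(
        coefficients[name] * value for name, value in inputs.items()
    )


# Hypothetical store profile, chosen only for illustration.
store = dict(floor_space=50, competing_ads=10, pop_density=2000,
             competing_stores=3, price=900)

base = predict_sales(MODEL_1, **store)
more_space = predict_sales(MODEL_1, **{**store, "floor_space": 51})

print(round(base, 2))
print(round(more_space - base, 6))  # change in predicted Sales for +1 Floor Space
```

Holding the other inputs fixed, the difference between the two predictions equals the Floor Space coefficient, which is how each slope in the model is read.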

Below are screenshots from JMP:

Summary table for simple model
| Attribute of Model | Value | Interpretation of value | P-value | Interpretation of P-value |
|---|---|---|---|---|
| Intercept | 1103.2062 | If all independent variables are zero, the predicted sales of the bike equal 1103.2062. | 0.0073 | Since P < 0.05, we reject the null hypothesis; the intercept is significantly different from zero. |
| R Squared | 80.8526% | 80.85% of the variability in Sales is explained by the independent variables in the model. | | |
| B1 (Slope of Floor Space) | 11.105865 | If Floor Space increases by 1 unit, with the other variables held constant, Sales increase by 11.105865. | <.0001 | Since P < 0.05, we reject the null hypothesis and accept the alternate: there is a significant relation between Sales of the bike and Floor Space. |
| B2 (Slope of Competing Ads) | -6.409542 | If Competing Ads increase by 1 unit, with the other variables held constant, Sales decrease by 6.409542. | 0.0912 | Since P > 0.05, we fail to reject the null hypothesis: there is no significant relation between Competing Ads and Sales of the bike. |
| B3 (Slope of Population Density) | 0.0600383 | If Population Density increases by 1 unit, with the other variables held constant, Sales increase by 0.0600383. | 0.0205 | Since P < 0.05, we reject the null hypothesis and accept the alternate: there is a significant relation between Population Density and bike sales. |
| B4 (Slope of Competing Stores) | -0.69955 | If the number of Competing Stores increases by 1, with the other variables held constant, Sales decrease by 0.69955. | 0.9510 | Since P > 0.05, we fail to reject the null hypothesis: there is no significant relation between Sales and Competing Stores. |
| B5 (Slope of Price) | -0.145987 | If Price increases by 1 unit, with the other variables held constant, Sales decrease by 0.145987. | 0.0968 | Since P = 0.0968 > 0.05, we fail to reject the null hypothesis: there is no significant relation between Price and Sales. |
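The decision rule applied in the summary table can be written out in a few lines. The sketch below, in Python, compares each reported P-value for Model 1 with the 0.05 significance level. The P-values are taken from the table; the names ALPHA, P_VALUES and decision are hypothetical, and the entry reported as "<.0001" is entered as its upper bound 0.0001, which is an assumption made only so that it can be compared numerically.

```python
ALPHA = 0.05  # significance level used throughout this report

# P-values reported by JMP for the slopes of Model 1.
# "<.0001" is entered as its upper bound (an assumption for illustration).
P_VALUES = {
    "Floor Space": 0.0001,
    "Competing Ads": 0.0912,
    "Population Density": 0.0205,
    "Competing Stores": 0.9510,
    "Price": 0.0968,
}


def decision(p_value, alpha=ALPHA):
    """Reject the null hypothesis only when the P-value is below alpha."""
    return "reject the null" if p_value < alpha else "fail to reject the null"


for name, p in P_VALUES.items():
    print(f"{name}: p = {p} -> {decision(p)}")

# Variables with a statistically significant relation to Sales.
significant = [name for name, p in P_VALUES.items() if p < ALPHA]
print(significant)
```

Under this rule only Floor Space and Population Density come out as significant in Model 1, and the intercept (P = 0.0073) is also significant.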

Model 2: Modified Model with Dummy Variable
Number of Observations model is based on: 30
Dependent Variable:
• Sales
Independent Variables:
• Floor Space
• Competing Ads
• Population Density
• Competing Stores
• Price
• Brand Dummy (0 = no brand, 1 = expensive brand)

Equation of the model:
Sales = 1183.2146 + 10.930149*(Floor Space) - 7.078638*(Competing Ads) + 0.0544797*(Pop Density per
sq km) - 4.233147*(Competing Stores) - 0.139374*(Price) + 45.885272*(Brand Dummy)
R squared value: 0.813147
Root mean Square Error 113.1433
Intercept Value: 1183.2146
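Because the Brand Dummy only takes the values 0 and 1, its coefficient acts as a shift in the intercept rather than as a slope. The short Python sketch below illustrates this with the two reported numbers; the names INTERCEPT, BRAND_COEFFICIENT and effective_intercept are hypothetical.

```python
INTERCEPT = 1183.2146          # reported intercept of Model 2
BRAND_COEFFICIENT = 45.885272  # reported coefficient of the Brand Dummy


def effective_intercept(brand):
    """Intercept of the Model 2 equation for a given value of the Brand Dummy."""
    if brand not in (0, 1):
        raise ValueError("the Brand Dummy must be 0 (no brand) or 1 (expensive brand)")
    return INTERCEPT + BRAND_COEFFICIENT * brand


print(round(effective_intercept(0), 6))  # no brand
print(round(effective_intercept(1), 6))  # expensive brand
```

The two cases share all five slopes and differ only in the constant term, about 1183.21 for no brand and about 1229.10 for an expensive brand, so the fitted surfaces for the two groups are parallel.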

Summary table for modified model
| Attribute of Model | Value | Interpretation of value | P-value | Interpretation of P-value |
|---|---|---|---|---|
| Intercept | 1183.2146 | If all independent variables are zero, the predicted sales of the bike equal 1183.2146. | 0.0063 | Since P < 0.05, we reject the null hypothesis; the intercept is significantly different from zero. |
| R Squared | 81.3147% | 81.31% of the variability in Sales is explained by the independent variables in the model. | | |
| B1 (Slope of Floor Space) | 10.930149 | If Floor Space increases by 1 unit, with the other variables held constant, Sales increase by 10.930149. | <.0001 | Since P < 0.05, we reject the null hypothesis and accept the alternate: there is a significant relation between Sales of the bike and Floor Space. |
| B2 (Slope of Competing Ads) | -7.078638 | If Competing Ads increase by 1 unit, with the other variables held constant, Sales decrease by 7.078638. | 0.0740 | Since P > 0.05, we fail to reject the null hypothesis: there is no significant relation between Competing Ads and Sales of the bike. |
| B3 (Slope of Population Density) | 0.0544797 | If Population Density increases by 1 unit, with the other variables held constant, Sales increase by 0.0544797. | 0.0436 | Since P < 0.05, we reject the null hypothesis and accept the alternate: there is a significant relation between Population Density and bike sales. |
| B4 (Slope of Competing Stores) | -4.233147 | If the number of Competing Stores increases by 1, with the other variables held constant, Sales decrease by 4.233147. | 0.7338 | Since P > 0.05, we fail to reject the null hypothesis: there is no significant relation between Sales and Competing Stores. |
| B5 (Slope of Price) | -0.139374 | If Price increases by 1 unit, with the other variables held constant, Sales decrease by 0.139374. | 0.1175 | Since P = 0.1175 > 0.05, we fail to reject the null hypothesis: there is no significant relation between Price and Sales. |
| B6 (Brand Dummy) | 45.885272 | With the other variables held constant, Sales are 45.885272 higher for a bike of an expensive brand than for a bike with no brand. | 0.4584 | Since P = 0.4584 > 0.05, we fail to reject the null hypothesis: there is no significant relation between Brand and Sales of the bike. |


Conclusion:
From the two models above, it can be seen that not all of the factors considered have a significant
impact on the sales of mountain bikes. In both models, with and without the dummy, Floor Space and
Population Density have a P value < 0.05. For these factors we reject the null hypothesis and accept the
alternate. Hence, we conclude that population density and floor space have a significant impact on the
sales of the bike.
Competing Ads, Competing Stores and Price have P values > 0.05, which signifies that we fail to reject
the null hypothesis and conclude that none of these factors is significant for the sales of the bike.
The dummy variable added in the modified model also has a P value > 0.05 (P = 0.4584), signifying that
the brand does not have a significant impact on the sales of the bike (we fail to reject the null
hypothesis). Comparing the two models, the R squared value is slightly higher and closer to 1 when the
brand dummy is added (0.813147 versus 0.808526). However, R squared cannot decrease when a variable is
added, and the root mean square error rises slightly (from 112.1223 to 113.1433), so the dummy adds
little explanatory power. The intercept also increases (from 1103.2062 to 1183.2146) when we add the
brand dummy variable to the model.
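One way to compare the two models on an equal footing is the adjusted R squared, which penalizes each additional independent variable. The Python sketch below applies the standard formula to the R squared values reported above, with n = 30 observations and k = 5 or 6 independent variables. These adjusted values are computed here for illustration and are not quoted from the JMP output; the function name adjusted_r_squared is hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R squared for n observations and k independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


N_OBSERVATIONS = 30

# R squared values reported for the two models in this report.
simple_model = adjusted_r_squared(0.808526, N_OBSERVATIONS, k=5)
modified_model = adjusted_r_squared(0.813147, N_OBSERVATIONS, k=6)

print(round(simple_model, 4))    # Model 1, without the dummy
print(round(modified_model, 4))  # Model 2, with the Brand Dummy
```

Under these assumptions the adjusted value falls slightly when the Brand Dummy is added (about 0.769 for Model 1 against about 0.764 for Model 2), which is consistent with the dummy being insignificant.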

Further Analysis
1) Multicollinearity problem
The multicollinearity test below has been run on the modified model.
Multicollinearity occurs when independent variables are correlated with each other. This is a cause for
concern because it inflates the variance of the coefficient estimates and makes it harder to separate the
individual effect of each variable when we fit the model. Ideally, the independent variables in the model
should be independent of each other.
The VIF, or variance inflation factor, is used to assess multicollinearity. The VIF measures how much
the variance of an estimated regression coefficient increases when the independent variables are
correlated. A high VIF signifies that the associated independent variable is highly collinear with the
others. Values of VIF > 10 are usually taken to indicate multicollinearity. Since none of the VIF values
of the independent variables exceeds 10, there is no serious multicollinearity among the independent
variables.
Below is the multicollinearity test on the simple regression model (without the dummy variable), which
also indicates no serious multicollinearity among the independent variables.
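The VIF values in this report come from JMP. As a rough illustration of how such values can be obtained, the Python sketch below uses the fact that the VIFs are the diagonal entries of the inverse of the correlation matrix of the independent variables. The three columns of data are made up for the example, and the function names (correlation_matrix, invert, vif) are hypothetical.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long data columns."""
    centered = []
    for col in columns:
        mean = sum(col) / len(col)
        centered.append([v - mean for v in col])
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor of each column, equal to 1 / (1 - R_j^2)."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Made-up predictor columns (NOT the Canvas data set).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]   # correlated with x1 (r = 0.8)
x3 = [2, -1, -2, -1, 2]  # uncorrelated with x1

values = vif([x1, x2, x3])
print([round(v, 3) for v in values])
print(all(v <= 10 for v in values))  # rule of thumb used in this report
```

With only two independent variables the VIF reduces to 1 / (1 - r^2), where r is their correlation, so a correlation of 0.8 gives a VIF of about 2.78, well below the threshold of 10.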



2) Autocorrelation
Autocorrelation is also referred to as lagged or serial correlation, as it measures the relationship
between a variable's current values and its past values.
The Durbin-Watson test is used to determine whether autocorrelation exists in the residuals of the fitted
model. The null hypothesis in this case is that there is no autocorrelation and that the residuals are
independent. The alternative hypothesis is that the residuals are correlated.
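The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The Python sketch below computes it for a few made-up residual sequences to show how the value moves around 2; the residuals are hypothetical and are not those of the models in this report.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Made-up residual sequences, for illustration only.
alternating = [1, -1, 1, -1]  # sign flips every step: negative autocorrelation
persistent = [1, 1, -1, -1]   # neighbours tend to agree: positive autocorrelation

print(durbin_watson(alternating))  # above 2
print(durbin_watson(persistent))   # below 2
```

Values well below 2 point to positive autocorrelation, values well above 2 to negative autocorrelation, and values near 2 to none.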
On Simple Model (without Dummy):
A Durbin-Watson value close to 2 signifies no autocorrelation. The Durbin-Watson value of 1.833 is
slightly below 2, which signifies slightly positive autocorrelation (the autocorrelation value of 0.0507
also signifies positive correlation). However, the P value in this case is 0.2648, which is not
significant (as P > 0.05). This implies that we fail to reject the null hypothesis that there is no
autocorrelation and that the residuals are independent. In particular, there is no evidence of
first-order positive autocorrelation.

On Modified Model (With Dummy)
The Durbin-Watson value of 1.833 is slightly below 2, which signifies slightly positive autocorrelation.
However, the P value in this case is 0.2648, which is not significant (as P > 0.05). This implies that we
fail to reject the null hypothesis that there is no autocorrelation and that the residuals are
independent. In particular, there is no evidence of first-order positive autocorrelation.



References:
Kline, K. (2017, February). How the Humble Bicycle Spurred a Modern Lifestyle Industry. Retrieved from
https://www.inc.com/kenny-kline/5-trends-that-paved-the-way-for-a-bicycle-industry-renaissance.html
How the humble bicycle is making a comeback in US cities. (2016, July). Retrieved from
https://www.bbc.com/news/world-us-canada-36778953
Mountain Bike Market Size Worth $3,585 Million By 2026. Retrieved from
https://www.polarismarketresearch.com/press-releases/global-mountain-bike-market

bike_sales_data_project2.xlsx