ECOM 2001-ECOM2001 Programming Assignment Sample
Date: 2021-10-05
ECOM 2001 Term Project: (KO RAD CWT)
LIUHAORAN 20750822
Due Tuesday, 5th October 2021 16:00 AWST

    # packages
    library(tidyquant)    # for importing stock data
    library(tidyverse)    # for working with data
    # library(broom)      # for tidying output from various statistical procedures
    library(knitr)        # for tables
    # library(kableExtra) # for improving the appearance of tables
    # Add any additional packages that you use to this code chunk

1 Import the Data (2 points)

    ## 1) Import your assigned stocks
    ## Use the package tidyquant. You may need to install this package first.
    ## Replace Stock1, Stock2, Stock3 with your assigned stock names (in quotation marks), uncomment the code, and Run
    stocks <- c("KO", "RAD", "CWT") %>%
      tq_get(get = "stock.prices", from = "2000-01-01") %>%
      select(symbol, date, adjusted)
    ## This is your data set for this project (rename yourDataName to something more descriptive)

    ## output the first 6 rows of your data frame:
    head(stocks, n = 6) %>%
      mutate_if(is.numeric, round, digits = 3) %>%
      kable(caption = "Three Stocks")

Table 1: Three Stocks

symbol  date        adjusted
KO      2000-01-03  15.572
KO      2000-01-04  15.590
KO      2000-01-05  15.728
KO      2000-01-06  15.745
KO      2000-01-07  16.781
KO      2000-01-10  16.246

2 The Analysis

2.1 Plot prices over time (3 points)

Plot the prices of each asset over time separately. Succinctly describe in words the evolution of each asset over time. (limit: 100 words for each time series).

    ## Don't forget to add fig.cap= "Your caption" to the code chunk header.
    ## facet_wrap() may be useful
    ggplot(stocks, aes(date, adjusted)) +
      geom_line() +
      facet_wrap(~symbol)

[Figure 1: prices over time. Panels: CWT, KO, RAD; x-axis: date (2000 to 2020); y-axis: adjusted (0 to 200)]

The CWT price trends upward over the whole period, and the increase accelerates after 2015. The KO price shows no overall trend before 2010 but trends upward afterwards. Both CWT and KO dropped in 2020.
The RAD price shows no sustained trend over time and appears random: it rises in some periods and falls in others, so it is hard to find a clear pattern.

2.2 Calculate returns and plot returns over time (4 points)

Calculate the daily percentage returns of each asset using the following formula:

$$r_t = 100 \ln\left(\frac{P_t}{P_{t-1}}\right)$$

where $P_t$ is the asset price at time $t$. Then plot the returns for each asset over time.

    ## Hint: you need to add a column to your data frame (yourDataName).
    ## You can use the mutate() function
    ## Don't forget to group_by()
    ## The lag() function can be used to find the price in the previous date
    ## Double check your results!!
    stocks <- stocks %>%
      group_by(symbol) %>%
      mutate(return = 100 * log(adjusted / lag(adjusted)))

    ggplot(stocks, aes(date, return)) +
      geom_line() +
      facet_wrap(~symbol)

[Figure 2: returns over time. Panels: CWT, KO, RAD; x-axis: date (2000 to 2020); y-axis: return (-40 to 20)]

2.3 Histogram of returns (4 points)

Create a histogram for each of the returns series (explain how you determined the number of bins to use).

    ggplot(stocks, aes(return)) +
      geom_histogram() +
      facet_wrap(~symbol)

[Figure 3: return histogram. Panels: CWT, KO, RAD; x-axis: return (-40 to 40); y-axis: count (0 to 4000)]

I use the default bin setting of geom_histogram(), that is, 30 bins, which here span returns from about -40 to 40.

2.4 Summary table of returns (4 points)

Report the descriptive statistics in a single table which includes the mean, median, variance, standard deviation, skewness and kurtosis for each series. What conclusions can you draw from these descriptive statistics?

    ## Your summary table here. Be sure to format the table appropriately.
    # skewness() and kurtosis() are not base R; they are assumed here to come
    # from a package attached together with tidyquant (e.g. PerformanceAnalytics)
    stocks %>%
      drop_na() %>%
      group_by(symbol) %>%
      summarise(mean = mean(return), median = median(return),
                variance = var(return), `standard deviation` = sd(return),
                skewness = skewness(return), kurtosis = kurtosis(return)) %>%
      mutate_if(is.numeric, round, digits = 3) %>%
      kable(caption = "Summary table of returns")

Table 2: Summary table of returns

symbol  mean    median  variance  standard deviation  skewness  kurtosis
CWT     0.038   0.069   3.337     1.827               0.315     11.429
KO      0.022   0.041   1.768     1.330               -0.168    9.284
RAD     -0.052  0.000   19.024    4.362               -0.029    12.033

Conclusion: the mean returns of CWT (0.038) and KO (0.022) are larger than that of RAD (-0.052), which is negative. Moreover, the variance of the RAD returns (19.024) is much greater than that of CWT (3.337) and KO (1.768).

2.5 Are average returns significantly different from zero? (5 points)

Under the assumption that the returns of each asset are drawn from an independently and identically distributed normal distribution, are the expected returns of each asset statistically different from zero at the 1% level of significance? Provide details for all 5 steps to conduct a hypothesis test, including the equation for the test statistic. Calculate and report all the relevant values for your conclusion and be sure to provide an interpretation of the results.

Steps
1. The null and alternative hypotheses: $H_0: \mu = 0$ and $H_1: \mu \neq 0$.
2. The level of significance and number of observations. Let's use $\alpha = 0.01$. Each return series has $n = 5472$ observations (the degrees of freedom reported in Table 3 are $n - 1 = 5471$).
3. The test statistic. We do not know the true population standard deviation, so we use a t-test statistic:
   $$t = \frac{m}{s/\sqrt{n}}$$
   where $m$ is the sample mean, $s$ is the sample standard deviation and $n$ is the sample size.
4. The critical values for our test statistic. For a two-tailed test these are $\pm t_{\alpha/2,\,n-1}$, which is approximately $\pm 2.58$ for $\alpha = 0.01$ and 5471 degrees of freedom.
5. The decision. If the test statistic falls into either rejection region (so t is less than the lower cutoff value or greater than the upper cutoff value), reject the null. Also reject the null if the p-value of the test is less than the significance level we chose ($\alpha$). This is the more direct way to make a decision, and both methods lead to the same result.
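The five steps above can be sketched by hand in base R. This is only an illustration under stated assumptions: the vector `returns` is simulated and merely stands in for one stock's return series, and the names `t_stat`, `t_crit` and `p_value` are hypothetical. The last lines cross-check the hand calculation against `t.test()`.

```r
# Hypothetical stand-in for one stock's daily returns (simulated, not real data)
set.seed(2001)
returns <- rnorm(5472, mean = 0.03, sd = 1.8)

alpha <- 0.01            # step 2: level of significance
n <- length(returns)     # step 2: number of observations
m <- mean(returns)
s <- sd(returns)

# Step 3: test statistic t = m / (s / sqrt(n))
t_stat <- m / (s / sqrt(n))

# Step 4: two-tailed critical values are -t_crit and +t_crit
t_crit <- qt(1 - alpha / 2, df = n - 1)

# Step 5: decision by rejection region and by p-value (the two always agree)
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
reject_by_region <- abs(t_stat) > t_crit
reject_by_pvalue <- p_value < alpha

# Cross-check against the built-in one-sample t test
builtin <- t.test(returns, mu = 0)
print(c(t = t_stat, critical = t_crit, p = p_value))
print(c(region = reject_by_region, pvalue = reject_by_pvalue))
```

Replacing `returns` with `stocks$return[stocks$symbol == "CWT"]` (after dropping the missing first return) should reproduce the corresponding row of the t test table.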
    ## Hint: you can extract specific values from t.test objects using the $
    ## Eg. using t.test(x,y)$statistic will extract the value of the test statistic.
    ## Consult the help file for the other values generated by the t.test() function.
    ## The relevant values are: the t-test method, the estimated mean, the test statistic, whether the test is one or two tailed, the degrees of freedom, and the p-value. (You might wish to present this in a table)

    # Run each one-sample t test once and reuse the result
    tt_CWT <- t.test(stocks$return[stocks$symbol == "CWT"])
    tt_KO  <- t.test(stocks$return[stocks$symbol == "KO"])
    tt_RAD <- t.test(stocks$return[stocks$symbol == "RAD"])

    ttesttable <- data.frame(
      t.test.method    = rep("one sample t test (two-tailed)", 3),
      `estimate mean`  = c(tt_CWT$estimate, tt_KO$estimate, tt_RAD$estimate),
      `test statistic` = c(tt_CWT$statistic, tt_KO$statistic, tt_RAD$statistic),
      df               = c(tt_CWT$parameter, tt_KO$parameter, tt_RAD$parameter),
      p.value          = c(tt_CWT$p.value, tt_KO$p.value, tt_RAD$p.value)
    )
    ttesttable %>%
      mutate_if(is.numeric, round, digits = 3) %>%
      kable(caption = "t test table of returns")

Table 3: t test table of returns (rows in the order CWT, KO, RAD)

t.test.method                   estimate.mean  test.statistic  df    p.value
one sample t test (two-tailed)  0.038          1.552           5471  0.121
one sample t test (two-tailed)  0.022          1.246           5471  0.213
one sample t test (two-tailed)  -0.052         -0.877          5471  0.380

The results show that the average returns are not significantly different from zero at the 1% level, because the p-values (0.121, 0.213 and 0.380) are all greater than 0.01.

2.6 Are average returns different from each other? (6 points)

Assume the returns of each asset are independent from each other. With this assumption, are the mean returns statistically different from each other at the 1% level of significance?
Provide details for all 5 steps to conduct each of the hypothesis tests using what you have learned in the unit. Calculate and report all the relevant values for your conclusion and be sure to provide an interpretation of the results. (Hint: You need to discuss the equality of variances to determine which type of test to use.)

The testing procedure
1. The null hypothesis is that all means are equal. The alternative is that at least one mean is not equal.
2. Again, use a level of significance of 0.01.
3. The test statistic is an F statistic:
   $$F = \frac{MSB}{MSW} \sim F_{c-1,\,n-c}$$
   where MSB is the mean square between groups and MSW is the mean square within groups.
4. The critical values are from an F distribution with $c-1$ degrees of freedom in the numerator and $n-c$ degrees of freedom in the denominator. This is a one-tailed test, so we place all of $\alpha = 0.01$ in the upper tail.
5. The decision and interpretation.

Before implementing the test, we should test the equality of variances. We can test for homogeneity of variance using Levene’s test.

    ## Decide on which test is appropriate for testing differences in mean returns
    ## Hint: Include the results of your supporting test for the differences in variances (include all 5 hypothesis step tests and the equation for the test statistics, and a clear interpretation of the result).
    ## Hint: http://www.sthda.com/english/wiki/one-way-anova-test-in-r
    ## So this section has (at least) 2 significance tests.
    qf(0.99, 3-1, 5472 - 3)

    ## [1] 4.60905

    car::leveneTest(return ~ symbol, data = stocks) %>%
      mutate_if(is.numeric, round, digits = 3) %>%
      knitr::kable(caption = "Levene’s test")

Table 4: Levene’s test

       Df     F value   Pr(>F)
group  2      1098.819  0
       16413  NA        NA

The F statistic for the test is 1098.8, which is larger than our critical value of 4.609. The p-value of the test is < 2.2e-16, which is less than our level of significance of 0.01. So we reject the null hypothesis of equal variances and conclude that the return variances are unequal across the three stocks. This result informs our choice of test for differences in mean returns: we use the one-way test that does not assume equal variances (Welch's test).

    oneway.test(return ~ symbol, data = stocks, var.equal = FALSE) %>%
      broom::tidy() %>%
      mutate_if(is.numeric, round, digits = 3) %>%
      knitr::kable(caption = "One-way analysis of means")

Table 5: One-way analysis of means

num.df  den.df    statistic  p.value  method
2       9793.712  0.996      0.369    One-way analysis of means (not assuming equal variances)

We can see that the F statistic is 0.9960773, which is less than our critical value of 4.609. The p-value of the test is 0.3693628, which is larger than our level of significance of 0.01. So we cannot reject the null hypothesis: there is no evidence at the 1% level that the average returns of the three stocks differ.

2.7 Correlations (2 points)

Calculate and present the correlation matrix of the returns. Discuss the direction and strength of the correlations.

    ## Include a formatted correlation matrix here
    ## Hint: http://www.sthda.com/english/wiki/correlation-matrix-a-quick-start-guide-to-analyze-format-and-visualize-a-correlation-matrix-using-r-software
    correlationmat <- cor(data.frame(CWT = stocks$return[stocks$symbol == "CWT"],
                                     KO  = stocks$return[stocks$symbol == "KO"],
                                     RAD = stocks$return[stocks$symbol == "RAD"]),
                          use = "complete.obs")
    kable(round(correlationmat, 3), caption = "correlation matrix")

Table 6: correlation matrix

     CWT    KO     RAD
CWT  1.000  0.328  0.178
KO   0.328  1.000  0.156
RAD  0.178  0.156  1.000

All pairwise correlations are positive but weak: 0.328 between CWT and KO, 0.178 between CWT and RAD, and 0.156 between KO and RAD.

2.8 Testing the significance of correlations (2 points)

Is the assumption of independence of stock returns realistic? Provide evidence (the hypothesis test including all 5 steps of the hypothesis test and the equation for the test statistic) and a rationale to support your conclusion.

The testing procedure
1. The null hypothesis, for each pair of stocks, is that the returns are uncorrelated ($\rho = 0$). The alternative is that the pair is correlated ($\rho \neq 0$).
2. Again, use a level of significance of 0.01.
3. The test statistic is a t statistic:
   $$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} \sim t_{n-2}$$
   where $r$ is the sample correlation and $n$ is the number of paired observations of the x and y variables.
4. The critical values are from a t distribution with $n-2$ degrees of freedom. This is a two-tailed test.
5. The decision and interpretation.

    ## Report the results of tests for statistical significance of the correlations here.
    ## Hint: http://www.sthda.com/english/wiki/correlation-matrix-a-quick-start-guide-to-analyze-format-and-visualize-a-correlation-matrix-using-r-software
    # ++++++++++++++++++++++++++++
    # flattenCorrMatrix
    # ++++++++++++++++++++++++++++
    # cormat : matrix of the correlation coefficients
    # pmat : matrix of the correlation p-values
    flattenCorrMatrix <- function(cormat, pmat) {
      ut <- upper.tri(cormat)
      data.frame(
        row = rownames(cormat)[row(cormat)[ut]],
        column = rownames(cormat)[col(cormat)[ut]],
        cor = (cormat)[ut],
        p = pmat[ut]
      )
    }
    res2 <- Hmisc::rcorr(as.matrix(data.frame(CWT = stocks$return[stocks$symbol == "CWT"],
                                              KO  = stocks$return[stocks$symbol == "KO"],
                                              RAD = stocks$return[stocks$symbol == "RAD"])))
    kable(flattenCorrMatrix(res2$r, res2$P), caption = "correlation test")

Table 7: correlation test

row  column  cor        p
CWT  KO      0.3281730  0
CWT  RAD     0.1781285  0
KO   RAD     0.1557560  0

Since all three p-values are reported as 0 (smaller than the displayed precision) and are therefore below 0.01, we reject the null hypothesis for every pair and conclude that the assumption of independence of stock returns is not realistic.

2.9 Advising an investor (12 points)

Suppose that an investor has asked you to assist them in choosing two of these three stocks to include in their portfolio.
The portfolio is defined by

$$r = w_1 r_1 + w_2 r_2$$

where $r_1$ and $r_2$ represent the returns from the first and second stock, respectively, and $w_1$ and $w_2$ represent the proportion of the investment placed in each stock. The entire investment is allocated between the two stocks, so $w_1 + w_2 = 1$. The investor favours the combination of stocks that provides the highest return, but dislikes risk. Thus the investor’s happiness is a function of the portfolio, r:

$$h(r) = E(r) - Var(r)$$

where $E(r)$ is the expected return of the portfolio, and $Var(r)$ is the variance of the portfolio. [1]

[1] Note that $E(r) = w_1 E(r_1) + w_2 E(r_2)$, and $Var(r) = w_1^2 Var(r_1) + w_2^2 Var(r_2) + 2 w_1 w_2 Cov(r_1, r_2)$.

Given your values for $E(r_1)$, $E(r_2)$, $Var(r_1)$, $Var(r_2)$ and $Cov(r_1, r_2)$, which portfolio would you recommend to the investor? What is the expected return to this portfolio? Provide evidence to support your answer, including all the steps undertaken to arrive at the result. (Hint: review your notes from tutorial 6 on portfolio optimisation. A complete answer will include the optimal weights for each possible portfolio (pair of stocks) and the expected return for each of these portfolios.)

Portfolio 1: CWT and KO

Let $r_1$ be the CWT return and $r_2$ be the KO return. From the results above, $E(r_1) = 0.038$, $E(r_2) = 0.022$, $Var(r_1) = 3.337$, $Var(r_2) = 1.768$ and $Cov(r_1, r_2) = 0.797$.

Choose the optimal $w_1$ and $w_2$:

$$y = E(r) - Var(r) = w_1 E(r_1) + w_2 E(r_2) - w_1^2 Var(r_1) - w_2^2 Var(r_2) - 2 w_1 w_2 Cov(r_1, r_2)$$
$$= 0.038 w_1 + 0.022 w_2 - 3.337 w_1^2 - 1.768 w_2^2 - 2(0.797) w_1 w_2$$

Since $w_2 = 1 - w_1$, this becomes $y = -3.511 w_1^2 + 1.958 w_1 - 1.746$, where the coefficient on $w_1$ is $(0.038 - 0.022) + 2(1.768) - 2(0.797) = 1.958$. By the property of quadratic functions, y is at its maximum when $w_1 = -1.958/(-2 \times 3.511) = 0.279$. Thus the optimal weights are $w_1 = 0.279$ and $w_2 = 1 - 0.279 = 0.721$.

The expected return of this portfolio is $E(r) = 0.279 \times 0.038 + 0.721 \times 0.022 = 0.026$.

Portfolio 2: CWT and RAD

Let $r_1$ be the CWT return and $r_2$ be the RAD return. From the results above, $E(r_1) = 0.038$, $E(r_2) = -0.052$, $Var(r_1) = 3.337$, $Var(r_2) = 19.024$ and $Cov(r_1, r_2) = 1.419$.

Choose the optimal $w_1$ and $w_2$:

$$y = E(r) - Var(r) = w_1 E(r_1) + w_2 E(r_2) - w_1^2 Var(r_1) - w_2^2 Var(r_2) - 2 w_1 w_2 Cov(r_1, r_2)$$
$$= 0.038 w_1 - 0.052 w_2 - 3.337 w_1^2 - 19.024 w_2^2 - 2(1.419) w_1 w_2$$

Since $w_2 = 1 - w_1$, this becomes $y = -19.523 w_1^2 + 35.3 w_1 - 19.076$, where the coefficient on $w_1^2$ is $-(3.337 + 19.024 - 2 \times 1.419) = -19.523$. By the property of quadratic functions, y is at its maximum when $w_1 = -35.3/(-2 \times 19.523) = 0.904$. Thus the optimal weights are $w_1 = 0.904$ and $w_2 = 1 - 0.904 = 0.096$.

The expected return of this portfolio is $E(r) = 0.904 \times 0.038 - 0.096 \times 0.052 = 0.029$.

Portfolio 3: KO and RAD

Let $r_1$ be the KO return and $r_2$ be the RAD return. From the results above, $E(r_1) = 0.022$, $E(r_2) = -0.052$, $Var(r_1) = 1.768$, $Var(r_2) = 19.024$ and $Cov(r_1, r_2) = 0.903$.

Choose the optimal $w_1$ and $w_2$:

$$y = E(r) - Var(r) = w_1 E(r_1) + w_2 E(r_2) - w_1^2 Var(r_1) - w_2^2 Var(r_2) - 2 w_1 w_2 Cov(r_1, r_2)$$
$$= 0.022 w_1 - 0.052 w_2 - 1.768 w_1^2 - 19.024 w_2^2 - 2(0.903) w_1 w_2$$

Since $w_2 = 1 - w_1$, this becomes $y = -18.986 w_1^2 + 36.316 w_1 - 19.076$. By the property of quadratic functions, y is at its maximum when $w_1 = -36.316/(-2 \times 18.986) = 0.956$. Thus the optimal weights are $w_1 = 0.956$ and $w_2 = 1 - 0.956 = 0.044$.

The expected return of this portfolio is $E(r) = 0.956 \times 0.022 - 0.044 \times 0.052 = 0.019$.

Ranked by expected return alone, the best portfolio is CWT and RAD, with an expected return of 0.029. Its components are shown in Table 8.

    cov(stocks$return[stocks$symbol == "CWT"],
        stocks$return[stocks$symbol == "KO"], use = "complete.obs")

    ## [1] 0.7970251

    cov(stocks$return[stocks$symbol == "CWT"],
        stocks$return[stocks$symbol == "RAD"], use = "complete.obs")

    ## [1] 1.419271

    cov(stocks$return[stocks$symbol == "KO"],
        stocks$return[stocks$symbol == "RAD"], use = "complete.obs")

    ## [1] 0.9032068

    # You can use this section to create a table of your results.
    tibble("Stock" = c("CWT", "RAD"),
           "Returns" = c(0.038, -0.052),
           "Variances" = c(3.337, 19.024),
           "Weights" = c(0.904, 0.096),
           "Return*Weight" = Returns * Weights) %>%
      kable(caption = "Best Portfolio Stock Returns", label = "stocks")

Table 8: Best Portfolio Stock Returns

Stock  Returns  Variances  Weights  Return*Weight
CWT    0.038    3.337      0.904    0.034352
RAD    -0.052   19.024     0.096    -0.004992

2.10 The impact of financial events on returns (6 points)

Two significant financial events have occurred in recent history. On September 15, 2008 Lehman Brothers declared bankruptcy and a Global Financial Crisis started. On March 11, 2020 the WHO declared COVID-19 a pandemic. Use linear regression to determine if

a. Any of the stocks in your data exhibit positive returns over time.
b. Either of the two events had a significant impact on returns.

Report the regression output for each stock and interpret the results to address these two questions. How would you interpret this information in the context of your chosen portfolio?

    ## Add a column to your returns data set.
    ## This is a factor variable with three levels:
    ## 'Lehman Bankruptcy' for the date 2008-09-15,
    ## 'Pandemic' for the date 2020-03-11, and
    ## 'BAU' (Business as usual) for all other dates.
    finacialevent <- rep("BAU", nrow(stocks))
    finacialevent[stocks$date == "2008-09-15"] <- 'Lehman Bankruptcy'
    finacialevent[stocks$date == "2020-03-11"] <- 'Pandemic'
    stocks <- cbind(stocks, data.frame(finacialevent))

    ## Then run a regression analysis to determine whether returns to each stock are increasing over time and if the events had a statistically significant impact on the returns of each stock.
    CWT <- stocks[stocks$symbol == "CWT", ] %>% drop_na()
    KO  <- stocks[stocks$symbol == "KO", ] %>% drop_na()
    RAD <- stocks[stocks$symbol == "RAD", ] %>% drop_na()

    CWT.lm <- lm(return ~ date + finacialevent, CWT)
    summary(CWT.lm)

    ##
    ## Call:
    ## lm(formula = return ~ date + finacialevent, data = CWT)
    ##
    ## Residuals:
    ##      Min       1Q   Median       3Q      Max
    ## -12.3224  -0.9246   0.0246   0.9586  25.6784
    ##
    ## Coefficients:
    ##                                  Estimate Std. Error t value Pr(>|t|)
    ## (Intercept)                    -6.051e-02  1.625e-01  -0.372    0.710
    ## date                            6.743e-06  1.076e-05   0.627    0.531
    ## finacialeventLehman Bankruptcy -2.253e+00  1.824e+00  -1.235    0.217
    ## finacialeventPandemic          -7.945e+00  1.824e+00  -4.355 1.35e-05 ***
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## Residual standard error: 1.824 on 5468 degrees of freedom
    ## Multiple R-squared:  0.003787, Adjusted R-squared:  0.003241
    ## F-statistic:  6.93 on 3 and 5468 DF,  p-value: 0.0001186

    KO.lm <- lm(return ~ date + finacialevent, KO)
    summary(KO.lm)

    ##
    ## Call:
    ## lm(formula = return ~ date + finacialevent, data = KO)
    ##
    ## Residuals:
    ##      Min       1Q   Median       3Q      Max
    ## -10.6083  -0.5715   0.0139   0.6057  12.9790
    ##
    ## Coefficients:
    ##                                  Estimate Std. Error t value Pr(>|t|)
    ## (Intercept)                    -6.969e-02  1.184e-01  -0.588   0.5563
    ## date                            6.195e-06  7.840e-06   0.790   0.4294
    ## finacialeventLehman Bankruptcy  4.397e-01  1.329e+00   0.331   0.7408
    ## finacialeventPandemic          -2.783e+00  1.330e+00  -2.093   0.0364 *
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## Residual standard error: 1.329 on 5468 degrees of freedom
    ## Multiple R-squared:  0.0009225, Adjusted R-squared:  0.0003744
    ## F-statistic: 1.683 on 3 and 5468 DF,  p-value: 0.1684

    RAD.lm <- lm(return ~ date + finacialevent, RAD)
    summary(RAD.lm)

    ##
    ## Call:
    ## lm(formula = return ~ date + finacialevent, data = RAD)
    ##
    ## Residuals:
    ##     Min      1Q  Median      3Q     Max
    ## -38.210  -1.802   0.037   1.768  35.524
    ##
    ## Coefficients:
    ##                                  Estimate Std. Error t value Pr(>|t|)
    ## (Intercept)                    -1.393e-01  3.882e-01  -0.359 0.719721
    ## date                            6.052e-06  2.570e-05   0.236 0.813818
    ## finacialeventLehman Bankruptcy  1.059e+00  4.358e+00   0.243 0.808035
    ## finacialeventPandemic          -1.638e+01  4.358e+00  -3.759 0.000173 ***
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## Residual standard error: 4.357 on 5468 degrees of freedom
    ## Multiple R-squared:  0.002593, Adjusted R-squared:  0.002045
    ## F-statistic: 4.738 on 3 and 5468 DF,  p-value: 0.00265

The coefficients of date are positive for all three stocks (6.743e-06, 6.195e-06 and 6.052e-06), but none of them is statistically significant (p-values 0.531, 0.429 and 0.814), so the regressions give no statistical evidence that the returns increase over time. Of the two events, only the coefficient of the Pandemic event has a p-value below 0.05 (1.35e-05 for CWT, 0.0364 for KO and 0.000173 for RAD), so we can reject the null hypothesis of no effect and conclude that the Pandemic event had a significant negative impact on returns. The Lehman Bankruptcy coefficients are not significant for any stock (p-values 0.217, 0.741 and 0.808).
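The regression set-up in section 2.10 can be sketched in a self-contained way with base R. This is an illustration under stated assumptions, not the project's data: the data frame `sim` holds simulated weekday returns, and the names `sim`, `event`, `fit` and `coefs` are hypothetical. It shows how the event factor is built and how the coefficient table can be compared with a chosen significance level. Note that each event level marks a single trading day, so its coefficient measures how far that one day's return lies from the fitted trend line.

```r
# Simulated weekday returns standing in for one stock (assumption, not real data)
set.seed(42)
dates <- seq(as.Date("2000-01-03"), as.Date("2021-10-01"), by = "day")
dates <- dates[as.POSIXlt(dates)$wday %in% 1:5]   # keep Monday to Friday
sim <- data.frame(date = dates,
                  return = rnorm(length(dates), mean = 0.02, sd = 1.5))

# Factor with three levels; "BAU" comes first so it is the reference level
sim$event <- "BAU"
sim$event[sim$date == as.Date("2008-09-15")] <- "Lehman Bankruptcy"
sim$event[sim$date == as.Date("2020-03-11")] <- "Pandemic"
sim$event <- factor(sim$event,
                    levels = c("BAU", "Lehman Bankruptcy", "Pandemic"))

# Returns regressed on a time trend and the event factor
fit <- lm(return ~ date + event, data = sim)
coefs <- coef(summary(fit))

# Compare every p-value with the chosen level of significance
alpha <- 0.05
result <- data.frame(estimate = coefs[, "Estimate"],
                     p.value = coefs[, "Pr(>|t|)"],
                     significant = coefs[, "Pr(>|t|)"] < alpha)
print(result)
```

Because the simulated returns carry no event effect, the printed estimates are not comparable to the project's results; the sketch only shows the mechanics of building the factor and reading the coefficient table.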

