MS5217 Statistical Data Analysis
Individual Take-Home Final Project
Prof. Gavin Feng
Due on Oct 19th, 11:59 pm
Guidance
1. Please upload your PDF submission to Canvas before Oct 19th, 11:59 pm.
2. Please put your name, student number, and CityU email at the top of the first page.
3. If you have any general questions, please post them on the Canvas discussion board before Oct 17th, 11:59 pm.
Project Question (100 points)
Suppose you are a quantitative consultant specializing in mutual fund performance evaluation. Your task is
to introduce a CAPM-based assessment that identifies the five best and five worst funds out of 200. We have
provided the net-of-fee return data of these 200 funds for the previous ten years, as well as the
excess market return.
The goal is to advise clients to select the five best funds (and avoid the five worst) for the next ten years
(the hold-out data). The clients value market-adjusted performance, as measured by Jensen’s alpha. Therefore,
we are not looking for total performance but for market-adjusted performance. The clients also emphasize the
asymmetric relationship between the fund return and the market return and want to see how the market beta
differs between bull and bear markets.
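As an illustration of market-adjusted ranking, the sketch below estimates Jensen's alpha for each fund as the intercept of a regression of the fund return on the excess market return, then sorts funds by that intercept. It is a minimal sketch in Python rather than R, on simulated data: the fund IDs, the 12-fund universe, and all return parameters are invented for the example, and `capm_alpha_beta` is a hypothetical helper name. The handout does not say whether fund returns need a risk-free adjustment, so none is applied here.

```python
import random
from statistics import fmean

def capm_alpha_beta(fund_ret, mkt):
    """Return (alpha, beta) from a one-regressor OLS of fund_ret on mkt."""
    mean_r, mean_m = fmean(fund_ret), fmean(mkt)
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(mkt, fund_ret))
    var = sum((m - mean_m) ** 2 for m in mkt)
    beta = cov / var
    alpha = mean_r - beta * mean_m  # Jensen's alpha is the regression intercept
    return alpha, beta

# Simulated data (ASSUMPTION): 120 monthly observations, 12 made-up funds.
random.seed(1)
mkt = [random.gauss(0.005, 0.04) for _ in range(120)]
funds = {}
for fund_id in range(1, 13):
    true_alpha, true_beta = random.gauss(0.0, 0.002), random.uniform(0.6, 1.4)
    funds[fund_id] = [true_alpha + true_beta * m + random.gauss(0.0, 0.01) for m in mkt]

# Rank funds from highest to lowest estimated alpha.
alphas = {fid: capm_alpha_beta(ret, mkt)[0] for fid, ret in funds.items()}
ranked = sorted(alphas, key=alphas.get, reverse=True)
best5, worst5 = ranked[:5], ranked[-5:]
print("best five:", best5)
print("worst five:", worst5)
```

Ranking on the point estimate alone ignores estimation noise; a report would normally also show standard errors or t-statistics for each alpha.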
You will use the “fund_sel.R” file to select 50 funds for your study, so everyone has a different sample and
selects different funds :)
For the five best and five worst funds you recommend on market-adjusted performance, you need to provide
statistical and regression analysis of the equally weighted 5-fund best (or worst) portfolio. [Hint: An
equally weighted portfolio of five funds has the return $0.2\,r_{1,t} + 0.2\,r_{2,t} + 0.2\,r_{3,t} + 0.2\,r_{4,t} + 0.2\,r_{5,t}$.]
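The hint's formula amounts to averaging the five fund returns period by period. A minimal Python sketch follows; the function name `equal_weight_portfolio` and the three periods of returns are made up for illustration.

```python
def equal_weight_portfolio(fund_returns):
    """Average the funds' returns period by period (weight 1/n on each fund)."""
    lengths = {len(series) for series in fund_returns}
    if len(lengths) != 1:
        raise ValueError("all funds must cover the same periods")
    n = len(fund_returns)
    return [sum(period) / n for period in zip(*fund_returns)]

# Three periods of made-up returns for five funds (one inner list per fund).
five_funds = [
    [0.01, 0.02, -0.01],
    [0.03, 0.00, 0.01],
    [0.00, 0.01, 0.02],
    [0.02, -0.01, 0.00],
    [-0.01, 0.03, 0.03],
]
portfolio = equal_weight_portfolio(five_funds)
print(portfolio)
```

With five funds, `1/n` equals the 0.2 weight in the hint; the resulting series is what the regressions are then run on.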
1. Report Writing (30 points)
You are going to prepare an analysis report for the clients, most of whom only expect to see simple statistical
analysis with figures and tables and no code. Most importantly, the report should be complete and
professional.
• In your report, please include useful figures and tables for the empirical illustration.
• Please limit your report to 6 pages.
• Please do not show us your code or unprocessed R results. Please be professional!
• You can use any software to prepare the report. If you are really good at Excel, I don’t mind.
2. Empirical Analysis (50 points)
We suggest that you justify your best and worst fund recommendations by following the question map below.
(a) What are the details of your fund selection implementation?
(b) Why do you choose these ten funds as having the best or worst market-adjusted performance? Are there
any summary statistics (tables or figures) as evidence?
(c) Could you create a dummy variable for the bull and bear markets? For example, you can
create a variable Bull(t) = Mkt(t-1) > 0. When the lagged market return is positive, we define period t
as a bull market.
$r_{i,t} = \alpha_i + \beta_1 \, mkt_t + \beta_2 \, Bull(t) \cdot mkt_t + \epsilon_{i,t}$
Given the selected best five funds, we want to evaluate the performance of the equally weighted 5-fund
portfolio. [Hint: you can calculate the returns of the best 5-fund portfolio and then run your models on
these portfolio returns.]
(d) We want to confirm the asymmetric effect on the CAPM beta. Could you provide a 95% confidence interval
for the bull-bear beta difference of your best 5-fund portfolio?
(e) Is there any bull-bear asymmetry in the alpha of your best 5-fund portfolio? Could we use the
above regression to answer this question? Could you test it separately at the 5% level?
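The dummy construction in (c) and the interval in (d) can be sketched as follows. This is an illustrative Python sketch on simulated returns, not a prescribed solution: the helper names `invert` and `ols`, the seed, and the parameter values (0.002, 0.8, 0.3) are all assumptions made for the example. It uses classical homoskedastic OLS standard errors and the normal critical value (about 1.96) as a large-sample stand-in for the exact t quantile.

```python
import random
from statistics import NormalDist

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols(X, y):
    """Return (coefficients, classical standard errors) for y = X b + e."""
    n, k = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(c * x for c, x in zip(coef, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return coef, se

# Simulated market and portfolio returns (ASSUMPTION: made-up parameters).
random.seed(7)
mkt = [random.gauss(0.005, 0.04) for _ in range(121)]
# Bull(t) = 1 when the previous period's market return was positive.
bull = [1.0 if mkt[t - 1] > 0 else 0.0 for t in range(1, len(mkt))]
mkt_t = mkt[1:]  # the first period has no lag, so it is dropped
port = [0.002 + 0.8 * m + 0.3 * b * m + random.gauss(0.0, 0.01)
        for m, b in zip(mkt_t, bull)]

# Columns: intercept (alpha), mkt_t (beta_1), Bull(t) * mkt_t (beta_2).
X = [[1.0, m, b * m] for m, b in zip(mkt_t, bull)]
coef, se = ols(X, port)
z = NormalDist().inv_cdf(0.975)  # normal approximation to the t critical value
ci_low, ci_high = coef[2] - z * se[2], coef[2] + z * se[2]
print(f"beta_2 = {coef[2]:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

In this specification the bear-market beta is `beta_1` and the bull-market beta is `beta_1 + beta_2`, so the interval for `beta_2` is the interval for the bull-bear beta difference; if it excludes zero, the asymmetry is significant at the 5% level.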
3. Portfolio Performance (20 points)
The last part of your grade is based on the class ranking of your portfolio performance in the hold-out data.
Remember, the best historical performance does not guarantee good future performance. Investing is
ruthless and winner takes all. In this part, the best performance gets 100% and the worst gets 0%.
(f) The asymmetric CAPM alpha: we are going to use the above asymmetric CAPM for this evaluation to
reflect the bull and bear markets.
We will calculate this alpha for your best 5-fund and worst 5-fund portfolios in the hold-out data. For the
former, we expect high values; for the latter, we expect low values.
Result Submission
In addition to the report, you need to submit your fund choices for the hold-out data, which contains the
simulated future 10 years of observations. Only I have the data :)
Specifically, you need to upload a separate “csv” file with the exact file name “MS5217_studentid”, where
“studentid” should be replaced with your CityU student number. If your ID is 12345678, the file name is
“MS5217_12345678.csv”. The file contains 5 rows and 2 columns, where the columns correspond to the best
and worst fund IDs. Before you submit the “csv” file, please run the program “check_csv.R” yourself.
Please be careful. Any deviation from the required format or file name incurs a 10-point penalty. As long as
your output file passes the program “check_csv.R”, it should work.
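As a rough illustration of the file layout described above, the Python sketch below writes five best and five worst fund IDs side by side under the required file name. The header names `best` and `worst`, the presence of a header row at all, and the function name `write_choices` are assumptions; the handout does not specify them, so “check_csv.R” remains the authority on the exact format. The fund IDs are placeholders, not recommendations.

```python
import csv
from pathlib import Path

def write_choices(student_id, best, worst, folder="."):
    """Write the best and worst fund IDs as two columns; return the file path."""
    if len(best) != 5 or len(worst) != 5:
        raise ValueError("exactly five best and five worst fund IDs are required")
    if set(best) & set(worst):
        raise ValueError("a fund cannot be both best and worst")
    path = Path(folder) / f"MS5217_{student_id}.csv"
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["best", "worst"])  # ASSUMED header names, verify with check_csv.R
        writer.writerows(zip(best, worst))
    return path

# Placeholder IDs for illustration only.
out = write_choices("12345678", [3, 17, 42, 88, 150], [5, 29, 61, 97, 199])
print(out.name)
```

Whatever tool produces the file, running “check_csv.R” on it before uploading is the only reliable confirmation that the format is accepted.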