IEOR 4150 Introduction to Probability and Statistics Fall 2022
Dr. A. B. Dieker
Project
1 Introduction
The objective of this project is to make the statistical tools from the second part of this class come
“alive” with real data.
The syllabus contains the following information about the project:
Students are required to do a group project using financial data. The project will
require you to do some computer programming. This is a ‘hacking’ project more so
than a ‘coding’ project, and it’s totally fine if you don’t have much coding experience.
It is your choice to use R or Python. At the end of the semester, each team is required
to submit a report on your project as well as a Jupyter notebook. The precise deadline
will be set later in the semester. Late submissions can receive a zero grade. The report
cannot exceed 6 pages (excluding cover page, references, and the Jupyter notebook).
You are also required to submit your Jupyter notebook. All team members are expected
to contribute to all aspects of the project work (statistical analysis, coding, writing).
It also contains the following information on how your project will be graded:
The project grade is based on four equally weighted components: correctness, content,
style/mechanics (for the report), and whether the Jupyter notebook satisfies the requirements.
Each of the team members receives the same project grade, but the final
exam will have questions on the project so your course grade will take a hit if you don’t
pull your weight in the project.
It’s totally fine if you don’t have much coding experience. It will not be held against you as long as
your project satisfies the requirements detailed in this document. The deadline is specified under
‘submission’ below.
2 Your team
Create a team with a total of three members. You sign up your group through this link. If you
do not have three members yet, please sign up on the right-hand side of that sheet and reach out
to any students with incomplete teams. Your team needs to be complete by Thu 11/10, 11:59pm.
You are responsible for being part of a team by that date and time; we will not make any team
assignments. It is a requirement to have three members in a team. Turning your team assignment
into a staring contest because you want to work by yourself or with only one other person is not a
good idea; I will not blink.
3 Data
Choose five ticker symbols from a single sector. You need to choose the sector and you need
to choose the stocks. Choose the time frame for your data; for instance, you might be particularly
interested in tech stocks around the dot-com bubble, biotech stocks during the coronavirus market
turbulence, etc. Try to be original in all of these choices. (These examples are no longer original,
since I have given them to you.) Your data set needs to include at least 50 trading days per stock
symbol.
Download your data using pandas (if you use Python) or quantmod (if you use R). Be sure that
your data includes both the open and close prices.
You need to use log-returns for each of your stock symbols, and you may assume that each stock
symbol’s log-returns constitute a random sample. The log-return is defined as
$$\log\left(\frac{S^{\text{close}}_t}{S^{\text{open}}_t}\right)$$
where $S^{\text{close}}_t$ and $S^{\text{open}}_t$ are the close and open prices, respectively, on day $t$. (Here log is the natural
log.)
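As an illustration of the definition above, here is a minimal sketch that computes log-returns from paired open and close prices. It uses only the Python standard library so that it runs anywhere; in a notebook you would more likely hold the prices in a pandas DataFrame. The function name `log_returns` and the prices are made up for this example and are not real market data.

```python
import math


def log_returns(open_prices, close_prices):
    """Return log(close_t / open_t) for each trading day t (natural log)."""
    if len(open_prices) != len(close_prices):
        raise ValueError("open and close series must have the same length")
    return [math.log(c / o) for o, c in zip(open_prices, close_prices)]


# Made-up prices for three trading days (illustration only).
opens = [100.0, 101.5, 99.8]
closes = [101.0, 100.9, 99.8]

returns = log_returns(opens, closes)
for day, r in enumerate(returns, start=1):
    print(f"day {day}: log-return = {r:.6f}")
```

A day on which the close equals the open has a log-return of exactly zero, and a day on which the stock falls has a negative log-return.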
4 Jupyter notebook
Create a Jupyter notebook, using either Python or R as your programming language. Code as
cleanly as you can. Document your code.
It should be possible for the user to change stock symbols, confidence levels, etc., by changing
variables (i.e., without having to dive into the details of the code).
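One possible way to meet this requirement is a single configuration cell at the top of the notebook, which every later cell reads from. The sketch below is only a suggestion: all names, ticker symbols, and dates in it are hypothetical placeholders, not recommended choices.

```python
# Hypothetical configuration cell: every name and value here is a placeholder.
SYMBOLS = ["AAA", "BBB", "CCC", "DDD", "EEE"]  # five tickers from one sector
START_DATE = "2000-01-03"                      # first day of the time frame
END_DATE = "2000-06-30"                        # last day of the time frame
CONFIDENCE_LEVEL = 0.95                        # used for all confidence intervals


def check_config(symbols, confidence_level):
    """Fail early if the settings cannot satisfy the project requirements."""
    if len(set(symbols)) != 5:
        raise ValueError("exactly five distinct ticker symbols are required")
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence level must lie strictly between 0 and 1")


check_config(SYMBOLS, CONFIDENCE_LEVEL)
```

Later cells would then refer to `SYMBOLS` and `CONFIDENCE_LEVEL` rather than hard-coding values, so a user only edits this one cell.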
Your Jupyter notebook needs to have the following capabilities:
• Given one stock symbol, your notebook needs to be able to: (1) Display histograms for your
data (i.e., log-returns) by stock symbol. (2) Display a normal probability plot to see if the
data is approximately normal. (3) Create (approximate) confidence intervals for the means
and variances given a confidence level. (4) Perform a regression of the log-return on time
(i.e., time on the horizontal axis and log-return on the vertical axis).
• Given two stock symbols, your notebook needs to be able to: (1) Test the equality of the two
population means. (2) Perform a regression of one log-return on the other.
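The interval and test capabilities listed above can be sketched with the standard library alone. This is one possible approach under stated assumptions, not the required method: the mean interval and the two-sample test use the large-sample normal approximation, the variance interval assumes approximately normal data and replaces exact chi-square quantiles with the Wilson-Hilferty approximation (an exact quantile is available from `scipy.stats.chi2.ppf`), and the test treats the two samples as independent, which you should justify or replace for stocks observed on the same days. All function names and the data are made up for illustration.

```python
import math
from statistics import NormalDist, mean, variance

STD_NORMAL = NormalDist()  # standard normal distribution


def mean_ci(sample, confidence=0.95):
    """Large-sample (normal approximation) confidence interval for the mean."""
    z = STD_NORMAL.inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(variance(sample) / len(sample))
    center = mean(sample)
    return center - half_width, center + half_width


def chi2_quantile(p, dof):
    """Wilson-Hilferty approximation to the chi-square quantile."""
    z = STD_NORMAL.inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def variance_ci(sample, confidence=0.95):
    """Approximate chi-square interval for the variance (assumes normal data)."""
    dof = len(sample) - 1
    alpha = 1.0 - confidence
    scaled = dof * variance(sample)  # (n - 1) * s^2
    return (scaled / chi2_quantile(1.0 - alpha / 2, dof),
            scaled / chi2_quantile(alpha / 2, dof))


def two_sample_z_test(x, y):
    """Two-sided large-sample test of equal means for independent samples."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = (mean(x) - mean(y)) / se
    p_value = 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    return z, p_value


# Made-up "log-returns" for two symbols, 60 days each (illustration only).
x_returns = [0.001 * ((7 * i) % 11 - 5) for i in range(60)]
y_returns = [0.001 * ((5 * i) % 13 - 6) for i in range(60)]

print("mean CI:    ", mean_ci(x_returns, 0.95))
print("variance CI:", variance_ci(x_returns, 0.95))
print("z, p-value: ", two_sample_z_test(x_returns, y_returns))
```

With at least 50 trading days per symbol, the normal approximation for the mean is usually reasonable; the variance interval is more sensitive to non-normality, which is one reason to look at the normal probability plot first.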
All regression output needs to include intercept and slope estimates, a diagram of the data with
the least-squares line, a graphical depiction of residuals, and $R^2$.
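The numerical part of the regression output can be sketched as follows. This is a minimal least-squares fit written with the standard library; the function name `simple_regression` and the data are made up for illustration, and the two required plots (data with the fitted line, and residuals) are left out because they depend on your plotting library.

```python
from statistics import mean


def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, residuals, r_squared).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)        # unexplained variation
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)    # total variation
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, residuals, r_squared


# Made-up data: time index on the horizontal axis, a response on the vertical.
time_index = [0, 1, 2, 3, 4, 5]
response = [0.010, -0.004, 0.007, 0.001, -0.006, 0.003]

slope, intercept, residuals, r_squared = simple_regression(time_index, response)
print(f"slope = {slope:.6f}, intercept = {intercept:.6f}, R^2 = {r_squared:.4f}")
```

The same function covers both regression capabilities: pass the day index as `x` for the regression on time, or one symbol's log-returns as `x` and another's as `y` for the two-symbol regression.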
5 Report
In your report, make sure you: (1) Describe your data set. (2) Illustrate the functionality of your
Jupyter notebook with examples, namely one from each of the six required notebook capabilities
above. (3) Interpret your output and draw conclusions.
You have to choose what you want to show, such as which of the ticker symbols you use to illustrate
your notebook’s capabilities. I do not want to see endless numbers of figures or tables. Make
what you write count! Short is often better. Keep the page limit in mind.
6 Getting help
Please reach out to Zitong at zw2690@columbia.edu if you need help.
7 Submission
You need to submit a Jupyter notebook and a report.
1. Your Jupyter notebook. The file name of your Jupyter notebook should be UNI1_UNI2_UNI3.ipynb,
where UNI1, UNI2, and UNI3 are the UNIs of the team members in alphabetical order. You
cannot reference any external data files in your code; it needs to be possible to execute the
notebook on my computer. The first lines of your notebook need to specify (in a code comment
or text field) the packages/libraries needed to run your code. You may assume I will
have those installed on my machine.
2. Your report in PDF format. The file name of your report should be UNI1_UNI2_UNI3.pdf,
where UNI1, UNI2, and UNI3 are the UNIs of the team members in alphabetical order. Your
cover sheet needs to include the names and UNIs of everybody in the team.
The deadline for submission is 12/9, 11:59pm EST. You need to submit on Courseworks (not
Gradescope). Exactly one person per team should submit, and this person can submit multiple
times before the due date. Should we receive multiple submissions, then only your team’s last
submission will be graded.

