EE104 Module 6 Project Ideas
Version 04/06/2021
(This is a working file, i.e. more projects will be added frequently. Check back often for the latest version.)
If you don’t have a project idea yet, maybe one of the ideas below will spark your interest.
Table of Contents
1 - Trend Prediction
2 - Data Science
3 - Data Science – Data Relationship
4 - Data Science Statistics
Do Problem #1 (60%) and any one of Problems #2, #3, or #4 (40%). No extra credit.
1 - Trend Prediction
Search for COVID-19 data in CSV file format or any format that can be converted to CSV.
Select 3 months' worth of data (e.g. March to May). Plot a selected data criterion for those 3 months and
predict the trend for the next 3 months into the future (e.g. June to August). Compare your prediction
with reality, stating whether your prediction is correct or not. Research and present the reasons for the
match or mismatch between your prediction and the actual data.
Submit source(s), code, and output screenshots to Canvas.
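One possible starting point for the prediction step is a straight-line least-squares fit that is extended past the last observed day. The sketch below uses only the Python standard library. The column names `day` and `new_cases`, the helper names, and the sample numbers are all hypothetical placeholders, not real COVID-19 data; a real trend may need a different model (e.g. polynomial or exponential).

```python
import csv
import io

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def extrapolate(slope, intercept, start, count):
    """Predict `count` future values for x = start, start + 1, ..."""
    return [slope * x + intercept for x in range(start, start + count)]

def mean_abs_error(predicted, actual):
    """Average absolute gap between the prediction and reality."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Synthetic placeholder data standing in for a downloaded CSV file.
# Column names are assumptions; replace them with those in your data set.
SAMPLE_CSV = """day,new_cases
0,10
1,12
2,14
3,16
4,18
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
days = [int(r["day"]) for r in rows]
cases = [float(r["new_cases"]) for r in rows]

slope, intercept = fit_line(days, cases)
forecast = extrapolate(slope, intercept, start=len(days), count=3)
print("slope:", slope, "intercept:", intercept)
print("forecast for the next 3 days:", forecast)
```

Once the actual values for the forecast period are known, `mean_abs_error(forecast, actual_values)` gives a single number to support the match/mismatch discussion.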
2 - Data Science
Download a CSV file from this website: http://www.creditriskanalytics.net/datasets-private2.html
Write a Python program to analyze the risk factors that cause loan defaults and provide a report to
the bank for 3 groups: low risk, medium risk, and high risk.
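One way to approach the three-group report is to compute the default rate for each value of a candidate risk factor and then bucket those rates into tiers. The sketch below is only an illustration under stated assumptions: the field names `employment` and `default`, the cut-off values 10% and 30%, and the records themselves are hypothetical and do not come from the data set on the website.

```python
from collections import defaultdict

def default_rate_by(records, factor, target="default"):
    """Return {factor value: fraction of records that defaulted}."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for rec in records:
        key = rec[factor]
        totals[key] += 1
        defaults[key] += int(rec[target])
    return {key: defaults[key] / totals[key] for key in totals}

def risk_tier(rate, low_cut=0.10, high_cut=0.30):
    """Map a default rate to a tier. The cut-offs are assumed values."""
    if rate < low_cut:
        return "low"
    if rate < high_cut:
        return "medium"
    return "high"

# Made-up records for illustration only (1 = defaulted, 0 = repaid).
records = (
    [{"employment": "salaried", "default": 0}] * 19
    + [{"employment": "salaried", "default": 1}] * 1
    + [{"employment": "self_employed", "default": 0}] * 8
    + [{"employment": "self_employed", "default": 1}] * 2
    + [{"employment": "unemployed", "default": 0}] * 3
    + [{"employment": "unemployed", "default": 1}] * 2
)

rates = default_rate_by(records, "employment")
report = {group: risk_tier(rate) for group, rate in rates.items()}
for group, rate in rates.items():
    print(f"{group}: default rate {rate:.0%}, {report[group]} risk")
```

Repeating the same calculation for each candidate factor shows which factors separate the tiers most clearly.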
3 - Data Science – Data Relationship
Using data that you can find on the website below, write a Python program to find the relationships
among 3 different data criteria (that you will select) using the Pearson Correlation Coefficient and the
Chi-Square Test of Independence. Show plot(s) to support your findings.
https://guides.lib.berkeley.edu/publichealth/healthstatistics/rawdata
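The two statistics named above can be computed directly from their definitions, which may help when checking library output. The sketch below uses only the standard library. The function names and input numbers are hypothetical, and the chi-square function returns only the test statistic and degrees of freedom; obtaining a p-value requires a chi-square distribution, for example via `scipy.stats.chi2_contingency`, which by default also applies a continuity correction to 2x2 tables that this sketch omits.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof

# Made-up inputs for illustration only.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
chi2, dof = chi_square_statistic([[10, 20], [20, 10]])
print("Pearson r:", r)
print("chi-square:", chi2, "degrees of freedom:", dof)
```

Pearson correlation applies to pairs of numeric criteria, while the chi-square test applies to counts of categorical criteria, so a numeric criterion would first need to be binned into categories before it can go into a contingency table.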
4 - Data Science Statistics
Using the Salaries.csv file from the link below (San Francisco area), clean the data as needed, and
calculate the values at +/-1 sigma, +/-2 sigma, and +/-3 sigma from the mean. Plot the data to support your findings.
https://www.kaggle.com/kaggle/sf-salaries?select=Salaries.csv
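The sigma calculation can be sketched as follows, assuming the cleaned data has been reduced to a list of numeric pay values. The helper name `sigma_bands` and the sample values are hypothetical stand-ins, not figures from Salaries.csv. The sketch uses the population standard deviation (`statistics.pstdev`); `statistics.stdev` would give the sample standard deviation instead.

```python
import statistics

def sigma_bands(values, max_k=3):
    """Return the mean, sigma, and {k: (low, high, fraction inside)}."""
    mean = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    bands = {}
    for k in range(1, max_k + 1):
        low, high = mean - k * sigma, mean + k * sigma
        inside = sum(1 for v in values if low <= v <= high)
        bands[k] = (low, high, inside / len(values))
    return mean, sigma, bands

# Placeholder values standing in for a cleaned numeric pay column.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean, sigma, bands = sigma_bands(values)
print("mean:", mean, "sigma:", sigma)
for k, (low, high, fraction) in bands.items():
    print(f"+/-{k} sigma: {low} to {high}, {fraction:.0%} of values inside")
```

Comparing the fraction inside each band with the roughly 68%, 95%, and 99.7% expected for a normal distribution is one way to describe how the real salary data deviates from a bell curve.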