FINA 5840: Financial Modeling Assignment 1

Halis Sak

Due date: April 16, 2021

Question. We will work with firm characteristics data that we downloaded from WRDS for this homework assignment.

a) Read “data.csv” into a pandas DataFrame named “df” and print the dimensions of “df”. How many rows and columns does the data contain?
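As a rough illustration of part a, the sketch below reads CSV text into a DataFrame and reports its shape. The in-memory CSV and its column names (DATE, LOGMMT, MMT6, NEXT_RET) are hypothetical stand-ins, since the assignment does not show the contents of “data.csv”; with the real file one would pass the file name to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for "data.csv"; the real column layout is not
# given in the assignment, so these names and values are assumptions.
csv_text = """DATE,LOGMMT,MMT6,NEXT_RET
1985-01-31,0.10,0.05,0.02
1995-06-30,-0.20,0.01,-0.01
2005-03-31,0.30,,0.04
"""

# With the actual file this line would be: df = pd.read_csv("data.csv")
df = pd.read_csv(io.StringIO(csv_text))

# DataFrame.shape is a (number_of_rows, number_of_columns) tuple.
n_rows, n_cols = df.shape
print(df.shape)
print(f"{n_rows} rows, {n_cols} columns")
```

`df.shape` answers both halves of the question at once, so no separate row or column count is needed.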

b) Please do the following to pre-process the data in the given order.

• Get rid of the rows for which “next_ret” is a string

• Change the column names of “df” to lowercase

• Split the data into training and test sets (train: 1980 to 1999; test: 2000 to 2019). Name these two new dataframes “df_train” and “df_test”.
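The three pre-processing steps could be sketched as follows on toy data. Several things here are assumptions rather than facts about the real file: the date column name (“DATE”), the non-numeric marker “C”, and the reading of “is a string” as “cannot be parsed as a number”. The last assumption matters because a CSV column that contains any text is usually loaded with every value stored as a string.

```python
import pandas as pd

# Hypothetical toy data; "DATE" and the marker "C" are assumptions.
df = pd.DataFrame({
    "DATE": ["1985-01-31", "1995-06-30", "2005-03-31", "2015-09-30"],
    "LOGMMT": [0.10, -0.20, 0.30, 0.05],
    "next_ret": ["0.02", "C", "0.04", "-0.01"],
})

# 1) Drop rows whose "next_ret" is text that cannot be parsed as a number.
#    Entries that were already missing are not treated as strings.
numeric_ret = pd.to_numeric(df["next_ret"], errors="coerce")
is_string = numeric_ret.isna() & df["next_ret"].notna()
df = df.loc[~is_string].copy()
df["next_ret"] = numeric_ret.loc[~is_string]

# 2) Change all column names to lowercase.
df.columns = df.columns.str.lower()

# 3) Split by calendar year; between() includes both endpoints.
year = pd.to_datetime(df["date"]).dt.year
df_train = df.loc[year.between(1980, 1999)].copy()
df_test = df.loc[year.between(2000, 2019)].copy()

print(df_train.shape, df_test.shape)
```

Taking `.copy()` of each split keeps later column assignments on “df_train” from triggering chained-assignment warnings.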

c) We want to compute the correlation between the “logmmt” and “mmt6” features on the training data. Some of the values of these features are NaN. Please do the following steps in the given order.

• Step 1. Import the numpy Python package.

• Step 2. Create a boolean pandas Series, “bool_index_finite”, whose ith element is True if the ith elements of both “logmmt” and “mmt6” are finite, and False otherwise. Hint: you can use the isfinite function of the numpy package to check whether a value is finite, and the & operator to combine multiple boolean arrays when using numpy.

• Step 3. Use “bool_index_finite” as a boolean index on “df_train.logmmt” and “df_train.mmt6” to select the rows of “df_train” for which both “logmmt” and “mmt6” are finite. Then use the corrcoef function of the numpy package to compute the correlation between the selected “logmmt” and “mmt6” values.
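The steps of part c might look like the sketch below. The small “df_train” built here is made-up data used only to keep the example self-contained; the real training frame would come from part b.

```python
# Step 1: import numpy (pandas is only needed to build the toy frame).
import numpy as np
import pandas as pd

# Made-up training data with a NaN in each feature.
df_train = pd.DataFrame({
    "logmmt": [0.1, 0.2, np.nan, 0.4, 0.5],
    "mmt6": [0.2, 0.4, 0.6, np.nan, 1.0],
})

# Step 2: np.isfinite applied to a Series returns a boolean Series;
# & combines the two masks element by element.
bool_index_finite = np.isfinite(df_train.logmmt) & np.isfinite(df_train.mmt6)

# Step 3: keep only rows where both features are finite, then take the
# off-diagonal entry of the 2x2 correlation matrix.
corr_matrix = np.corrcoef(
    df_train.logmmt[bool_index_finite],
    df_train.mmt6[bool_index_finite],
)
corr = corr_matrix[0, 1]
print(corr)
```

`np.corrcoef` returns a full correlation matrix, so the single number asked for is the `[0, 1]` (or equivalently `[1, 0]`) element. Without the mask, any NaN in either input would make that element NaN.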

d) We want to create a new feature from the “logmmt” and “mmt6” features. If both “logmmt” and “mmt6” are greater than zero, the new feature “mmt_dir” should equal 1; otherwise it should equal 0. Unlike in part c, we need to use a for loop this time. Please do the following steps in the given order.

• Step 1. Use the zeros function of the numpy package to create an array whose length is the number of rows of the training data, and assign it to the “mmt_dir” column of “df_train”.

• Step 2. Write a traditional for loop that iterates over the rows of the training data.

– Step 3. Check whether both the “logmmt” and “mmt6” values for the current row of the training data are greater than zero. If both are greater than zero, change “mmt_dir” for that row to 1.
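One possible shape for the loop in part d is sketched below, again on made-up data. Positional access through `iloc` is a choice made here, not something the assignment prescribes; it is used because “df_train” generally keeps the row labels of the original “df” after the split, so the labels need not run from 0 to n-1.

```python
import numpy as np
import pandas as pd

# Made-up training data; the NaN row shows that missing values stay at 0.
df_train = pd.DataFrame({
    "logmmt": [0.1, -0.2, np.nan, 0.4],
    "mmt6": [0.2, 0.3, 0.5, -0.1],
})

# Step 1: initialise the new column with zeros, one per row.
n_rows = df_train.shape[0]
df_train["mmt_dir"] = np.zeros(n_rows)

# Step 2: loop over row positions 0 .. n_rows - 1.
mmt_dir_pos = df_train.columns.get_loc("mmt_dir")
for i in range(n_rows):
    # Step 3: comparisons with NaN are False, so such rows are skipped.
    if df_train["logmmt"].iloc[i] > 0 and df_train["mmt6"].iloc[i] > 0:
        df_train.iloc[i, mmt_dir_pos] = 1

print(df_train)
```

Inside the loop the plain `and` keyword is appropriate because each comparison yields a single boolean, whereas the `&` operator from part c is meant for whole arrays.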
