ECON6113/STAT6113: Term Project Guidelines

Tom Mayock

Fall 2020

The standards and requirements set forth in these guidelines may be modified at any time by the course

instructor. Notice of such changes will be by announcement in class and/or by email.

Due Date: December 9, 2020 at 5:00PM.

NO LATE ASSIGNMENTS WILL BE ACCEPTED.

ALL ASSIGNMENTS MUST BE SUBMITTED VIA THE COURSE CANVAS SITE.

In order for your assignment to be graded, you must

1. Submit your Stata “do-file” and Stata log file to Canvas.

2. Submit your written project report to Canvas.

3. Submit your written project report to the plagiarism detection software via Canvas.

Overview

30 percent of your grade in this course will depend on the completion of an empirical project.

I will provide you with the data for this project, which will be extracted from the Freddie Mac

Single Family Loan-Level Dataset. The goal of the project is to develop a “scorecard” to predict

mortgage defaults. The analysis for the project must be conducted in Stata for you to receive

a grade. Furthermore, THE PROJECT MUST BE COMPLETED INDEPENDENTLY. After the

code (or “do-file” in Stata parlance) for the project is submitted, I will test the performance of

your model on an out-of-time sample. The student who builds the model that performs best, as

measured by the Kolmogorov-Smirnov statistic on the out-of-time sample, will have 5

additional points added to her/his final course grade. For example, if your total course grade

based on all other assignments is 85% and you build the best mortgage scorecard, your final

course grade will be 90%.

Objective

One of the goals of this course is to develop students’ facility to analyze data to study economic

problems using econometric methods. To that end, you will complete an empirical project that

demonstrates your ability to work with statistical software and data and interpret the results of

econometric models.

Your “client” for this project is a mortgage lender that wishes to engage in risk-based pricing

for its mortgage loans. The first step in establishing risk-based pricing is the construction of a

mortgage “scorecard” that predicts the probability that a loan defaults. You will develop such a

scorecard using data that I provide you derived from the Freddie Mac Single Family Loan-Level

Dataset; the full version of this data “covers approximately 22.19 million fixed-rate mortgages

originated between January 1, 1999 and June 30, 2015 [that were purchased or guaranteed by

Freddie Mac].” I will be providing you with a small random sample of loans from a particular

origination cohort. You are tasked with developing an econometric model that predicts the

probability that a given loan defaults, where a default is defined as becoming 60 days past due or

entering foreclosure at any point within 4 years of origination. In what follows a “good” account

is one that does not default, whereas a “bad” account is one that does default.

The Model

Let $D_i$ be an indicator variable that takes a value of one if loan $i$ defaults within 4 years of

origination and is zero otherwise. This variable is called “BAD_OVER_48_60” in the data file.

Additionally, let $X_i$ be a vector of variables that describe the characteristics of the borrower and

the loan that are available at the time of origination. For the project, you will use the data that I

provide you (the “development data”) to estimate

$$\Pr[D_i = 1 \mid X_i] = G(X_i \beta) \tag{1}$$

Let $\hat{\beta}$ denote the vector of estimated parameters for Equation 1. After you estimate $\hat{\beta}$, your model

will be evaluated on its ability to distinguish between “good” and “bad” accounts as measured

by the Kolmogorov-Smirnov (KS) statistic calculated on an out-of-time sample (the “OOT data”).

The student that builds the model with the highest KS statistic on the OOT sample will have 5

points added to her or his final grade.

Since you will all be working with the same development data, variation in the performance of

the models between students will be driven entirely by differences in how you build your models.

In building your model, you may need to create new variables, such as interaction variables

and transformations. All of your data cleaning and model specification decisions, however,

must be explained and justified in your written report.

When building your model, the User Guide for the mortgage data will be of critical importance;

this guide can be found in the “Files-Term Project-Documentation” folder on the course Canvas

page. This file provides an overview of the loan-level data, describes the file layout, and defines

all of the variables that are included in the data.

Data Cleaning

While Freddie Mac has cleaned up the source files significantly, no dataset is perfect. As an

econometrician it is your responsibility to make sure that the data that you are using is correct;

using data that is riddled with errors can have a seriously detrimental impact on your analysis.

As they say: “garbage in, garbage out.”

To make sure that you are working with a “clean” sample, you will likely want to remove some

observations from the data that appear to have been coded incorrectly or that are missing key

data elements. Some suggestions for cleaning your data are listed below.

• How frequently is a variable missing in the data? Will including the variable result in you

losing a large fraction of the overall sample?

• Plot a histogram of all of the variables you are considering for use in your model. Can

you identify any observations that are likely data entry errors? If so, you should consider

removing those observations from the data.

• When merging datasets, always verify that the merge was performed correctly.

• Calculate the summary statistics for all of the variables that you are considering for use in

your models. Do the means, minima, and maxima “make sense?” For example, economists

often work with variables that must logically be positive (e.g., prices) or bounded (e.g.,

fractions that must be between 0 and 1). If your data do not conform to expectations, you

should inspect the data more closely to understand what is going on. Note that in some

databases, values that appear to be bizarre (e.g., -99999 for a price) actually have a specific

meaning that can be gleaned from the codebook.

• ALWAYS READ THE CODEBOOK! In our case, the codebook for the Freddie Mac data is

called the “User Guide.” I cannot stress this enough. The easiest way to run into trouble

when building an econometric model is to just start throwing variables into a model before

you have any understanding of what those variables actually measure.
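The checks suggested above might be scripted along the following lines. This is only a minimal sketch that runs on simulated stand-in data (Stata 14 or later is assumed for the two-argument runiform()). The only variable name taken from these guidelines is BAD_OVER_48_60; the names dti and fico and the sentinel codes 999 and 9999 are assumptions made for illustration. The codes that actually apply are whatever the User Guide documents.

```stata
* Simulated stand-in for the development data (all values are made up).
clear
set seed 6113
set obs 2000
generate dti  = round(runiform(5, 60))      // hypothetical debt-to-income ratio
generate fico = round(rnormal(720, 50))     // hypothetical credit score
generate byte BAD_OVER_48_60 = runiform() < 0.05
replace dti  = 999  in 1/40                 // pretend 999 means "not available"
replace fico = 9999 in 41/60                // pretend 9999 means "not available"

* 1. Summary statistics: do the means, minima, and maxima make sense?
summarize dti fico, detail

* 2. A histogram makes sentinel values and entry errors easy to spot.
histogram dti, name(h_dti, replace)

* 3. Recode documented sentinel values to missing instead of treating
*    them as real data (the codes here are assumptions for the toy data).
mvdecode dti,  mv(999)
mvdecode fico, mv(9999)

* 4. How often is each variable missing, and how much of the sample
*    would be lost by requiring all of them to be present?
misstable summarize dti fico
egen n_missing = rowmiss(dti fico)
count if n_missing > 0
display "Observations with a missing regressor: " r(N) " of " _N
```

Recoding to missing rather than dropping keeps the decision about which observations to exclude explicit, which makes it easier to describe the filter in the report.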

Data Sampling

Before building your model, you must use random sampling to split your data into two distinct

pieces: a development sample (70%) and a holdout sample (30%). The development sample is

the set of observations that will be used to build your model, and the holdout sample is the set

of observations on which you are to test the performance of your model.
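One way to build the 70/30 split is sketched below on a toy dataset. The identifier loan_id, the flag dev_sample, and the seed value are assumptions for illustration, not names from the data file. Fixing the seed and sorting on a unique identifier before drawing the random numbers is what makes the split reproducible from run to run.

```stata
* Toy data standing in for the loan sample.
clear
set obs 1000
generate long loan_id = _n                  // hypothetical unique identifier

* Put the data in a known order, then fix the seed, so that the same
* observations land in the same sample every time the do-file runs.
sort loan_id
set seed 20201209
generate double u = runiform()

* Rank observations by the random draw and flag the first 70 percent.
sort u
generate byte dev_sample = _n <= round(0.70 * _N)   // 1 = development, 0 = holdout
sort loan_id
drop u

tabulate dev_sample
```

Ranking on the random draw gives an exact 70/30 split; the simpler `generate byte dev_sample = runiform() < 0.70` gives a split that is only approximately 70/30.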

Model Specification

You are free to specify your model as you please. Some points that you may want to consider

when specifying your final model are listed below.

• The final specification of the model should be based on sound statistical reasoning. Since

you are building a predictive model, this means that you should only include variables in

your models that improve the models’ ability to differentiate between defaulting and

non-defaulting loans. You can identify these variables through traditional hypothesis testing,

an analysis of the model’s overall predictive performance, or some combination of the two

approaches.

• You should think about how to incorporate non-linearities and interactions into your model

to improve model performance.
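As an illustration of the two points above, the sketch below adds a quadratic term and an interaction to a baseline logit and then tests whether the added terms earn their place. It runs on simulated data (Stata 14 or later assumed); dti and fico are hypothetical regressor names, and the stored-estimate names base and rich are arbitrary.

```stata
* Simulated stand-in data (all values are made up).
clear
set seed 6113
set obs 5000
generate dti  = runiform(5, 60)             // hypothetical debt-to-income ratio
generate fico = rnormal(720, 50)            // hypothetical credit score
generate byte BAD_OVER_48_60 = runiform() < invlogit(-4 + 0.05*dti - 0.012*(fico - 720))

* Baseline: linear in both variables.
logit BAD_OVER_48_60 dti fico
estimates store base

* Richer specification: quadratic in DTI plus a DTI-by-score interaction,
* written in factor-variable notation so that margins understands it.
logit BAD_OVER_48_60 c.dti##c.dti c.fico c.dti#c.fico
estimates store rich

* Likelihood-ratio test of the two added terms (the models are nested).
lrtest base rich

* Wald test of the same joint restriction on the richer model.
test c.dti#c.dti c.dti#c.fico
```

Statistical significance is only one criterion. Since the model is judged on the KS statistic, it is also worth checking whether the added terms change KS on the holdout sample.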

The Report

To receive credit for the project, you must submit a report via Canvas that contains the following

elements. The report should be written in clear, concise English as if it were aimed at a

professional audience. DO NOT SIMPLY TURN IN A LIST OF BULLET POINTS.

Data Description

In this portion of the report, you should describe the data that you are using to estimate your

model. Detailed information on the nature of the data can be found in the User Guide. If you

imposed any filters to remove likely outliers, those filters should be described in detail in this

section. For example, if you dropped observations with debt-to-income ratios (DTIs) in excess of 90 because these are

likely data entry errors, you should state this exclusion in this part of the paper.

The Data Description section should include a table with summary statistics (such as the mean,

median, range, and standard deviation) for any of the variables that you include in your final

model. Please note that this section should only contain information on the variables that you

used to build and test your models; you do not need to discuss variables that are included in the

mortgage data that you do not utilize in your analysis.
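A summary-statistics table of the kind described above can be produced with tabstat. The sketch below uses simulated data, and dti and fico are hypothetical variable names; substitute the variables that appear in your final model.

```stata
* Simulated stand-in data (all values are made up).
clear
set seed 6113
set obs 2000
generate dti  = runiform(5, 60)             // hypothetical debt-to-income ratio
generate fico = round(rnormal(720, 50))     // hypothetical credit score
generate byte BAD_OVER_48_60 = runiform() < 0.05

* One row per variable, one column per statistic.
tabstat BAD_OVER_48_60 dti fico, ///
    statistics(n mean median min max sd) columns(statistics) format(%9.3f)
```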

You must use random sampling to split your data into two distinct pieces: a development sample

(70%) and a holdout sample (30%). The development sample is the set of observations that will

be used to build your model, and the holdout sample is the set of observations on which you

are to test the performance of your model. Describe how these samples were constructed in this

portion of the report.

Model Description

In this portion of the report, you should write down your model in mathematical notation. For

example, if $D_i$ is the default indicator and $X_i$ is the vector of regressors, you should write down

your default model as

$$\Pr[D_i = 1 \mid X_i] = G(X_i \beta)$$

For this project, you will use the logistic link function for G().
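For reference, the logistic link has the standard closed form below, where $z$ stands in for the index $X_i \beta$. The second expression is the same model rewritten in terms of the log-odds of default, which is the scale on which the estimated coefficients are reported.

```latex
G(z) = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}},
\qquad\text{so that}\qquad
\log\!\left( \frac{\Pr[D_i = 1 \mid X_i]}{1 - \Pr[D_i = 1 \mid X_i]} \right) = X_i \beta .
```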

Lastly, and perhaps most importantly, this section should clearly state what your model is

designed to do, define the dependent variable, and describe your expectations for the relationship

between each of the variables and mortgage default. For example, if you include DTI

in the model, you need to explain whether or not you expect DTI to increase default risk and

why you expect such a relationship to hold.

Commentary on Initial Model Specification

Discuss in this section how you initially specified your model. Was your specification informed

by economic theory? Do you have expectations for the signs of any of the variables?

Next, discuss any initial testing that you conducted to arrive at your final specification. For example, if you

initially included DTI in your model but found that it was statistically insignificant, state this in

the report.

Final Model Output and Commentary

You must include a regression table that includes the estimated coefficients, standard errors,

and corresponding levels of statistical significance for your final specification. Below that table,

you should discuss and interpret your results. For example, if you included DTI in the model

because you expected higher DTIs to be associated with higher default rates, you should discuss

whether the final model results were consistent with expectations. In this section you should

emphasize the ceteris paribus interpretation of the regression coefficients. Turning back to the

DTI example, if you have a positive and statistically significant coefficient on the DTI term, you

should emphasize that, all else equal, borrowers with higher DTIs are more likely to default. In

this section you also must perform a marginal effects analysis and discuss the impact of the

variables in your model on the probability – not the log-odds – of default.
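With a logistic link, the effect of a regressor $x_k$ that enters linearly on the probability of default is $G(X_i \beta)\,(1 - G(X_i \beta))\,\beta_k$, so it varies across loans even though $\beta_k$ does not. Stata's margins command does this calculation. The sketch below is a minimal example on simulated data (Stata 14 or later assumed); dti and fico are hypothetical regressor names.

```stata
* Simulated stand-in data (all values are made up).
clear
set seed 6113
set obs 5000
generate dti  = runiform(5, 60)             // hypothetical debt-to-income ratio
generate fico = rnormal(720, 50)            // hypothetical credit score
generate byte BAD_OVER_48_60 = runiform() < invlogit(-4 + 0.05*dti - 0.012*(fico - 720))

logit BAD_OVER_48_60 dti fico

* Average marginal effects: the change in Pr(default) for a one-unit
* change in each regressor, averaged over the estimation sample.
margins, dydx(*)

* Predicted Pr(default) at selected DTI values, with the other
* regressors left at their observed values.
margins, at(dti = (20 40 60))
```

If the model contains squared terms or interactions, entering them in factor-variable notation (for example c.dti##c.dti) lets margins account for them when it computes the effect of dti.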

Model Performance Testing

After developing your final model using the development data, you must perform an analysis of

out-of-sample performance and compare this against in-sample performance. For this analysis,

the “score” is simply the predicted probability that a loan defaults based on your model. This

analysis must contain the following elements.

• An assessment of the discriminatory power of your model based on the KS statistic. If $s$

denotes the score, then the KS statistic is defined as

$$\mathrm{KS} \equiv \max_{s} \left( F(s \mid B) - F(s \mid G) \right)$$

where $F(s \mid B)$ and $F(s \mid G)$ are the cumulative distribution functions for the scores

conditional on the account being bad and good, respectively. Calculate KS using your development

data, and then calculate KS using your holdout data. Interpret the results. Does the

ability of your model to differentiate between good and bad accounts appear to be stable

out of sample?

• An assessment of the model’s predictive accuracy. To conduct this assessment, first partition

your scores into ten groups based on the deciles of the PD distribution. Within each decile,

sum over all of the predicted PDs in the decile to calculate the expected number of bads

within the decile. Calculate the expected number of good accounts similarly by summing

over the values of $1 - PD_i$ for the accounts that are within the decile. Formally, the expected bad

calculation can be written as

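$$\hat{B}_d = \sum_{i=1}^{N} PD_i \, \mathbf{1}_{id}$$

where $PD_i$ is the estimated probability of default for account $i$, $N$ denotes the total number

of accounts in the sample, and $\mathbf{1}_{id}$ is an indicator variable that is equal to 1 if account $i$ is

included in decile or “band” $d$.

The number of expected goods in decile $d$ can be written as

$$\hat{G}_d = \sum_{i=1}^{N} (1 - PD_i) \, \mathbf{1}_{id}$$

Table 1: An “Expected-Versus-Actual” Table.

Band   Range          Expected Bad ($\hat{B}_d$)                 Actual Bad ($B_d$)   Expected Good ($\hat{G}_d$)                        Actual Good ($G_d$)
1      $(0, p_1]$     $\sum_{i=1}^{N} PD_i \mathbf{1}_{i1}$      $B_1$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i1}$        $N_1 - B_1$
2      $(p_1, p_2]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i2}$      $B_2$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i2}$        $N_2 - B_2$
3      $(p_2, p_3]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i3}$      $B_3$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i3}$        $N_3 - B_3$
4      $(p_3, p_4]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i4}$      $B_4$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i4}$        $N_4 - B_4$
5      $(p_4, p_5]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i5}$      $B_5$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i5}$        $N_5 - B_5$
6      $(p_5, p_6]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i6}$      $B_6$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i6}$        $N_6 - B_6$
7      $(p_6, p_7]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i7}$      $B_7$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i7}$        $N_7 - B_7$
8      $(p_7, p_8]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i8}$      $B_8$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i8}$        $N_8 - B_8$
9      $(p_8, p_9]$   $\sum_{i=1}^{N} PD_i \mathbf{1}_{i9}$      $B_9$                $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i9}$        $N_9 - B_9$
10     $(p_9, 1]$     $\sum_{i=1}^{N} PD_i \mathbf{1}_{i10}$     $B_{10}$             $\sum_{i=1}^{N} (1 - PD_i) \mathbf{1}_{i10}$       $N_{10} - B_{10}$

In Table 1, $p_d$ denotes the $d$-th decile cut point of the PD distribution and $N_d$ denotes the number of accounts in band $d$.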

• Construct a table that compares the expected goods ($\hat{G}_d$) and actual goods ($G_d$) and the

expected bads ($\hat{B}_d$) and actual bads ($B_d$) for each of the score bands. Repeat this calculation

for the development sample and the holdout sample. Table 1 is a template for these

calculations. Based on these results, how well does your model predict default within the

bands? Is the accuracy of these predictions stable out-of-sample?
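The KS comparison and the expected-versus-actual table described above could be computed roughly as follows. This is a sketch on simulated data rather than a required implementation (Stata 14 or later assumed). The names dti, fico, dev, score, and band are assumptions for illustration. Note that Stata's ksmirnov reports in r(D) the largest absolute gap between the two empirical CDFs; since the score here is a PD, bad accounts tend to have higher scores, so the sign of the gap depends on that orientation.

```stata
* Simulated stand-in data with a 70/30 development/holdout flag.
clear
set seed 6113
set obs 5000
generate dti  = runiform(5, 60)             // hypothetical debt-to-income ratio
generate fico = rnormal(720, 50)            // hypothetical credit score
generate byte BAD_OVER_48_60 = runiform() < invlogit(-4 + 0.05*dti - 0.012*(fico - 720))
generate byte dev = runiform() < 0.70       // 1 = development, 0 = holdout

* Estimate on the development sample only, then score every account.
logit BAD_OVER_48_60 dti fico if dev == 1
predict double score, pr                    // score = predicted PD

* KS: compare the score distributions of bads and goods in each sample.
ksmirnov score if dev == 1, by(BAD_OVER_48_60)
display "Development KS = " %6.4f r(D)
ksmirnov score if dev == 0, by(BAD_OVER_48_60)
display "Holdout KS     = " %6.4f r(D)

* Score bands: deciles of the PD distribution within each sample.
xtile band_dev  = score if dev == 1, nquantiles(10)
xtile band_hold = score if dev == 0, nquantiles(10)
generate byte band = cond(dev == 1, band_dev, band_hold)

* Expected counts are sums of PD and (1 - PD); actual counts are sums
* of the default indicator and its complement.
generate double exp_bad  = score
generate double exp_good = 1 - score
generate byte   act_bad  = BAD_OVER_48_60
generate byte   act_good = 1 - BAD_OVER_48_60

preserve
collapse (sum) exp_bad act_bad exp_good act_good ///
         (min) p_low = score (max) p_high = score, by(dev band)
list, sepby(dev) noobs
restore
```

On the development sample the expected and actual bads will agree in total, because a logit with a constant term fits the overall default rate exactly. The informative comparisons are band by band, and on the holdout sample.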

Code Submission

A key portion of the project is running your code on an out-of-time data sample. IF YOUR CODE

DOES NOT RUN ON THIS OUT-OF-TIME SAMPLE, YOU WILL NOT RECEIVE CREDIT

FOR THE PROJECT. To ensure that your code can be run on this out-of-time sample, your code must

be written in a manner that satisfies the requirements listed below. In the “Files-Term-Project-StataCode”

folder you will find a sample do-file, “Basic_Regression_OutOfSample_LOGIT_W_TESTS.do”,

that you can use to structure your code so that it conforms with these expectations.

• At the beginning of your do-file, define the directory that contains the data using the following

local macro syntax: local data_dir "ZZZ" where ZZZ is the directory that contains

the data for the project. When I run your code on the out-of-time data, this directory will

be swapped to the directory on my computer where the out-of-time data resides.

• All variable transformations and other data cleaning steps must be performed by calling a

separate do-file from within your main do-file. This do-file must be named “data_cleaning.do”.

To properly score the observations in the out-of-time data, the same variable transformations

and data cleaning steps that were used in the construction of the initial model must

also be applied to the out-of-time observations. When I run your code to score the

out-of-time data, “data_cleaning.do” will be called to perform the necessary steps.
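A main do-file that follows the two requirements above might be laid out like the template below. Only the macro name data_dir and the file name data_cleaning.do come from these guidelines; the data file name, the log file name, and the regressors are hypothetical placeholders. As a template it will run only after ZZZ has been replaced with a real directory that contains the data and data_cleaning.do is in the working directory.

```stata
* ---- main do-file (template) ----
clear all

* Directory that holds the project data; ZZZ is a placeholder.
local data_dir "ZZZ"

log using "term_project.log", replace text       // hypothetical log file name

* Load the data from the directory defined above.
use "`data_dir'/development_data.dta", clear     // hypothetical data file name

* All cleaning and variable construction happens in the separate do-file,
* so exactly the same steps can be applied to the out-of-time data.
do "data_cleaning.do"

* Estimate the model and compute the score.
logit BAD_OVER_48_60 dti fico                    // hypothetical regressors
predict double score, pr                         // predicted probability of default

log close
```

Two Stata details are worth keeping in mind with this layout. Local macros are private to the do-file that defines them, so data_cleaning.do cannot see data_dir; if the cleaning file needs the directory, pass it as an argument (do "data_cleaning.do" "`data_dir'") and read it with args. For the same reason, run the main do-file from top to bottom rather than line by line, since a local defined in one partial run is gone in the next.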

Plagiarism Detection

As a condition of taking this course, all required papers may be subject to submission for textual

similarity review to Turnitin.com via Canvas for the detection of plagiarism. All submitted

papers will be included as source documents in the Turnitin.com reference database solely for

the purpose of detecting plagiarism of such papers. No student papers will be submitted to

Turnitin.com without a student’s written consent and permission. If a student does not provide

such written consent and permission, the instructor may: (i) require a short reflection paper on

research methodology; (ii) require a draft bibliography prior to submission of the final paper; or

(iii) require the cover page and first cited page of each reference source to be photocopied and

submitted with the final paper.

Grading

The grade that you receive for your term project will depend on 4 components: content (20%),

execution (20%), interpretation (30%), and writing (30%). The term project grading rubric, which

can be found on Canvas in the “Files-Term Project” folder, contains detailed information on how

scores for each of these components are determined.

7/7

Tom Mayock

Fall 2020

The standards and requirements set forth in these guidelines may be modified at any time by the course

instructor. Notice of such changes will be by announcement in class and/or by email.

Due Date: December 9, 2020 at 5:00PM.

NO LATE ASSIGNMENTS WILL BE ACCEPTED.

ALL ASSIGNMENTS MUST BE SUBMITTED VIA THE COURSE CANVAS SITE.

In order for your assignment to be graded, you must

1. Submit your Stata “do file” and Stata log file to Canvas.

2. Submit your written project report to Canvas.

3. Submit your written project report to the plagiarism detection software via Canvas.

Overview

30 percent of your grade in this course will depend on the completion of an empirical project.

I will provide you with the data for this project, which will be extracted from the Freddie Mac

Single Family Loan-Level Dataset. The goal of the project is to develop a “scorecard” to predict

mortgage defaults. The analysis for the project must be conducted in Stata for you to receive

a grade. Furthermore, THE PROJECT MUST BE COMPLETED INDEPENDENTLY. After the

code (or “do-file” in Stata parlance) for the projects is submitted, I will test the performance of

your model on an out-of-time sample. The student that builds the model that exhibits the best

model as measured by the Kolmogorov-Smirnov statistic on the out-of-time sample will have 5

additional points added to her/his final course grade. For example, if your total course grade

based on all other assignments is an 85% and you build the best mortgage scorecard, your final

course grade will be a 90%.

Objective

One of the goals of this course is to develop students’ facility to analyze data to study economic

problems using econometric methods. To that end, you will complete an empirical project that

demonstrates your ability to work with statistical software and data and interpret the results of

1

ECON 6113 Term Project Guidelines

econometric models.

Your “client” for this project is a mortgage lender that wishes to engage in risk-based pricing

for its mortgage loans. The first step in establishing risk-based pricing is the construction of a

mortgage “scorecard” that predicts the probability that a loan defaults. You will develop such a

scorecard using data that I provide you derived from the Freddie Mac Single Family Loan-Level

Dataset; the full version of this data “covers approximately 22.19 million fixed-rate mortgages

originated between January 1, 1999 and June 30, 2015 [that were purchased or guaranteed by

Freddie Mac].” I will be providing you with a small random sample of loans from a particular

origination cohort. You are tasked with developing an econometric model that predicts the

probability that a given loan defaults, where a default is defined as going 60 days-past-due or

entering foreclosure at any point within 4 years of origination. In what follows a “good” account

is one that does not default, whereas a “bad” account is one that does default.

The Model

Let Di be an indicator variable that takes a value of one if loan i defaults within 4 years of

origination and is zero otherwise. This variable is called “BAD_OVER_48_60” in the data file.

Additionally, let Xi be a vector of variables that describe the characteristics of the borrower and

the loan that are available at the time of origination. For the project, you will use the data that I

provide you (the “development data”) to estimate

Pr[Di = 1|Xi] = G(Xiβ) (1)

Let β̂ denote the vector of estimated parameters for Equation 1. After you estimate β̂, your model

will be evaluated on its ability to distinguish between “good” and “bad” accounts as measured

by the Komolgorov-Smirnov (KS) statistic calculated on an out-of-time sample (the “OOT data”).

The student that builds the model with the highest KS statistic on the OOT sample will have 5

points added to her or his final grade.

Since you will all be working with the same development data, variation in the performance of

the models between students will be driven entirely by differences in how you build your mod-

els. In building your model, you may need to create new variables, such as interaction variables

and transformations. All of your data cleaning and model specification decisions, however,

must be explained and justified in your written report.

When building your model, the User Guide for the mortgage data will be of critical importance;

this guide can be found in the “Files-Term Project-Documentation” folder on the course Canvas

page. This file provides an overview of the loan-level data, describes the file layout, and defines

all of the variables that are included in the data.

Data Cleaning

While Freddie Mac has cleaned up the source files significantly, no dataset is perfect. As an

econometrician it is your responsibility to make sure that the data that you are using is correct;

2/7

ECON 6113 Term Project Guidelines

using data that is riddled with errors can have a seriously detrimental impact on your analysis.

As they say: “garbage in, garbage out.”

To make sure that you are working with a “clean” sample, you will likely want to remove some

observations from the data that appear to have been coded incorrectly or that are missing key

data elements. Some suggestions for cleaning your data are listed below.

• How frequently is a variable missing in the data? Will including the variable result in you

losing a large fraction of the overall sample?

• Plot a histogram of all of the variables you are considering for use in your model. Can

you identify any observations that are likely data entry errors? If so, you should consider

removing those observations from the data.

• When merging datasets, always verify that the merge was performed correctly.

• Calculate the summary statistics for all of the variables that you are considering for use in

your models. Do the means, minima, and maxima “make sense?” For example, economists

often work with variables that must logically be positive (e.g., prices) or bounded (e.g.,

fractions that must be between 0 and 1). If your data do not conform to expectations, you

should inspect the data more closely to understand what is going on. Note that in some

databases, values that appear to be bizarre (e.g., -99999 for a price) actually have a specific

meaning that can be gleaned from the codebook.

• ALWAYS READ THE CODEBOOK! In our case, the codebook for the Freddie Mac data is

called the “User Guide.” I cannot stress this enough. The easiest way to run into trouble

when building an econometric model is to just start throwing variables into a model before

you have any understanding of what those variables actually measure.

Data Sampling

Before building your model, you must use random sampling to split your data into two distinct

pieces: a development sample (70%) and a holdout sample (30%). The development sample is

the set of observations that will be used to build your model, and the holdout sample is the set

of observations on which you are to test the performance of your model.

Model Specification

You are free to specify your model as you please. I have, however, included some things you may

want to consider below when specifying your final model.

• The final specification of the model should be based on sound statistical reasoning. Since

you are building a predictive model, this means that you should only include variables in

your models that improve the models’ ability to differentiate between defaulting and non-

defaulting loans. You can identify these variables through traditional hypothesis testing,

an analysis of the model’s overall predictive performance, or some combination of the two

approaches.

3/7

ECON 6113 Term Project Guidelines

• You should think about how to incorporate non-linearities and interactions into your model

to improve model performance.

The Report

To receive credit for the project, you must submit a report via Canvas that contains the follow-

ing elements. The report should be written in clear, concise English as if it was aimed at a

professional audience. DO NOT SIMPLY TURN IN A LIST OF BULLET POINTS.

Data Description

In this portion of the report, you should describe the data that you are using to estimate your

model. Detailed information on the nature of the data can be found in the User Guide. If you

imposed any filters to remove likely outliers, those filters should be described in detail in this

section. For example, if you dropped observations with DTIs in excess of 90 because these are

likely data entry errors, you should state this exclusion in this part of the paper.

The Data Description section should include a table with summary statistics (such as the mean,

median, range, and standard deviation) for any of the variables that you include in your final

model. Please note that this section should only contain information on the variables that you

used to build and test your models; you do not need to discuss variables that are included in the

mortgage data that you do not utilize in your analysis.

You must use random sampling to split your data into two distinct pieces: a development sample

(70%) and a holdout sample (30%). The development sample is the set of observations that will

be used to build your model, and the holdout sample is the set of observations on which you

are to test the performance of your model. Describe how these samples were constructed in this

portion of the report.

Model Description

In this portion of the report, you should write down your model in mathematical notation. For

example, if Di is the default indicator and Xi is the vector of regressors, you should write down

your default model as

Pr[Di = 1|Xi = 1] = G(Xiβ)

For this project, you will use the logistic link function for G().

Lastly, and perhaps most importantly, this section should clearly state what your model is

designed to do, define the dependent variable, and describe your expectations for the rela-

tionship between each of the variables and mortgage default. For example, if you include DTI

in the model, you need to explain whether or not you expect DTI to increase default risk and

why you expect such a relationship to hold.

4/7

ECON 6113 Term Project Guidelines

Commentary on Initial Model Specification

Discuss in this section how you initially specified your model. Was your specification informed

by economic theory? Do you have expectations for the signs of any of the variables?

Next, discuss any initial testing that you conducted that you used to arrive. For example, if you

initially included DTI in your model but found that it was statistically insignificant, state this in

the report.

Final Model Output and Commentary

You must include a regression table that includes the estimated coefficients, standard errors,

and corresponding levels of statistical significance for your final specification. Below that table,

you should discuss and interpret your results. For example, if you included DTI in the model

because you expected higher DTIs to be associated with higher default rates, you should discuss

whether the final model results were consistent with expectations. In this section you should

emphasize the ceteris paribus interpretation of the regression coefficients. Turning back to the

DTI example, if you have a positive and statistically significant coefficient on the DTI term, you

should emphasize that, all else equal, borrowers with higher DTIs are more likely to default. In

this section you also must perform a marginal effects analysis and discuss the impact of the

variables in your model on the probability – not the log-odds – of default.

Model Performance Testing

After developing your final model using the development data, you must perform an analysis of

out-of-sample performance and compare this against in-sample performance. For this analysis,

the “score” is simply the predicted probability that a loan defaults based on your model. This

analysis must contain the following elements.

• An assessment of the discriminatory power of your model based on the KS statistic. If s

denotes the score, then the KS statistic is defined as

KS ≡ max

s

(F (s|B)− F (s|G))

where F (s|B) and F (s|G) are the cumulative distribution functions for the scores condi-

tional on the account being bad and good, respectively. Calculate KS using your develop-

ment data, and then calculate KS using your holdout data. Interpret the results. Does the

ability of your model to differentiate between good and bad accounts appear to be stable

out of sample?

• An assessment of the model’s predictive accuracy. To conduct this assessment, first partition

your scores into ten groups based on the deciles of the PD distribution. Within each decile,

sum over all of the predicted PDs in the decile to calculate the expected number of bads

within the decile. Calculate the expected number of good accounts similarly by summing

over the values of PD of the accounts that are within the decile. Formally, this expected bad

calculation can be written as

B̂d =

N

∑

i=1

PDi1id

5/7

ECON 6113 Term Project Guidelines

Expected Actual Expected Actual

Band Range Bad (B̂d) Bad (Bd) Good (Ĝd) Good (Gd)

1 (0, p1] ∑Ni=1 PDi1i1 B1 ∑

N

i=1 (1− PDi) 1i1 N1 − B1

2 (p1, p2] ∑Ni=1 PDi1i2 B2 ∑

N

i=1 (1− PDi) 1i2 N2 − B2

3 (p2, p3] ∑Ni=1 PDi1i3 B3 ∑

N

i=1 (1− PDi) 1i3 N3 − B3

4 (p3, p4] ∑Ni=1 PDi1i4 B4 ∑

N

i=1 (1− PDi) 1i4 N4 − B4

5 (p4, p5] ∑Ni=1 PDi1i5 B5 ∑

N

i=1 (1− PDi) 1i5 N5 − B5

6 (p5, p6] ∑Ni=1 PDi1i6 B6 ∑

N

i=1 (1− PDi) 1i6 N6 − B6

7 (p6, p7] ∑Ni=1 PDi1i7 B7 ∑

N

i=1 (1− PDi) 1i7 N7 − B7

8 (p7, p8] ∑Ni=1 PDi1i8 B8 ∑

N

i=1 (1− PDi) 1i8 N8 − B8

9 (p8, p9] ∑Ni=1 PDi1i9 B8 ∑

N

i=1 (1− PDi) 1i9 N9 − B9

10 (p9, 1] ∑Ni=1 PDi1i10 B10 ∑

N

i=1 (1− PDi) 1i10 N10 − B10

Table 1: An “Expected-Versus-Actual” Table.

where PDi is the estimated probability of default for account i, N denotes the total number

of accounts in the sample, and 1id is an indicator variable that is equal to 1 if account i is

included in decile or “band” d.

The number of expected goods in decile d can be written as

Ĝd =

N

∑

i=1

(1− PD)i 1id

• Construct a table that compares the expected goods (Ĝd) and actual goods (Gd) and the

expected bads (B̂d) and actual bads (Bd) for each of the score bands. Repeat this calculation

for the development sample and the holdout sample. Table 1 is a template for these

calculations. Based on these results, how well does your model predict default within the

bands? Is the accuracy of these predictions stable out-of-sample?
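
The KS comparison and the expected-versus-actual table described above could be sketched in Stata roughly as follows. This is only an illustration under stated assumptions, not the required implementation: the variable names bad (1 = default, 0 = good), holdout (1 = holdout sample, 0 = development sample), pd, and band are hypothetical, a logit model is assumed to have been estimated already, and the deciles are assumed to be computed separately within each sample.

```stata
* Sketch only. Assumes a logit has just been estimated on the development
* sample, for example (hypothetical covariates):
*   logit bad x1 x2 if holdout == 0
* bad, holdout, pd, and band are illustrative names, not project requirements.

predict double pd, pr                  // estimated probability of default, PD_i

* KS: compare the PD distributions of bad and good accounts in each sample
ksmirnov pd if holdout == 0, by(bad)
ksmirnov pd if holdout == 1, by(bad)

* Score bands: deciles of PD, computed within each sample (an assumption)
xtile band_dev  = pd if holdout == 0, nquantiles(10)
xtile band_hold = pd if holdout == 1, nquantiles(10)
generate byte band = cond(holdout == 1, band_hold, band_dev)

* Per-account contributions to the expected and actual counts
generate double exp_good = 1 - pd      // contributes to G-hat_d
generate byte   act_good = 1 - bad     // contributes to G_d = N_d - B_d
generate long   n_accts  = 1           // contributes to N_d

* Sum within sample and band to build the expected-versus-actual table
preserve
collapse (sum) exp_bad=pd act_bad=bad exp_good act_good n_accts ///
    if !missing(band), by(holdout band)
list holdout band exp_bad act_bad exp_good act_good n_accts, ///
    sepby(holdout) noobs
restore
```

Within each band, exp_bad + exp_good equals n_accts by construction, which gives a quick consistency check on the table. If the holdout bands should instead use the development-sample cutoffs, the xtile step would need to change accordingly.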

Code Submission

A key portion of the project is running your code on an out-of-time data sample. IF YOUR CODE

DOES NOT RUN ON THIS OUT-OF-TIME SAMPLE, YOU WILL NOT RECEIVE CREDIT

FOR THE PROJECT. To ensure that your code can be run on this holdout sample, your code must

be written in a manner that satisfies the requirements listed below. You will find sample code in

the “Files-Term-Project-StataCode” folder entitled “Basic_Regression_OutOfSample_LOGIT_W_TESTS.do”

that you can use to structure your code so that it conforms with these expectations.

• At the beginning of your do-file, define the directory that contains the data using the following local macro syntax: local data_dir "ZZZ" where ZZZ is the directory that contains the data for the project. When I run your code on the out-of-time data, this directory will be replaced with the directory on my computer where the out-of-time data resides.


• All variable transformations and other data cleaning steps must be performed by calling a separate do-file from within your main do-file. This do-file must be named “data_cleaning.do”. To properly score the observations in the out-of-time data, the same variable transformations and data cleaning steps that were used in the construction of the initial model must also be applied to the out-of-time observations. When I run your code to score the out-of-time data, “data_cleaning.do” will be called to perform the necessary steps.
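
A main do-file that follows the two requirements above might be laid out along these lines. This is a minimal sketch, not the instructor's sample code: the directory path, the log file name, and the data file name are placeholders chosen for illustration.

```stata
* Main do-file (sketch). Path and file names below are placeholders.

* Requirement 1: data directory defined once, at the top, as a local macro
local data_dir "C:/econ6113/project_data"

log using "term_project.log", replace text

* Load the project data from the directory defined above
use "`data_dir'/development_sample.dta", clear   // hypothetical file name

* Requirement 2: all variable transformations and cleaning steps live in
* a separate do-file that is called from the main do-file
do "data_cleaning.do"

* ... model estimation, scoring, and validation would follow here ...

log close
```

Two Stata behaviors are worth keeping in mind with this layout. First, do looks for “data_cleaning.do” in the current working directory unless a full path is given. Second, local macros are private to the do-file that defines them, so data_dir is not visible inside “data_cleaning.do”; if the cleaning step needed the directory, one option would be to pass it as an argument (do "data_cleaning.do" "`data_dir'") and read it in the called file with args.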

Plagiarism Detection

As a condition of taking this course, all required papers may be subject to submission for textual

similarity review to Turnitin.com via Canvas for the detection of plagiarism. All submitted

papers will be included as source documents in the Turnitin.com reference database solely for

the purpose of detecting plagiarism of such papers. No student papers will be submitted to

Turnitin.com without a student’s written consent and permission. If a student does not provide

such written consent and permission, the instructor may: (i) require a short reflection paper on

research methodology; (ii) require a draft bibliography prior to submission of the final paper; or

(iii) require the cover page and first cited page of each reference source to be photocopied and

submitted with the final paper.

Grading

The grade that you receive for your term project will depend on 4 components: content (20%),

execution (20%), interpretation (30%), and writing (30%). The term project grading rubric, which

can be found on Canvas in the “Files-Term Project” folder, contains detailed information on how

scores for each of these components are determined.
