stata代写-MFIN6210|学霸联盟

stata代写-MFIN6210

时间：2022-03-31

1
MFIN6210 Week 4 (Lab Sessions are in your Zoom/Hybrid/iLab classes, Wednesday’s Lab session: Data analysis I
(manipulation and basic STATA commands). See Week 4 folder on Moodle for additional readings and data. The
Computer Lab classes in Weeks 4 and 5 are in your regular classes using “myAccess”. In Week 3 Moodle I provide
links to STATA and EXCEL https://www.myaccess.unsw.edu.au/ which gives you access to both MICTOSOFT
EXCEL and STATA-SE 14 on your own laptop/computer. It also gives you access at home, as well as at UNSW or on-
line.)

Objective: Manipulate an IPO dataset and estimating some basic STATA commands

The objective of the lab is to create a dataset to model IPO underpricing in STATA. The lab will require you to
(1) create dependent and independent variables in excel using several functions; (2) create a csv file for input
to STATA; (3) estimate standard summary statistic commands; (4) estimate multivariate regression models,
including diagnostic tests; and (5) write a STATA do file to automate the process.

Useful readings:
• Ritter, J and I. Welch, ‘A review of IPO activity, pricing and allocations, Journal of Finance 57, 1795-
1828, 2002 (covered in the lecture).
• Loughran, T, J. Ritter, ‘Why has IPO underpricing changed overtime, Financial Management, autumn, 5-
-37, 2004 (available from the lab folder).

Tasks
1. Download the IPO lab dataset (IPO_lab) from Moodle and open in MS Excel. The first step is to think about
a theoretical framework or model to explain underpricing (Loughran and Ritter’s 2004 paper provides a
useful model). For this exercise, we will simply define underpricing using the % change in stock price at
close, which is given in the spreadsheet. The next step is to create independent variables that proxy
underpricing theories. From the IPO lecture we know several ‘reputation’ or ‘quality’ metrics could capture
offer quality (e.g., information asymmetry related to the lemons problem). I suggest you create an
underwriter dummy and an auditor dummy that capture underwriter and auditor quality. We will also
include other financial characteristics that might be correlated with quality, e.g., offer price or offer size,
profitability prior to offer, and industry characteristics (e.g., high tech). You should think about the
expected correlation between your independent variables and your proxy for underpricing (your
understanding of the theories should inform this).

2. In the STATA_input worksheet, fill in the columns (names are already provided) with the requisite
variables. Use the vlookup function to find the name of the underwriter. I have provided a list of all
underwriters in the ‘Underwriter worksheet’. Auditor name is already given. Create underwriter and
auditor dummies. For the underwriter, it will simply equal=1 if the issue was underwritten, and 0 otherwise
(Hint: use ‘Non underwritten’ to create an IF command). Do the same for auditor, where ‘na’ denotes no
auditor. The high---tech dummy can also be created using the IF function. The remaining columns contain
some profitability data and issue expenses. You will need to create dummies to capture if ebitda is positive,
negative, or no ebitda was reported. Copy and save the worksheet values as a new spreadsheet called
STATA_input_IPOs. Note that your worksheet will contain many nonnumeric characters which STATA will
not be able to read. We will deal with this within STATA.

3. Open STATA. Import your csv file and note the number of observations and variables (should be 357obs
and 15 vars). Run the summary statistics command (statistics---summary statistics). You will note that
2
several columns have no summary stats due to nonnumeric data. For those variables (i.e., first-day-return,
ebitda_before, expenses_percent) you will need to use the destring command (destring var, replace force).
Estimate a pair-wise correlation matrix. Are the correlations between underpricing and the
independent variables consistent with expectations? Note any collinearity issues.

4. Generate a new variable called ln_book_value, defined as ln (offer price * shares outstanding). Use the
gen command (gen var=ln(offer_price*shares_outstanding)).

5. Estimate a regression model of underpricing on offer price, underwriter and auditor dummies, high---
tech dummy, positive ebitda dummy, book value and expenses. Interpret the results.

6. Estimate post regression diagnostics for heteroscekdascity, normality and multicollinearity.

7. Generate a new variable called underpricing_dum (=1 if first_day_return > 0, and 0 otherwise). Use this
variable as the dependent variable in a logit model using the same independent regressors as your OLS
model. Are the results consistent with those for OLS?

8. Create your own STATA do file to automate the above. Open a new do file (from menu: window-do-file-
editor-new-do-file). See useful commands below.
Some very basic commands for STATA
Set more off (turns off the requirement to press the enter key) To
read in your excel file: insheet using "F:\file name.csv"
To replace nonnumeric data: destring variable name, replace force
To generate a new variable, e.g., log book value
gen ln_book_value=ln(offer_price*shares_outstanding)
To generate summary statistics: summarize list variable names To
generate a correlation matrix: pwcorr list variable names
To estimate an ols regression model:
regress dependent variable independent variables
To save your results:
save F:\file name, replace (the replace command replaces any previous version with the same name)

Some useful commands in Excel for summary stats and data manipulation
=AVERAGE(number1, [number2],...)--- calculates the mean value
=STDEV(number1,number2,...) – calculates the standard deviation for a sample
=QUARTILE(array,quart) --- calculates the quartile value, where quart = 0 to 4. Note 1= 1st quartile (25th
percentile), 2=2nd or median (50th percentile), 3= 3rd (75 percentile) and 0=min and 4= max values.
=INTERCEPT(known_y's,known_x's) – calculates the intercept (constant ) in an OLS regression
=SLOPE(known_y's,known_x's) – calculates the slope (beta) in an OLS regression
=PERCENTILE(array,k) – calculates the percentile value, where k = percentile (e.g., 0.9 = 90th percentile. Values are
ranked largest to smallest)
=IF(logical_test, value_if_true, [value_if_false]) = use the IF function
to create dummy variables
=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup]) - Use Vlookup to find values for firms which
you identify using a unique code (e.g., cusip, gvkey, ticker). You can use this command to combine two datasets into
3
one spreadsheet or to search for financial data from a larger set (e.g., ATP data) for a list of firms that make---up your
estimation sample.
=SUMIF(range, criteria, [sum_range]) - Use SUMIF to sum the values in a range that meet criteria that you specify. For
example, suppose that in a column that contains numbers (e.g., ROA) you want to sum only the values for each
industry in a given year, which you will identify using a unique code (e.g., cusip code plus year). This could be useful in
calculating industry averages, where totals are divided by the number of firms in an industry to give a mean value.
Naturally, this could also be achieved in STATA for those that feel more comfortable with using STATA code.