J. Hirschberg Computational Economics and Business ECON30025/ ECOM90020
Tutorials
Week  Tutorial Topic                            Tutorial Exercises
 1    Intro to SAS program
 2    Creating and combining data series        Data manipulation in SAS
 3    Graphics methods overview                 Representing data
 4    Linear algebra basics                     Practice examples
 5    Multivariate Analysis in IML              Numerical Methods in IML
 6    Analysis of the Australian IO tables      Input Output in IML
 7    Assignment #1 Review
 8    Linear programming in Excel               Example solutions for LP problems
 9    Doing DEA                                 Example DEA setups
10    Quantile Regression                       Using IML and Proc Quantreg
11    The Simulation of an inventory            A Queuing problem
12    Assignment #2 Review

Contents
Tutorials
1. Introduction to SAS
2. Creating and combining data series
3. Graphics Methods Overview
4. Linear algebra basics
5. Programming Multivariate Statistics in IML
6. Analysis of the Australian IO tables
8. Linear Programming
9. Data Envelopment Analysis
10. LAV and Quantile Regression
11. Inventory simulation

1. Introduction to SAS

From this tutorial you should have the following takeaways:

Learn the features of the SAS computer system:

o To run SAS code from your computer using myuniapps.
 - Finding files on your computer.
 - Pasting code into the editor.
 - Cutting and pasting results into a Word file.
o To be able to perform the following steps in the interactive mode.
 - Changing editor styles.
 - Reading a log file.
 - Examining data sets.
 - Using SAS output files.
o To navigate the SAS help files.
 - Finding the details of the syntax of different Proc commands.
 - Locating the function commands that can be used in Data steps.
 - When in doubt, using the internet to search for answers.
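A first program to try once SAS is running. This is a sketch with made-up toy data, not part of the course programs; it exercises both kinds of SAS step, writes notes to the log, and creates a data set you can examine:

```sas
/* A minimal first SAS program: create a tiny data set and summarise it. */
data first ;                    /* DATA step: build a data set from in-line data */
  input name $ score ;
  datalines ;
Ann 75
Bob 62
Cat 88
;
run;

proc print data=first ; run;             /* PROC step: list the observations */
proc means data=first ; var score ; run; /* PROC step: summary statistics   */
```

After submitting this, check the log window for the notes the data step produces, and the output window for the results of the two Procs.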


SAS is a proprietary program whose full version can only be accessed via a site licence from
SAS. Usually this would not pose a problem at the University of Melbourne, since we have a licence for all
computers at Melbourne. However, because the limitations on the return to campus mean we cannot use the
labs and tutorial rooms that have the program installed, we need to use the myuniapps website1, which allows
the use of a CITRIX connection to a server on campus. This is the easiest method of access, but it
requires a continuous connection to the university server. It can be accessed from computers with different
operating systems with the appropriate CITRIX receiver software. Be advised that this software has recently
been updated; if you have an older version you will need to update it and restart your computer. At some point
in the use of this method you will need to allow the system to access your computer's files. Later in the
subject you will also find that you can get access to other software such as Eviews, Stata, and Scientific
Workplace, which we will use for some limited examples.2
I have put all the data that needs to be read on a public website as csv files that can be read from
anywhere. SAS will read these directly, so it is not necessary to store them on your computer. The cloud
version of SAS will read them, as will the myuniapps version.
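The same pattern for reading a csv file from a URL is used throughout the tutorials: point a filename at the URL and import it. A minimal sketch, using one of the course data URLs that appears later in these notes:

```sas
/* Read a csv file directly from a URL into a SAS data set.
   The URL below is the footy data file used in Tutorial 2. */
filename csvFile url
 "https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/footy.csv"
 termstr=crlf;

proc import datafile=csvFile out=footy replace dbms=csv; run;
proc contents data=footy; run;   /* check the variables that were read */
```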

This tute is primarily a demonstration to show how these tasks can be done.

1 The myuniapps website can be found at: https://myuniapps.unimelb.edu.au/vpn/index.html Instructions for downloading the
CITRIX software can be found there. This only needs to be done once.
2 Another option for off-campus access to SAS is "SAS University Edition". You can find this at:
https://www.sas.com/en_au/software/university-edition/download-software.html where you can find extensive descriptive videos
on how to download and install this program. It does require a bit of patience. It is more complex than the myuniapps approach
in that it requires that you install a virtual computer on your computer, but once you have done this the program resides on your
computer. In the past I would have said this was the best way; however, SAS has decided to discontinue this approach for students
and move to a cloud-based approach only in August of this year, with the last chance to download this version being April 30, 2021.
The cloud-based option is the "SAS OnDemand" software that can be found at: https://www.sas.com/en_au/software/on-demand-
for-academics/references/getting-started-with-sas-ondemand-for-academics-studio.html . This option runs SAS
Studio, which has a slightly different interface from the SAS I use, which is available from the myuniapps site. The code is the same
and the results are the same as well; only the editor is slightly different. The reliance on the same underlying program
means that all the routines we use here will run in this environment. Even if you have difficulty reading your computer's files you
can cut and paste the programs (they are all text files) into the online source code editor. There are numerous videos of how this
operates and there is online support. To use this approach, you will need to create a profile.

2. Creating and combining data series

From this tutorial you should have the following takeaways:

Understand the two forms of SAS program steps: the DATA step and the PROC step.

o To be able to perform the following steps in a data step.
 - To use only a subset of the observations (rows).
 - To keep or drop a subset of the variables (columns) in the data set.
 - To create multiple datasets with one data step.
 - To combine multiple datasets by stacking them on top of each other.
 - To combine multiple datasets by concatenating or merging them.
 - To reshape a dataset from a wide form with multiple columns to a long form with
   more observations in a single column (as done with the weather data).

o To be able to use some basic PROCs and follow the syntax of the commands.
 - To estimate a regression with automatic dummy variables using Proc glm.
 - To generate multiple plots categorised by a variable.
 - To use a PROC to create a summary data set.



The purpose of this tutorial is to provide some experience with using SAS to create data sets of
varying types for analysis.
You may download the programs for this tutorial from the LMS programs website. There is a zip
file with all the programs we have so far (note I will be updating these as we cover more topics). Alternatively,
you can download individual programs as well.
First, open SAS.

Load the footy_new program from where you have placed the unzipped programs.

Once you have read the program in, it should look like the following.

The parts of the footy_new routine:

1. We read data from a csv file that contains data from the AFL website listing the outcomes of all the
regular season football games from 1970 to 2006. Note this file is located on a website that we can
access from off campus as well as on campus. This means you do not need to copy the data to a
separate location.
2. From this data we create a new set with some transformed data, and we label the variables for later
use.
3. We then demonstrate how we can summarise the data by creating summary data series.
4. Then we read another csv file of weather data as provided by the Australian Bureau of Meteorology
website.
5. This data is available by month (daily data is more difficult to obtain). These observations can then
be used to define a monthly time series, from 1900 to 2008, of rainfall in Melbourne measured in
millimetres.
6. It is then possible to take these two data sets and determine whether the attendance at football games
at the MCG is influenced by the amount of rainfall during the month.

Extra Questions:
1. Try creating some other combinations of the game data such as home goal accuracy or away
team accuracy.
2. Add the month the game was played as another variable to explain changes in the attendance.
To do this you need to add month to the regression.
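One way these questions could be attempted. This is a sketch, not the official answer: the variable names follow the labels in footy_new, and the regression extends the glm step at the end of that program by adding mon to the class and model statements:

```sas
/* Question 1 (sketch): separate home and away kicking accuracy,
   by analogy with the overall accuracy variable in footy_new. */
data footy2 ; set footy1 ;
  h_accuracy = h_g/(h_g+h_b) ;   /* home team goals per shot on goal */
  a_accuracy = a_g/(a_g+a_b) ;   /* away team goals per shot on goal */
run;

/* Question 2 (sketch): add month dummies to the attendance regression
   on the merged data set total created at the end of footy_new. */
proc glm data=total ; class year mon ;
  model avg_attn = rain year mon / solution ;
  lsmeans year mon ; quit ; run;
```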

* footy_new
Read the footy data from a csv file
This data is from the AFL website
;
Title "Footy Analysis" ;
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/footy.csv"
termstr=crlf;

proc import datafile=csvFile out=footy replace dbms=csv; run;
*
Add labels to the names
;
Data footy1 ; set footy ;
*
Create a SASdate variable and determine
the day of the week the game was played.
A measure of kicking accuracy on the day by the ratio of total kicks on goal
to the number of goals.
Also add a format for the date.
;
date = mdy(mon,day,year) ; format date date7. ;
dw = weekday(date) ;
time = hour + min/60 ;
accuracy = t_g/(t_g+t_b) ;

*
Create a new variable that records which team won
and a variable for whether it was a home win or not.
Note the draws have the winner as 'Draw'
;
if a_t > h_t then do; winner = Away ; Home_win = 1 ; end ;
else do; winner = Home ; Home_win = 0 ; end ;
if a_t = h_t then do; winner = 'Draw' ; Home_win = .5 ; end ;
*
Create labels for the variables to identify them later
;
label
Home_win = Flag for a home win
winner = Team that won
dw = "Day of week (1=sun)"
time = Hour and fraction
date = Date the game was played
round = Round in which played
w_marg = Winning Margin
Home = Home team
H_g = Home team goals
H_b = Home team behinds
H_t = Home team total
Away = Away team
A_g = Away team goals
A_b = Away team behinds
A_t = Away team total
Venue = Where played
Attend = Number of people attending
day = Day played
mon = Month played
year = Year Played
hour = Hour started
min = Minute started
t_b = Total number of behinds
t_g = Total number of goals
accuracy = Goals to shots on goal;
run;
*
Sort the data by match winner then year
;
proc sort data=footy1 ; by winner year; run;
*
Create a new data set that provides the number of wins for each team,
the average winning margin, and the maximum of the winning margin.
Note we exclude the draws from this new data set.
;
proc summary data=footy1(where=(winner ~= 'Draw')) ; by winner year; var w_marg ;
output out=tot_club n=N_wins mean=avg_WM max=max_wm; run;

proc sgpanel data=tot_club;
panelby winner / columns=3 ;
series y=n_wins x=year ; run;
*
Read the monthly rainfall in mm data for Melbourne
This data is from the Australian BOM web site.
;
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/rain.csv"
termstr=crlf;

proc import datafile=csvFile out=rain replace dbms=csv; run;

*
Transform the data read by rows to a regular time series by year and month.
;
data rain1 ; set rain ;
array xx m1-m12 ; * Define an array over the monthly variables ;
do mon = 1 to 12 ; * Create a new variable for the month ;
rain = xx[mon] ; * The variable rain is created for the monthly value;
output ;
end;
drop m1--Annual ;
Label
rain = Monthly mm of rain in Melbourne;
run;
*
Create an average attendance data series for the MCG by year and month
;
proc sort data=footy1; by year mon ;
proc summary data=footy1(where=(venue='MCG')) ; by year mon ;
var attend w_marg accuracy;
output out=MCG_attn mean=avg_attn avg_wm avg_accur min(attend)=min_attn ; run;
*
Merge the weather data with the attendance data
;
data total ; merge rain1 mcg_attn(in=i1) ; by year mon ;
if i1 ; * Only keep the observations that match the attendance data ;
run;
*
Run a simple regression to see if rain matters to the average
monthly attendance, the minimum monthly attendance, or kicking accuracy,
where the years are included as dummy variables.
Use the program to compute the implied average value of the dependent variable
by year
;
proc glm data=total ; class year home_win;
model avg_attn min_attn avg_accur = rain year /solution ;
lsmeans year ; quit; run;

3. Graphics Methods Overview


From this tutorial you should have the following takeaways:

Understand the concepts of the appropriate graphic displays of cross-section data, time-series data
and 3-dimensional representations.

o To be able to perform the following steps in the plots of cross-section data.
 - To label the variables in a large data set to help interpret results.
 - To generate distribution plots with histograms, parametric density plots and kernel
   density plots.
 - To generate matrix scatterplots.
 - To superimpose density plots.
 - To plot side-by-side boxplots.
 - To generate new versions of the daily sales as daily sales per customer.
 - To reshape a dataset from a wide form with multiple columns to a long form with
   more observations in a single column, while adding a name variable.
 - To use Proc Format to generate a variable code.
 - To use Proc GLM to estimate the relationship between income and per customer sales.

o To be able to perform the following steps in the plots of time series data.
 - To label the daily data for a particular store for a number of years.
 - To redefine the daily sales as daily per customer sales.
 - To use one of the SAS date-handling functions to identify the day of the week.
 - To plot all the daily values of the per customer beer sales.
 - To summarise the data by week to create a new data set with the average weekly per
   customer sales, and to plot the weekly average beer sales per customer.
 - To use the Proc glm routine to investigate the seasonal and calendar effects in the demand
   for beer.

o To be able to understand the elements of generating three-dimensional plots.
 - To use a data step to generate a data set based on a particular 3-D function.
 - To use Proc g3d to plot this function.
 - To change the perspective of the plot.
 - To generate a data set with incomplete coverage of a 3-D surface.
 - To generate a 3-D pillar plot of the incomplete data.
 - To use Proc g3grid to interpolate values in a 3-D space based on a smoothing
   algorithm.
 - To use Proc g3d to plot the smoothed values.
 - To generate a contour plot of the 3-D shape in two dimensions using Proc gcontour.


This tutorial is designed to demonstrate how you can construct some of the basic plots shown in the notes
that pertain to scatter diagrams and other plots. For this we will use SAS as well as the added feature for
interactive data analysis.

For this tutorial we review graphic representations of two types of data: Cross Section and Time
Series.

Cross Section Data Graphics
Use the program entitled read_graphics_data1.

1. We first read the data for 84 food stores, which provides the average daily sales in each department and
the census data for the population that lives in the same postal area as each store. Since this data
comes from a simple file, we need to add labels to the variables. We call this data set graphics.
2. Then we generate a scatter plot matrix of the sales in five departments with the implied normal
density plot (based on the mean and standard deviation of the data) and a histogram of the data on
the diagonal.
3. Then we compare the normal densities by use of overlapping densities.
4. In this step we perform the same analysis as in #3 except in this case we estimate the densities using
kernel density estimates.
5. In this step we create a new data set called next that redefines the values of wine, meat, dairy,
cheese, and bakery to be the log of the sales per customer. Note that we need to avoid very small
and zero values, so we make them missing values.
6. We repeat the scatterplot matrix plot from before with the histogram and normal plot on the diagonal
where we now use the log of the per customer sales.
7. Create a new data set from the next data to form a longer series with the log of per customer sales
stacked on top of each other in a new variable called sales. We call this dataset graph_long. We
then only keep the store number, the sales, the name of the department, and the number that is
associated with the department. This data set has 420 observations but only 5 variables (note we add
the neighbourhood log income).
8. Create a code or format for the new variable in the dataset graph_long. This procedure allows us to
refer to a code to interpret the numbers in the variable type. The name of the format is deps.
9. Generate side-by-side box-plots of the variables. We do this in two ways using two different SAS
procedures. In the second case we require that the data be sorted by type.
10. Now we can estimate a regression for each type of department. This regression demonstrates how
demand varies with income, a relationship that is often referred to as an Engel curve.

* read_graphics_data1
This routine reads the store data used for the graphics description
into SAS where the data have been saved as a CSV file.

This data is for 84 stores located in Chicago and records the monthly
sales in the departments of the store.
In addition, a set of values have been added from the
US Census that describes the neighbourhood around the store.
More details of this data can be found at:
https://www.chicagobooth.edu/research/kilts/datasets/dominicks
;
options ls=80 ;
Title "CS Graphics Examples" ;
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/graphics_ddf_stores1.csv"
termstr=crlf;

proc import datafile=csvFile out=graphics replace dbms=csv; run;

data graphics; set graphics;
*
Define the labels for the variables
;
c = 1 ;
label
AGE60 = "% aged 60 or over"
AGE9 = "% aged 9 or under"
BAKERY = "Bakery Sales"
BEER = "Beer Sales"
BULK = "Bulk Food Sales"
CAMERA = "Camera Sales"
CHEESE = "Cheese Case"
CITY = "Location"
CONVFOOD = "Convenience Food"
COSMETIC = "Cosmetics sales"
CUSTCOUN = "Customer Count"
DAIRY = "Dairy Sales"
DELI = "Deli Services"
DELIEXPR = "Deli Express"
DELISELF = "Deli Self Serv"
DENSITY = "Population density - people per sq. mile"
EDUC = "% University Graduates"
ETHNIC = "% of Blacks and Hispanics"
FISH = "Fish Sales"
FLORAL = "Floral Sales"
FROZEN = "Frozen Sales"
GM = "General Merchandising Sales"
GROCERY = "Grocery Sales"
HABA = "Health and Beauty Aids"
HH3PLUS = "% that have more than 3 person HH"
HH4PLUS = "% that have more than 4 person HH"
HHLARGE = "% that have 5 or more in HH"
HHSINGLE = "% that are single HH"
HSIZE1 = "% that are 1 person HH"
HSIZE2 = "% that are 2 person HH"
HSIZE34 = "% that are 3 and 4 person HH"
HSIZE567 = "% that are 5, 6, and 7 person HH"
HSIZEAVG = "Average HH size"
HVAL150 = "% of Houses with value > 150,000"
HVAL200 = "% of Houses with value > 200,000"
HVALMEAN = "Mean House value in 000s"
INCOME = "Log Median HH income"
INCSIGMA = "SD of average Income (level)"
JEWELRY = "Jewellery Sales"
LAT = "Latitude"
LONG = "Longitude"
MEAT = "Meat Sales"
MEATFROZ = "Meat-Frozen Sales"
MORTGAGE = "% with mortgages"
NOCAR = "% with no car"
NUM = "Number of observations"
NWHITE = "% who are non-white"
NWRKCH = "% Not Working with Children"
NWRKCH17 = "% Not Working Women with Children 17 +"
NWRKCH5 = "% Not Working Women with Children 5 and under"
PHOTOFIN = "Photo Finish Sales"
POVERTY = "% under poverty income level"
PRODUCE = "Produce Sales (fruit and Veg)"
RETIRED = "% that are retired"
SALADBAR = "Salad Bar Sales"
SHOPINDX = "% Car and Single House HH (shopper index)"
SHPAVID = "% Avid Shoppers"
SHPBIRD = "% of Shopping Birds"
SHPCONS = "% Constrained Shoppers"
SHPHURR = "% Hurried Shoppers"
SHPKSTR = "% of Shopping Strangers"
SHPUNFT = "% Unfettered Shoppers"
SINGLE = "% that are single"
SINHOUSE = "% HH that are Single Houses (Detached)"
SPIRITS = "spirits Sales"
STORE = "Store Number"
TELEPHN = "% with telephones"
TOT_SALE = "Total average daily sales "
UNEMP = "% that are unemployed"
VIDEO = "Video Sales"
VIDEOREN = "Video Rentals"
WEEKVOL = "Average Weekly sales volume"
WINE = "Wine Sales"
WORKWOM = "% of HH with Women that Work"
WRKCH = "% Working Women with Children"
WRKCH17 = "% Working Women with Children 17 +"
WRKCH5 = "% Working Women with Children 5 and under"
ZIP = "Zip Code for store location";
run;
*
Plot the scatterplot matrix of Dairy, Meat, Cheese, Bakery, and the customer count
where each store department is listed with a separate variable.
;
Title2 "Scatterplot Matrix of original data" ;
proc sgscatter data=graphics ;
matrix wine dairy meat cheese bakery / diagonal=(histogram normal);
run;

Title2 "The Normal Densities" ;
proc sgplot data=graphics ;
density wine/ legendlabel='wine';
density dairy / legendlabel='dairy';
density meat / legendlabel='meat';
density cheese / legendlabel='cheese';
density bakery / legendlabel='bakery'; run;

Title2 "The Kernel Densities" ;
proc sgplot data=graphics ;
density wine / type=kernel(weight=quadratic) legendlabel='dairy';
density dairy / type=kernel(weight=quadratic) legendlabel='dairy';
density meat / type=kernel(weight=quadratic) legendlabel='meat';
density cheese / type=kernel(weight=quadratic) legendlabel='cheese';
density bakery / type=kernel(weight=quadratic) legendlabel='bakery';

run;
*
Redefine the values to indicate sales per customer in a new data set called next
;
data next ; set graphics ;
*
Use an array statement to define a temporary name for the variables in the list
;
array xx wine meat dairy cheese bakery ;
*
Cycle over the variables, divide them by the customer count, and take logs
to create a variable that is the log of the sales per customer.
Set the cases where the sales are less than 1 to a missing value
;
do over xx ;
if xx > 1 then xx = log(xx / custcoun) ; else xx = . ;
end;
run;
*
Produce a scatterplot of the per customer sales
;
Title2 "Scatterplot Matrix of the log of per customer sales" ;
proc sgscatter data=next ;
matrix wine dairy meat cheese bakery / diagonal=(histogram normal);
run;
*
Re-structure the data by Creating a long version of the data with one variable
for the log sales per customer
;
data graph_long ; set next ;
length d_n1-d_n5 $ 6. ;
array dept cheese wine bakery dairy meat ;
array d_name $ d_n1-d_n5 ;
retain d_n1-d_n5 ("cheese" "wine" "bakery" "dairy" "meat" ) ;
do type = 1 to 5 ;
sales = dept[type] ;
department = d_name[type];
output ;
end ;
label
sales = log of per customer sales
department = Name of Department
type = numeric value for department ;
keep store sales department type income; run;
*
Create a format for the type variable that translates the number to a
name for each department
;
proc format ;
value deps
1 = cheese
2 = wine
3 = bakery
4 = dairy
5 = meat ; run;

*
Plot side by side boxplots of the log of per customer department sales
using the type of department and a format for the names of the departments.
;
title2 "Side-by_side Box-plots" ;
proc sgplot data=graph_long ; format type deps. ;
hbox sales/group=type ;
xaxis label = 'Log Sales per Customer'; run;

*
Plot overlapping kernel densities for the log of per customer department sales
;
title2 "Kernel Density Plots" ;
proc sgplot data=graph_long ; format type deps. ;
density sales/group=type type=kernel ;
xaxis label = 'Log Sales per Customer'; run;
*
Use the boxplot routine that computes summary statistics as well as plotting
side-by-side boxplots.
Note to use this method we need to sort the data to collect the
results for each Department.
;
proc sort data=graph_long ; by type ; run;

title2 "Side-by_side Box-plots with Statistics" ;
proc boxplot data=graph_long ; format type deps. ;
plot sales*type ;
inset min mean max stddev /
header = 'Overall Statistics'
pos = tm;
insetgroup min mean max /
header = 'By Department';
run;
*
Using the log of income, estimate an Engel curve for these sales
;
Title2 "Estimates of the Engel Curves" ;
proc glm data=graph_long ; by type ; format type deps. ;
model sales = income income*income / solution ; quit; run;

Time Series Graphics
This program uses data similar to the cross-section data: the daily sales by one of the DDF
stores (number 12) for 2,735 days from Sept 7, 1989 to April 30, 1997.
The steps we follow in this program are very similar to the ones we used in the cross-section
program.
For this tutorial you are to run this program and interpret the results as they pertain to the demand for
beer.

* ts_graphics
This routine reads time series data for one store
into SAS where the data have been saved as a CSV file.

This data is daily data for store #12 located in Chicago and records the sales
by day for 2735 days from Sept 7, 1989 to April 30, 1997 (there are a few gaps).
More details of this data can be found at:
https://www.chicagobooth.edu/research/kilts/datasets/dominicks
;

options ls=80 ;
Title "Time Series Graphics Examples" ;
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/DDF_TS_data.csv"
termstr=crlf;

proc import datafile=csvFile out=ccount replace dbms=csv; run;

data ccount ; set ccount;
array xx beer spirits wine meat dairy cheese bakery ;
array yy beer1 spirits1 wine1 meat1 dairy1 cheese1 bakery1 ;
*
Cycle over the variables and divide them by the customer count
to create variables that are sales per customer.
This creates a new set of variables called:
beer1 spirits1 wine1 meat1 dairy1 cheese1 bakery1
Set the cases where the sales are less than 1 to a missing value
;
do over xx ;
if xx > 1 then yy = (xx / custcoun) ; else yy = . ;
end;
d_o_w = weekday(n_date); * Create a day of the week variable =1 for sunday;
label
GROCERY = "Grocery Sales"
DAIRY = "Dairy Sales"
DAIRY1 = "Dairy Sales per customer"
FROZEN = "Frozen Sales"
MEAT = "Meat Sales"
MEAT1 = "Meat Sales per customer"
MEATFROZ = "Meat-Frozen Sales"
FISH = "Fish Sales"
PRODUCE = "Produce Sales"
SALADBAR = "Salad Bar Sales"
FLORAL = "Floral Sales"
DELI = "Deli Services"
CHEESE = "Cheese Case Sales"
CHEESE1 = "Cheese Case Sales per customer"
BAKERY = "Bakery Sales"
BAKERY1 = "Bakery Sales per customer"
PHARMACY = "Pharmacy sales"
GM = "General Merchandising Sales"
HABA = "Health and Beauty Aids"
PHOTOFIN = "Photo Finish Sales"
BEER = "Beer Sales"
BEER1 = "Beer Sales per customer"
WINE = "Wine Sales"
WINE1 = "Wine Sales per customer"
SPIRITS = "Spirits Sales"
SPIRITS1 = "Spirits Sales per customer"
CUSTCOUN = "Customer Count"
yr = Year
mon = Month
day = Day
n_date = Date value
d_o_w = "Day of week Sunday=1 ..."
;
run;
*
First try a plot of the raw per customer beer sales
;
title2 "Plot of daily values";
proc sgplot data=ccount ;
series y=beer1 x=n_date; run;
*
To smooth this out we can compute the average per customer sales
by week
;
proc sort data=ccount ; by week ; run;
proc summary data=ccount ; by week ;
var beer1 spirits1 wine1 meat1 dairy1 cheese1 bakery1 ;
output out=w_ccount mean= ; run;
symbol l=1 i=join ;
*
now plot the new data that has the weekly average values
;
title2 "Plot of weekly averages" ;
proc gplot data=w_ccount;
plot beer1*week; run;
*
Look for the calendar influences by month
by running a regression on the dummies for months
;
Title "Investigating the Calendar effects" ;
proc glm data=ccount; class mon ; id n_date ;
model beer1 = mon /solution ;
lsmeans mon ; run;
*
Look for the calendar influences by day of the week
by running a regression on the dummies for day of week
;
proc glm data=ccount; class d_o_w; id n_date ;
model beer1 = d_o_w /solution ;
lsmeans d_o_w; run;
*
Look for the calendar influences by day of the week and month
by running a regression on the dummies for day of week and those for each month
;
proc glm data=ccount; class mon d_o_w; id n_date ;
model beer1 = mon d_o_w /solution ;
lsmeans mon d_o_w; run;

Three dimensional plots
In this section you will use the program three_d_plot to perform the following steps:
1. Generate a set of values for the z variable based on a known function in a data set called hat. In this
example the hat function is defined by

   z = sin( sqrt( x² + y² ) )

where x and y range from -5 to 5 in increments of 0.25.
2. Once generated, look at the data. It should have 1,681 observations.
3. Using the g3d proc, generate a 3-D surface image of the data.
4. Use the tilt= and rotate= options to change the viewpoint of the plot. For example, the SAS
statement plot y*x=z/rotate=20 tilt=60; will rotate the x and y axes by 20 degrees and tilt the x-y
plane by 60 degrees. The rotation can be up to 360 degrees and the tilt up to 90 degrees.
5. Generate another set of values for z using the same function, but this time randomly choose the
values of x and y at which to evaluate it. These values will be in the data set nums. This data set has only 100
observations.
6. Use the statement scatter Y * X = Z / shape='pillar' color='CX7C95CA' zmin=0;
in the proc g3d step to create a 3-D plot of the data. (Note the plot statement will not work for this data
since it does not cover a complete grid of observations.)
7. Use the g3grid procedure to generate predicted values that fill in the data, creating a new dataset called
spline.
8. Plot the smoothed version of the data with g3d and note how this differs from the plot we made from the
original data.
9. As an alternative, we can plot this data using a contour plot in two dimensions, in a similar
manner to a topographical map that shows the elevations of terrain. To generate this we use the proc
gcontour routine.
* three_d_plot
3d Plot routine in SAS
;

/* Set the graphics environment */
goptions reset=all border cback=white htitle=12pt;

/* Create the data set HAT */
data hat;
do x=-5 to 5 by 0.25;
do y=-5 to 5 by 0.25;
z=sin(sqrt(x*x+y*y));
output;
end;
end;
run;

/* Define a title for the plot */
title1 'Surface Plot of HAT Data Set';

/* Create the plot */
proc g3d data=hat;
plot y*x=z;
run;
*
Demonstrate that we can sample 100 values from the Hat shape
and use the interpolation routines to find the underlying function
;
title Demonstration of the smoothing of a sample of data;
data nums;
keep x y z;
do i=1 to 100;
x=10*ranuni(33)-5; * ranuni calls a uniform random number generator ;
y=10*ranuni(35)-5;
z=sin(sqrt(x*x+y*y));
output;
end;
run;
*
look at the pillar plot of the sampled data
;
title2 The pillar plot of the sampled data ;
proc g3d data=nums;
scatter Y * X = Z / shape='pillar' color='CX7C95CA' zmin=0;
run;

*
Use g3grid to interpolate values between the observations
;
proc g3grid data=nums out=spline;
grid y*x=z / spline
smooth = .05
axis1=-5 to 5 by .25
axis2=-5 to 5 by .25;
run;

*
We cannot plot the raw data in nums with a plot statement in g3d,
so we use the interpolated data generated by g3grid to generate a surface plot
;
title2 The plot of the splined values ;
proc g3d data=spline;
plot y*x=z;
run;
*
We can also plot the contour plot of the data
;
proc gcontour data=spline;
plot y*x=z / autolabel;
run;
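The idea behind g3grid (predicting surface values between observed points) can also be illustrated outside SAS. The following plain-Python sketch (illustrative only, not part of the tutorial code) evaluates the hat function on the same coarse grid and bilinearly interpolates an off-grid point; g3grid's spline interpolation is more sophisticated, but the principle of filling in a grid from nearby values is the same:

```python
import math

def hat(x, y):
    # the surface used in the HAT data set
    return math.sin(math.sqrt(x * x + y * y))

# coarse grid over [-5, 5] with step 0.25, like the data step above
step = 0.25

def grid_val(i, j):
    # z at the grid node (x, y) = (-5 + i*step, -5 + j*step)
    return hat(-5 + i * step, -5 + j * step)

def interp(x, y):
    """Bilinear interpolation between the four surrounding grid nodes."""
    i = int((x + 5) / step); j = int((y + 5) / step)
    tx = (x + 5) / step - i; ty = (y + 5) / step - j
    z00, z10 = grid_val(i, j), grid_val(i + 1, j)
    z01, z11 = grid_val(i, j + 1), grid_val(i + 1, j + 1)
    return (z00 * (1 - tx) * (1 - ty) + z10 * tx * (1 - ty)
            + z01 * (1 - tx) * ty + z11 * tx * ty)

# at an off-grid point the interpolated value is close to the true surface
x, y = 1.13, -2.41
print(interp(x, y), hat(x, y))
```

At a grid node the interpolation reproduces the observed value exactly; between nodes the error shrinks as the grid step shrinks.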
4. Linear algebra basics

From this tutorial you should have the following takeaways:

o Understand the concepts of the basic linear algebra methods as they apply to the solution of a set of
linear equations.

• To understand the definitions of vectors and matrices.
• To be familiar with the rules of matrix algebra for addition and subtraction.
• To be familiar with the rules of matrix multiplication.
• To know characteristics of matrices: trace, transpose, inverse and determinant.
• To be aware of the properties of matrices such as being square and being symmetric.
• To be able to find the equivalent matrices and vectors to define a system of linear equations.
• To demonstrate how the solution to a system of linear equations can be found using matrix algebra.

o To understand the basic form of the Proc IML procedure in SAS.

• To define matrices and vectors
• To perform matrix algebra computations in Proc IML.
• To display results from the computations from Proc IML procedures.
• To check results from the computations in Proc IML.
• To show how different inversion methods have differing costs.

In this tutorial, we will demonstrate the use of Proc IML to solve some simple linear algebra
problems.

Following the program IML_Example:

* IML_example1
Example of using Proc IML to solve a linear algebra problem.
Solve the problem
3x1 - 2x2 + 5x3 = 20
x2 + 6x3 = 10
10x1 -20x3 = -5
Rewrite in matrix form as Ax = d
;
Title Example of the solution to a set of equations; run;
;
Proc IML;
*
Define the matrix A
;
A = { 3 -2 5, 0 1 6, 10 0 -20} ;
print A ;
*
Compute the inverse of A and print it
;
iA = inv(A);
print iA ;
*
Define the vector d and print it
;
d = {20, 10, -5};
print d;
*
Solve the system of linear equations for the value of x and print it
;
x = iA * d ;
print x;
quit; run ;
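The same system can be cross-checked outside SAS. The following plain-Python sketch (illustrative only, not part of the course materials) solves Ax = d by Gaussian elimination and verifies that the answer reproduces d:

```python
# Solve  3x1 - 2x2 + 5x3 = 20
#             x2 + 6x3 = 10
#       10x1      - 20x3 = -5

def solve(A, d):
    """Solve the square system A x = d by Gaussian elimination with partial pivoting."""
    n = len(A)
    # build the augmented matrix [A | d]
    M = [row[:] + [d[i]] for i, row in enumerate(A)]
    for col in range(n):
        # swap in the row with the largest pivot to limit rounding error
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

A = [[3.0, -2.0, 5.0], [0.0, 1.0, 6.0], [10.0, 0.0, -20.0]]
d = [20.0, 10.0, -5.0]
x = solve(A, d)
# verify: multiplying A by the solution should give back d
check = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
print(x, check)
```

Note this mirrors the IML program: inv(A)*d and a direct elimination give the same x.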


1. Solve the system of linear equations given by:

[A system of five linear equations in the unknowns x1, x2, x3, x4, x5; the equation display was
garbled in the text conversion, so the coefficients are not reproduced here.]


a) Define the matrix A and the vectors x and y needed to write this system in the matrix form
Ax = y.

b) Find the inverse of A.

c) Check the inverse of A by computing A^(-1)A, which should equal the identity matrix.

d) Find the solution vector x.

e) Show that this is a solution.

5. Programming Multivariate Statistics in IML
From this tutorial you should have the following takeaways:

o Understand the programming steps required to estimate a multiple regression.

• To know how to read data into Proc IML to define vectors and matrices.
• To be able to construct the necessary vectors and matrices to estimate regressions.
• To use different matrix commands to estimate the regression parameters.
• To use the SAS procedures to estimate regressions.
• To be aware of the properties of matrices such as being square and being symmetric.

o Understand the programming steps required to estimate principal components.

• To use the SAS routine to compute correlation matrices.
• To perform matrix computations in Proc IML to estimate the covariance and correlation matrices.
• To perform matrix computations in Proc IML to estimate the eigenvalues and eigenvectors of the
correlation matrix
• To demonstrate how to approximate a matrix with a subset of eigenvalues and eigenvectors.
• To generate the principal components based on the correlation matrix.
• To create a new dataset from a matrix computed in the IML routine.
• To use the SAS routine to compute principal components based on the covariance matrix.
• To see how regression results based on principal components have the same fit as the model
fit to the raw data.

Question 1.
Using the data from Tutorial 2 for the grocery stores (read_graphics_data1), estimate a regression of
the log of income on the demand for Bakery Goods by modifying the program IML_reg1.sas on the t-drive.
The parts of the IML routine shown in red will need to be changed so that there is only one x variable and
the y variable is the dependent variable.

* IML_reg1
To run an IML routine to estimate a regression we can use either
program that is already written as we would by using proc reg or
we can write our own program.
;
Title An example of regression using proc IML ;
*
Define the dependent variable and the matrix of regressors
;
data one ;
input y x1 x2;
cards ;
1 2 4
2 0 2

3 4 12
4 6 0
5 8 16
;
run;
*
Use the OLS routine Proc Reg to compute the regression
;
Title Regression results from proc reg; run;
J. Hirschberg Computational Economics and Business ECON30025/ ECOM90020
20
proc reg data = one ;
model y = x1 x2 ;
run;

;*
Alternatively we can compute the least squares regression estimates
using matrix programming commands
;
Title Regression results from IML ;
Proc IML ;
use one ;

read all var{y} into y ;
read all var{x1 x2} into x ;
t = nrow(x) ; * Find the number of observations ;
x = j(t,1)||x; * Concatenate a vector of ones to the X matrix ;
k = ncol(x) ; * Find the number of columns ;
xpx = x`*x ; * Define the X`X matrix of cross products ;
ixpx = inv(xpx) ; * Invert the X`X matrix ;
b = ixpx*x`*y ; * Solve the least squares problem ;
sse = (y-x*b)`*(y-x*b) ; * Compute the sum of the squared errors ;
sig = sse/ (t-k) ; * Divide by the number of degrees of freedom ;
se = sqrt(sig*vecdiag(ixpx)); * Define the standard errors of the parameters ;
t_stat = b / se ; * Compute the t-statistics for each parameter ;
print (b||se||t_stat) ; * Check Results ;

quit;
run;
Then check your results using Proc Reg.
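The matrix steps in the IML program can also be cross-checked outside SAS. Here is a plain-Python sketch (illustrative only, not part of the course code) that forms the same X`X and X`y cross products on the five observations and solves the normal equations X`X b = X`y:

```python
# the data set "one" from the program above
y = [1.0, 2.0, 3.0, 4.0, 5.0]
x1 = [2.0, 0.0, 4.0, 6.0, 8.0]
x2 = [4.0, 2.0, 12.0, 0.0, 16.0]

# X with a column of ones for the intercept (as j(t,1)||x does in IML)
X = [[1.0, u, v] for u, v in zip(x1, x2)]
t, k = len(X), 3

# cross products X'X and X'y
xpx = [[sum(X[i][r] * X[i][c] for i in range(t)) for c in range(k)] for r in range(k)]
xpy = [sum(X[i][r] * y[i] for i in range(t)) for r in range(k)]

def solve(A, d):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(A)
    M = [row[:] + [d[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

b = solve(xpx, xpy)                       # the least squares estimates
resid = [y[i] - sum(X[i][c] * b[c] for c in range(k)) for i in range(t)]
print(b)
```

A quick sanity check on any least squares fit: the residuals are orthogonal to every column of X, i.e. X`(y - Xb) = 0.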
Question 2.
Using the IML routine PCA_Example listed below, replace the quality of life indicators in the
original program with a different set: EMP_RATIO_15, GINI, HDI, LGDP_CAP, and
PRODUCTIVITY.
Which component(s) explain over 85% of the variance in these variables?
Use these variables to explain the level of CO2 with a regression.
How much of the variation in CO2 (as measured by the adjusted R-squared) do we explain with
only the component scores for those components that explain over 85% of the variation in the x's?
What property of the component scores shows up when we compare the parameter estimates from
the models with all the scores and with only a few?

/* PCA_Example
A program to read a file and perform the principal component analysis on the data
using proc IML to generate the eigenvalue decomposition of the correlation matrix.

Read a comma-separated values file from the web
of UN cross country quality of life (QLF) indicators.
*/
Title "PCA Example" ;
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/un_plus.csv"
termstr=crlf;

proc import datafile=csvFile out=un_plus replace dbms=csv; run;
/*
Set up the sas data set with labels for each variable
*/
data un_plus ; set un_plus ;

label
ARTICLES= "Scientific articles per capita"
CO2= "Carbon dioxide emissions per capita, (tonnes), 2011"
EMP_RATIO_15= "Employment to population ratio, (% ages 15 and older), 2013"
EN_SEC= "Gross enrolment ratio, Secondary, (% of secondary school age population), 2008"
EN_TER= "Gross enrolment ratio, Tertiary, (% of tertiary school age population), 2008-2012"
EQ_MATH= "Education quality, Performance of 15-year-old students, Mathematics, 2012"
EQ_READING= "Education quality, Performance of 15-year-old students, Reading, 2012"
EQ_SCIENCE= "Education quality, Performance of 15-year-old students, Science, 2012"
EQ_SEC= "Education quality, Population with at least some secondary education, (% ages 25+)"
FER_2010= "Total fertility rate, (births per woman),2010/2015"
GDP_CAP= "Gross Domestic Product per capita"
GINI= "GINI index (World Bank estimate)"
GR_HDI= "Average annual HDI growth, (%), 1990-2014"
HDI= "Rank of Human Development Index 2013"
HDI_VALUE= "HDI, Value, 2014"
IMMIGRANTS= "Human mobility, Stock of immigrants (% of population), 2013"
INEQ_GEN= "Gender Inequality Index Value, 2014"
INEQ_PALMA= "Income inequality, Palma ratio, 2005-2013"
INEQU_GINI= "Income inequality, Gini coefficient, 2005-2013"
INEQU_QUIN= "Income inequality, Quintile ratio, 2005-2013"
INT_STUDENTS= "Human mobility, International student mobility, (% of total tertiary enrolment)"
INTERNET= "Communication, Internet users, (% of population), 2014"
LGDP_CAP= "Log GDP per capita"
LIFE_EXP= "Healthy life expectancy at birth"
MED_AGE= "Population, Median age, (years), 2015"
MIGRATION= "Human mobility, Net migration rate, (per 1,000 people), 2010/2015"
MORT_INF= "Mortality rates, (per 1,000 live births), Infant, 2013"
POP= "Population, Total, (millions), 2014"
POP_GR_2000= "Population, Average annual growth, 2000/2005"
POP_GR_2010= "Population, Average annual growth, 2010/2015"
POP_OV_65= "Population, Ages 65 and older, (millions), 2014"
POP_URBAN= "Population, Urban, (%), 2014"
PRIS_POP= "Prison population, (per 100,000 people), 2002-2013"
PRODUCTIVITY= "Labour productivity, Output per worker, (2011 PPP $), 2005-2012"
R_N_D= "Research and development expenditure (% of GDP), 2005-2012"
SCH_YR_F= "Mean years of schooling(years), Female, 2014"
SCH_YR_M= "Mean years of schooling(years), Male, 2014"
SEX_RATIO= "Sex ratio at birth, (male to female births), 2010/2015"
SOILSUIT= "Soil fertility"
TEMP= "Geographic temperature average 1961-1990"
TOURISTS= "Human mobility, International inbound tourists, (thousands), 2013"
WOMEN_MPS= "Share of seats in parliament (% held by women), 2014"
;
run;

/*
Compute the Pearson correlation matrix for a subset of the data
forcing only observations with no missing values to be used.
*/
proc corr data=un_plus nomiss cov ;
var EQ_MATH INTERNET ARTICLES R_N_D WOMEN_MPS ;run;
/*
Read the QLF data and perform the analysis
*/
proc IML ;
reset fuzz=1e-10 ; * force IML to print values less than 1e-10 as zeros;
/*
First declare what data set to read
*/
use un_plus(where=(nmiss( EQ_MATH, INTERNET, ARTICLES, R_N_D, WOMEN_MPS) = 0)) ;
/*
Read the variables into a matrix with the name xx and the names of
the countries into a vector called country
Define n as the number of variables and obs as the number of observations.
*/
read all var{EQ_MATH INTERNET ARTICLES R_N_D WOMEN_MPS} into xx [rowname=country];
n = ncol(xx) ; obs = nrow(xx) ;
/*
Compute the correlation and covariance matrix of the data
*/
cr = standard(xx)`*standard(xx) / (obs - 1 ) ;
cov = (xx-j(obs,1,1)*xx[:,])`* (xx-j(obs,1,1)*xx[:,]) / (obs - 1 ) ;
/*
Check the matrix is defined correctly
*/
print " Correlation matrix" cr, " Covariance matrix" cov ;
/*
Use the eigen routine to compute the vector of eigenvalues and the matrix made up of
eigenvectors (note that in the case of a symmetric matrix such as the correlation matrix
the eigenvalues are all real).
*/
call eigen(e_values, e_vec, cr) ;
print "Eigen values = " e_values, " Eigenvectors = " e_vec ;
/*
Proportion of total variance explained
*/
VarExp = cusum(e_values)/sum(e_values);
print "Proportion of variance explained by each eigenvalue, " varexp ;
/*
Check to see if the eigenvectors are orthogonal to each other by creating
a matrix of the inner products of each with each other.
*/
chk = j(n,n,0) ;
do ii = 1 to n ;
do jj = 1 to n ;
chk[ii,jj] = (e_vec[,jj]`*e_vec[,ii]) ;
end;
end;
print "Inner products of the Eigenvectors, " chk ;
/*
The spectral (eigen) decomposition reconstructs a symmetric matrix by pre- and post-multiplying
the diagonal matrix of eigenvalues by the matrix of eigenvectors.
We can construct this and get back the original matrix.
*/
estimate = e_vec*diag(e_values)*e_vec` ;
diff = cr - estimate ;
print "estimate = " estimate, "original = " cr , "difference = " diff;
/*
Another way of computing the estimate is to use a summation
*/
est = j(n,n,0) ;
do ii = 1 to n ;
est = est + e_values[ii,]#e_vec[,ii]*e_vec[,ii]` ;
end;
print "Sum of matrix = " est;
/*
Consequently we can use estimates of the correlation matrix that are based on fewer
variables
*/
est1 = e_values[1,]#e_vec[,1]*e_vec[,1]` ;
est2 = est1 + e_values[2,]#e_vec[,2]*e_vec[,2]` ;
est3 = est2 + e_values[3,]#e_vec[,3]*e_vec[,3]` ;
est4 = est3 + e_values[4,]#e_vec[,4]*e_vec[,4]` ;
est5 = est4 + e_values[5,]#e_vec[,5]*e_vec[,5]` ;
/*
We can measure how close each estimate is to the original correlation matrix
by computing the sum of the squares of the differences of all the elements.
*/
d = j(n,1,0) ;
d[1,] = ssq(cr-est1) ;
d[2,] = ssq(cr-est2) ;
d[3,] = ssq(cr-est3) ;
d[4,] = ssq(cr-est4) ;
d[5,] = ssq(cr-est5) ;

print "The eigenvalues " (e_values`), "The sum of squared differences" (d`) ;
/*
Construct the principal components as the weighted values of the standardized values of the
original data where the weights are the eigenvectors and the data are converted to z-scores
*/
pc = (standard(xx))*e_Vec;

cor_pc = corr(pc) ;

print "Correlation matrix of PCs " cor_pc ;
/*
create two data sets. One with the principal component scores and the other
with the corresponding names of the countries
*/
pc_names = 'prin1' : 'prin5' ;
create pcs from pc[colname=pc_names ] ; append from pc ;
create names from country[colname="country"] ; append from country;

show all ;
quit;
run;
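The reconstruction in est1...est5 above can be seen in miniature with a 2x2 correlation matrix, whose eigen decomposition is known in closed form: for [[1, r], [r, 1]] the eigenvalues are 1+r and 1-r, with eigenvectors (1,1)/sqrt(2) and (1,-1)/sqrt(2). A plain-Python sketch (illustrative only, not part of the course code) shows that summing e_i * v_i v_i' recovers the original matrix exactly:

```python
import math

r = 0.6
cr = [[1.0, r], [r, 1.0]]            # a 2x2 correlation matrix

e_values = [1 + r, 1 - r]            # closed-form eigenvalues
s = 1 / math.sqrt(2)
vecs = [[s, s], [s, -s]]             # the corresponding unit eigenvectors

# rebuild cr as the weighted sum of outer products e_i * v_i v_i',
# exactly as the est1..est5 steps do in the IML program
est = [[0.0, 0.0], [0.0, 0.0]]
for e, v in zip(e_values, vecs):
    for i in range(2):
        for j in range(2):
            est[i][j] += e * v[i] * v[j]

print(est)   # matches cr
```

Dropping the smaller eigenvalue's term leaves a rank-one approximation whose error shrinks as that eigenvalue shrinks, which is why a few components can summarise a highly correlated set of variables.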
/*
Combine the names with the component scores
*/
data pcs ; merge names pcs ; run;

proc corr data=pcs ; var prin1-prin5 ; run;
*
Demonstration that a regression on the original data and
one on the equivalent pcs from the covariance matrix will
result in the same fit of the data.
;
proc princomp cov data=un_plus(where=(nmiss( EQ_MATH, INTERNET, ARTICLES, R_N_D, WOMEN_MPS) = 0))
out=pca_out;
var EQ_MATH INTERNET ARTICLES R_N_D WOMEN_MPS;
run;

ods graphics off ;

proc reg data=pca_out;
model LGDP_CAP = EQ_MATH INTERNET ARTICLES R_N_D WOMEN_MPS ;quit;
run;

proc reg data=pca_out;
model LGDP_CAP = prin1-prin5 ;quit;
run;
*
Note that since the pcs are independent of each other we get the same parameter estimates
when we drop some - a property that we do not get with regular regression
;
proc reg data=pca_out;
model LGDP_CAP = prin1-prin3 ;quit;
run;
6. Analysis of the Australian IO tables


From this tutorial you should have the following takeaways:

o Understand the concepts of the use of Input Output tables.

• To be familiar with the definition of the flow matrix.
• To be able to construct a matrix of technical coefficients from a flow table.
• To be familiar with the Leontief inverse and what it can be used for.
• To understand how one can determine the impact of a change in total production.
• To be able to infer how changes in total production affect employment.
• To understand how the Leontief matrix can be used to show how a change in one sector
influences the demand for the many other sectors in the economy that supply it.
• To be aware of the power of the IO method, but also of its limitations.
• To learn about the detailed nature of the tables available for the Australian economy.
• To use a program such as Proc IML to generate results from a flow table.
• To be aware of the nature of the sector definitions that are available.


The program AU_IO (as given in the notes) reads the full 2004 Australian Input-Output table to
determine the impact on employment of a change in final demand for the output of different sectors of the
economy.

Using this program, determine the direct and indirect impacts of the Covid-19 crisis on
the Australian economy by finding the direct and indirect impacts of a 100% drop in the output of
Accommodation, cafes and restaurants (sector #84), Sport, gambling and recreational services (sector
#107) and Libraries, museums and the arts (sector #106). Also include the impact of a 50% increase in
the output of Scientific research, technical and computer services (sector #97) and Health services
(sector #103). (Note that the only lines in the program you need to change are identified with “*******;”
in the last columns - these lines are highlighted.)

1. Interpret the results of this analysis in terms of direct and indirect impacts. Why do some
sector changes have more impact than others?

2. Re-run the analysis separately on just Accommodation, cafes and restaurants (sector #84) and
Health services (sector #103). Now repeat the analysis with these two together. What happens when you
compare these results?

3. Discuss the strengths and weaknesses of the results you can obtain with this method.

* AU_IO
Routine to read the 2004 Australian Input Output data set into SAS.
;
Title 2004 Australian IO Table ;
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/IO_tables_2004.csv"
termstr=crlf;

proc import datafile=csvFile out=flows replace dbms=csv; run;

*
Add labels for each variable
;
data flows ; set flows ;
sec_id = _n_ ;
label
c0101 = Sheep
c0102 = Grains
c0103 = Beef cattle
c0104 = Dairy cattle
c0105 = Pigs
c0106 = Poultry
c0107 = Other agriculture
c0200 = Services to agriculture, hunting and trapping
c0300 = Forestry and logging
c0400 = Commercial fishing
c1101 = Coal
c1201 = Oil and gas
c1301 = Iron ores
c1302 = Non-ferrous metal ores
c1400 = Other mining
c1500 = Services to mining
c2101 = Meat and meat products
c2102 = Dairy products
c2103 = Fruit and vegetable products
c2104 = Oils and fats
c2105 = Flour mill products and cereal foods
c2106 = Bakery products
c2107 = Confectionery
c2108 = Other food products
c2109 = Soft drinks, cordials and syrups
c2110 = Beer and malt
c2113 = Wine, spirits and tobacco products
c2201 = Textile fibres, yarns and woven fabrics
c2202 = Textile products
c2203 = Knitting mill products
c2204 = Clothing
c2205 = Footwear
c2206 = Leather and leather products
c2301 = Sawmill products
c2302 = Other wood products
c2303 = Pulp, paper and paperboard
c2304 = Paper containers and products
c2401 = Printing and services to printing
c2402 = Publishing, recorded media, etc.
c2501 = Petroleum and coal products
c2502 = Basic chemicals
c2503 = Paints
c2504 = Medicinal and pharmaceutical products, pesticides
c2505 = Soap and detergents
c2506 = Cosmetics and toiletry preparations
c2507 = Other chemical products
c2508 = Rubber products
c2509 = Plastic products
c2601 = Glass and glass products
c2602 = Ceramic products
c2603 = Cement, lime and concrete slurry
c2604 = Plaster and other concrete products
c2605 = Other non-metallic mineral products
c2701 = Iron and steel
c2702 = Basic non-ferrous metal and products
c2703 = Structural metal products
c2704 = Sheet metal products
c2705 = Fabricated metal products
c2801 = Motor vehicles and parts, other transport equipment
c2802 = Ships and boats
c2803 = Railway equipment
c2804 = Aircraft
c2805 = Photographic and scientific equipment
c2806 = Electronic equipment
c2807 = Household appliances
c2808 = Other electrical equipment
c2809 = Agricultural, mining, etc. machinery
c2810 = Other machinery and equipment
c2901 = Prefabricated buildings
c2902 = Furniture
c2903 = Other manufacturing
c3601 = Electricity supply
c3602 = Gas supply
c3701 = Water supply, sewerage and drainage services
c4101 = Residential building
c4102 = Other construction
c4201 = Construction trade services
c4501 = Wholesale trade
c4502 = Wholesale mechanical repairs
c4503 = Other wholesale repairs
c5101 = Retail trade
c5102 = Retail mechanical repairs
c5103 = Other retail repairs
c5701 = Accommodation, cafes and restaurants
c6101 = Road transport
c6201 = Rail, pipeline and other transport
c6301 = Water transport
c6401 = Air and space transport
c6601 = Services to transport, storage
c7101 = Communication services
c7301 = Banking
c7302 = Non-bank finance
c7401 = Insurance
c7501 = Services to finance, investment and insurance
c7701 = Ownership of dwellings
c7702 = Other property services
c7801 = Scientific research, technical and computer services
c7802 = Legal, accounting, marketing and business management services
c7803 = Other business services
c8101 = Government administration
c8201 = Defence
c8401 = Education
c8601 = Health services
c8701 = Community services
c9101 = Motion picture, radio and television services
c9201 = Libraries, museums and the arts
c9301 = Sport, gambling and recreational services
c9501 = Personal services
c9601 = Other services
T4 = Industry Uses
Q1 = Households
Q2 = Government
Q3 = Private
Q4 = Public Enterprise
Q5 = General Government
Q6 = Inventories
Q7 = Exports
T5 = (Q1 to Q7)
T6 = Supply
FT_E = Full-time Employees
PT_E = Part-time Employees
FTE_E = Full-time equivalent Employees
FT_P = Full-time Persons Employed
PT_P = Part-time Persons Employed
FTE_P = Full-time equivalent Persons Employed

;

run;
*
Create a format to label the rows of the data
;
proc format ; value sector
1 = 'c0101 Sheep '
2 = 'c0102 Grains'
3 = 'c0103 Beef cattle'
4 = 'c0104 Dairy cattle'
5 = 'c0105 Pigs'
6 = 'c0106 Poultry'
7 = 'c0107 Other agriculture'
8 = 'c0200 Services to agriculture, hunting and trapping'
9 = 'c0300 Forestry and logging'
10 = 'c0400 Commercial fishing'
11 = 'c1101 Coal'
12 = 'c1201 Oil and gas'
13 = 'c1301 Iron ores'
14 = 'c1302 Non-ferrous metal ores'
15 = 'c1400 Other mining'
16 = 'c1500 Services to mining'
17 = 'c2101 Meat and meat products'
18 = 'c2102 Dairy products'
19 = 'c2103 Fruit and vegetable products'
20 = 'c2104 Oils and fats'
21 = 'c2105 Flour mill products and cereal foods'
22 = 'c2106 Bakery products'
23 = 'c2107 Confectionery'
24 = 'c2108 Other food products'
25 = 'c2109 Soft drinks, cordials and syrups'
26 = 'c2110 Beer and malt'
27 = 'c2113 Wine, spirits and tobacco products'
28 = 'c2201 Textile fibres, yarns and woven fabrics'
29 = 'c2202 Textile products'
30 = 'c2203 Knitting mill products'
31 = 'c2204 Clothing'
32 = 'c2205 Footwear'
33 = 'c2206 Leather and leather products'
34 = 'c2301 Sawmill products'
35 = 'c2302 Other wood products'
36 = 'c2303 Pulp, paper and paperboard'
37 = 'c2304 Paper containers and products'
38 = 'c2401 Printing and services to printing'
39 = 'c2402 Publishing, recorded media, etc.'
40 = 'c2501 Petroleum and coal products'
41 = 'c2502 Basic chemicals'
42 = 'c2503 Paints'
43 = 'c2504 Medicinal and pharmaceutical products, pesticides'
44 = 'c2505 Soap and detergents'
45 = 'c2506 Cosmetics and toiletry preparations'
46 = 'c2507 Other chemical products'
47 = 'c2508 Rubber products'
48 = 'c2509 Plastic products'
49 = 'c2601 Glass and glass products'
50 = 'c2602 Ceramic products'
51 = 'c2603 Cement, lime and concrete slurry'
52 = 'c2604 Plaster and other concrete products'
53 = 'c2605 Other non-metallic mineral products'
54 = 'c2701 Iron and steel'
55 = 'c2702 Basic non-ferrous metal and products'
56 = 'c2703 Structural metal products'
57 = 'c2704 Sheet metal products'
58 = 'c2705 Fabricated metal products'
59 = 'c2801 Motor vehicles and parts, other transport equipment'
60 = 'c2802 Ships and boats'
61 = 'c2803 Railway equipment'
62 = 'c2804 Aircraft'
63 = 'c2805 Photographic and scientific equipment'
64 = 'c2806 Electronic equipment'
65 = 'c2807 Household appliances'
66 = 'c2808 Other electrical equipment'
67 = 'c2809 Agricultural, mining, etc. machinery'
68 = 'c2810 Other machinery and equipment'
69 = 'c2901 Prefabricated buildings'
70 = 'c2902 Furniture'
71 = 'c2903 Other manufacturing'
72 = 'c3601 Electricity supply'
73 = 'c3602 Gas supply'
74 = 'c3701 Water supply, sewerage and drainage services'
75 = 'c4101 Residential building'
76 = 'c4102 Other construction'
77 = 'c4201 Construction trade services'
78 = 'c4501 Wholesale trade'
79 = 'c4502 Wholesale mechanical repairs'
80 = 'c4503 Other wholesale repairs'
81 = 'c5101 Retail trade'
82 = 'c5102 Retail mechanical repairs'
83 = 'c5103 Other retail repairs'
84 = 'c5701 Accommodation, cafes and restaurants'
85 = 'c6101 Road transport'
86 = 'c6201 Rail, pipeline and other transport'
87 = 'c6301 Water transport'
88 = 'c6401 Air and space transport'
89 = 'c6601 Services to transport, storage'
90 = 'c7101 Communication services'
91 = 'c7301 Banking'
92 = 'c7302 Non-bank finance'
93 = 'c7401 Insurance'
94 = 'c7501 Services to finance, investment and insurance'
95 = 'c7701 Ownership of dwellings'
96 = 'c7702 Other property services'
97 = 'c7801 Scientific research, technical and computer services'
98 = 'c7802 Legal, accounting, marketing and business management services'
99 = 'c7803 Other business services'
100 = 'c8101 Government administration'
101 = 'c8201 Defence'
102 = 'c8401 Education'
103 = 'c8601 Health services'
104 = 'c8701 Community services'
105 = 'c9101 Motion picture, radio and television services'
106 = 'c9201 Libraries, museums and the arts'
107 = 'c9301 Sport, gambling and recreational services'
108 = 'c9501 Personal services'
109 = 'c9601 Other services'

;
run;

*
Create a data set of the interindustry flows only
;
data inter; set flows ;
if _n_ < 110 ; * use only the first 109 observations ;
keep c0101--c9601; run;
*
Create a data set of the final demand for each sector and total output
;
data final; set flows ;
if _n_ < 110 ; * use only the first 109 observations ;
keep t5 t6 ;
run;

*
Create a data set of the full-time equivalent employees
;
data fte ; set flows ;
if _n_ < 110 ; * use only the first 109 obs (the rest are missing for this );
keep fte_e ;
run;

*
Read the flow table into IML and create the matrix of technical Coefficients
;
options ps=150 center ;

Proc IML ;
use inter ;
read all into X ; * the inter industry flow matrix ;

use final ;
read all into f ;

tt = f[,2]; * the total output is the 2nd column of f;
fd = f[,1]; * the final demand is the 1st column of f;

use fte ;
read all into emp ;
nsector = nrow(x) ; * Define the number of sectors ;
s_id = (1:nsector)` ; * Define a vector to label the sectors ;
*
Create the technology matrix
;
a = x * diag(1/tt) ; * post multiply by the diagonal matrix of the inverses;
*
Print out the first ten rows and columns to check on the result
;
print "First 10 rows and columns of technology matrix" ,
(s_id[1:10]) [format=sector.] (x[1:10,1:10]), (tt[1:10]), (a[1:10,1:10]) ;
*
Compute the Leontief inverse matrix
;
li_a = inv( i( nsector ) - a ) ; * use the function i() to generate an identity matrix ;
*
Print out the first ten rows and columns of the Leontief inverse to check on the result
;
print "First 10 rows and columns of Leontief Inverse matrix" ,
(s_id[1:10]) [format=sector.], (li_a[1:10,1:10]) ;
*
Compute the average number of FTE Employees per $million of output and print the first
10 rows
;
r = emp/tt ;
print "Average number of FTE Employees per $million of output" ,
(s_id[1:10]) [format=sector.], (r[1:10,1]) ;
*
Estimate the impact of a %change in the demand for output of sectors listed
in vector nn with the proportional changes used in p_change
;
nn = {84 81 }; * The number of the sector to change *******;
p_change = {-1.00 -.50 }; * the proportional changes *******;
ncase = ncol(nn) ;

ff = j(nsector,1,0) ;
ff[nn,1] = tt[nn,1]#p_change` ; * define the proportion of output fall ;


nx = li_a * ff ; * compute the change in the output vector ;
tot_nx = nx[nn,1] ; * select the industries for the total impact of the change ;
tot_all = sum(nx) ; * Total change in output over all sectors ;
tot_emp = sum(nx#r) ; * compute the total impact of the change in employment ;
sec_emp = r[nn,]#ff[nn,1] ; * sector level results for employment changes ;

print / "Value of Impacts in $1,000" ;
print "Impact of " (100#p_change`) "% change of:" ,
(nn`) [format=sector.] (ff[nn,1]);
print "Total Sector Impact of " , (nn`) [format=sector.] (tot_nx) ,
"With an economy wide impact of" , (sum(nx) );

print / "Impacts in number of Employees" ;
print "Direct number of Employees Sector Impact of " ,(nn`) [format=sector.] (sec_emp),
"Total indirect impact of " (( nx#r )[nn,1]) ,
"With the total Direct Employee impact of", (sum(sec_emp));
print "Economy Wide number of Employees Impact of " , tot_emp;
Print "Thus, for every 1 job loss in these industries there are"
( (tot_emp/(sum(sec_emp))) - 1 )" additional job losses over all. " ;

quit;
run;
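The core calculation in the program, x = (I - A)^(-1) f, can be illustrated with a hypothetical two-sector economy. The plain-Python sketch below uses invented numbers (not the Australian data) and the closed-form 2x2 inverse; it shows how the Leontief inverse converts final demand into total (direct plus indirect) output:

```python
# hypothetical technical coefficients: a[i][j] is the input from sector i
# needed per unit of output of sector j
a = [[0.2, 0.3],
     [0.4, 0.1]]
f = [100.0, 50.0]         # hypothetical final demand by sector

# form (I - A) and invert it with the closed-form 2x2 formula
m = [[1 - a[0][0], -a[0][1]],
     [-a[1][0], 1 - a[1][1]]]
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
li = [[m[1][1] / det, -m[0][1] / det],
      [-m[1][0] / det, m[0][0] / det]]   # the Leontief inverse

# total output required to deliver f: x = (I - A)^(-1) f
x = [li[0][0] * f[0] + li[0][1] * f[1],
     li[1][0] * f[0] + li[1][1] * f[1]]

# check the accounting identity: output = intermediate use + final demand
check = [a[0][0] * x[0] + a[0][1] * x[1] + f[0],
         a[1][0] * x[0] + a[1][1] * x[1] + f[1]]
print(x, check)
```

Note that total output (175 and about 133.3 here) exceeds final demand in both sectors: the excess is the indirect output each sector must produce to supply the other, which is exactly what the program scales by the employment ratios r to obtain job impacts.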


options ;
8. Linear Programming

From this tutorial you should have the following takeaways:

• To be able to define the elements of a linear programming problem.
o To specify the objective function as either a max or min.
o To determine the units of the constraint matrix.
o To formulate the constraint matrix in the form of inequalities.
o To realize that the parameters of the problem may not all have meaning.
o To use the fundamental aspect of the non-negativity of the parameters of the problem.
o To be able to interpret the results of the analysis in terms of the objective.
o To be able to interpret the extent to which the constraints are binding or not.
• To estimate the solution of the LP problem.
o To be able to graphically find a solution in the two parameter case.
o To set up the appropriate Proc IML routine to find the solution vector.
o To be able to use a call to an IML routine such as lpsolve
o To use the Excel Solver to find solutions for smaller problems.


Solve the following LP problems.

1. The Electrotech Corporation manufactures two industrial-sized electrical devices: generators and
alternators. Both of these products require wiring and testing during the assembly process. Each generator
requires 2 hours of wiring and 1 hour of testing and can be sold for a $250 profit. Each alternator requires 3
hours of wiring and 2 hours of testing and can be sold for a $150 profit. There are 260 hours of wiring time
and 140 hours of testing time available in the next production period.
a Formulate and find the solution to this problem. (This could be done either with a program or
graphically.)
b Suppose that Electrotech's management decides that they need to make at least 20 generators and at
least 20 alternators. How would you change the problem?
c Suppose that Electrotech can acquire additional wiring time at a very favorable cost. Should it do so?
Why or why not?
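Because an LP optimum always occurs at a corner of the feasible region, the two-variable problem in part (a) can be solved by enumerating corner points. A plain-Python sketch (illustrative only; in the tutorial you would solve this graphically, in Excel Solver, or with a call to lpsolve in IML):

```python
# Maximise 250g + 150a subject to 2g + 3a <= 260 (wiring),
# g + 2a <= 140 (testing), g >= 0, a >= 0.
from itertools import combinations

# each constraint as (cg, ca, rhs), meaning cg*g + ca*a <= rhs;
# the non-negativity bounds are written in the same form
cons = [(2.0, 3.0, 260.0), (1.0, 2.0, 140.0),
        (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]

def feasible(g, a):
    return all(cg * g + ca * a <= rhs + 1e-9 for cg, ca, rhs in cons)

best = None
for (c1g, c1a, r1), (c2g, c2a, r2) in combinations(cons, 2):
    det = c1g * c2a - c1a * c2g
    if abs(det) < 1e-12:
        continue                      # the boundary lines are parallel
    g = (r1 * c2a - c1a * r2) / det   # intersect the two boundary lines
    a = (c1g * r2 - r1 * c2g) / det   # (Cramer's rule)
    if feasible(g, a):                # keep only feasible corner points
        profit = 250 * g + 150 * a
        if best is None or profit > best[0]:
            best = (profit, g, a)

print(best)   # (optimal profit, generators, alternators)
```

Checking which corner is optimal, and which constraints hold with equality there, also answers part (c): a constraint that is binding at the optimum is the one worth relaxing.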

2. A car manufacturer in Europe can produce three types of cars: bugs, superbugs and vans.
The profits from these are 1000 for the bug, 1500 for the superbug and 2000 for the van.
It takes 15 hrs to build the bug engine, 18 hrs for the superbug engine, and 20 hrs for the van engine.
The bodies of these cars take 15 hrs for the bug, 19 hrs for the superbug and 30 hrs for the van.
To assemble the cars takes 10 minutes for the bug, 20 for the superbug and 25 for the van.
The engine works has a capacity of 10,000 labour-hours a week. The body works has 15,000 labour-hours a
week. The assembly line has 168 hours available each week.
a Formulate and solve the LP problem.
b Which shop is worth expanding?

9. Data Envelopment Analysis

From this tutorial you should have the following takeaways:

• To understand how the IML routine that computes the DEA results sets up the LP problem to be solved.
• To specify an appropriate DEA model.
• To interpret the results of the routine.
• To interpret the analysis of returns to scale that can be found from DEA.
• To be able to save results for use as input to a post-DEA analysis.
• To understand the range of applications to which DEA can be applied.
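Before turning to the full IML routine, it helps to see what DEA computes in the simplest possible case. With a single input, a single output and constant returns to scale, the input-oriented efficiency score has a closed form: each DMU's output/input ratio divided by the best ratio in the sample. The DMU data below are made up for illustration and have nothing to do with the Elect_Gen panel.

```python
# Input-oriented, constant-returns DEA with one input and one output:
#   efficiency_i = (output_i / input_i) / max_j (output_j / input_j)
# The frontier is the ray through the DMU with the best ratio, and the
# score says how far each DMU's input could shrink at unchanged output.

# Hypothetical DMUs: name -> (output, input)
dmus = {"A": (100, 20), "B": (150, 50), "C": (240, 40), "D": (90, 45)}

best_ratio = max(out / inp for out, inp in dmus.values())
eff = {name: (out / inp) / best_ratio for name, (out, inp) in dmus.items()}
for name, e in sorted(eff.items()):
    print(f"{name}: efficiency = {e:.3f}")
```

Here DMU C defines the frontier and scores 1, while B scores 0.5, meaning it could in principle produce its output with half its input. With several inputs and outputs there is no closed form, which is why the tutorial's routine solves an LP (via lpsolve) for every DMU; variable returns to scale adds the convexity row that appears as the 'E' relation and the J(1,N,1) row in the code below.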

1. Using the IML program Tute9_DEA.sas and the data on Elect_Gen.csv run a DEA analysis. These
data are for an international panel of electric generating firms.3 The output is measured in megawatts per
month and the capacity is in megawatts per hour, labour in number of full time employees, fuel in dollar
values and cost in dollar values per month. The table below provides the descriptive statistics for the data
used in this tutorial.

Variable Label N Mean Std Dev Minimum Maximum
output Output of Energy (megawatts per month) 131 11721.85 10160.68 248.18 46965.63
capacity Capacity input (megawatts per hour) 131 2769.56 2403.62 86.00 12309.00
labour Labour input (full time employees) 131 617.82 553.17 31.00 3346.13
fuel Fuel input ($) 131 121.11 102.40 3.20 492.59
cost Total costs ($) 131 390.07 339.54 9.78 1920.26

a. Find the set of efficient producers assuming variable returns to scale and input minimizing. Plot the
values using the efficiency score as the vertical axis and the level of output as the horizontal.

b. Find the set of efficient producers assuming constant returns to scale and input minimizing. Plot the
values using the efficiency score as the vertical axis and the level of output as the horizontal.

c. Merge the results of the two cases and plot the difference between the efficiency under the
assumption of variable returns to scale and the efficiency under the assumption of constant returns to scale
on the y-axis against the level of output on the x-axis, using a log scale for output. From this plot
determine the level of output that may indicate declining returns to scale.

d. As an alternative measure of efficiency compute the ratio of output to cost. Compare this measure to
the measures we have found from the DEA analysis.


/* Tute9_DEA.sas

Read a comma-separated values file from the web
of UN cross country quality of life (QLF) indicators.
*/
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/Elect_Gen.csv"

termstr=crlf;

3 Note that in this routine we define a macro called DEA that creates data sets whose names end in either 0 or 1
to indicate whether the analysis assumed constant or variable returns to scale. The %let statements allow you to
vary the data and the type of analysis the DEA performs.

proc import datafile=csvFile out=gen replace dbms=csv; run;
/*
Set up the sas data set with labels for each variable
*/
data gen1 ; set gen ; length c_number $ 6;
dmu = _n_ ; * create a firm number ;
label
output = Output of Energy (megawatts per month)
capacity = Capacity input (megawatts per hour)
labour = Labour input (full time employees)
Fuel = Fuel input ($)
capp = price of capital ($)
labp = price of labour ($)
fuelp = price of fuel ($)
cost = Total costs ($)
;
run;

proc means data=gen1;
var output capacity labour fuel cost; run;

*-------------------------------------------------------------------------------------------------;
%let outputs = output ; ** List of outputs ;
%let inputs = capacity labour fuel; ** List of inputs ;
%let datause = gen1 ; ** Name of dataset to use ;
%let names = dmu ; ** DMU numbers on the input file ;
%let cnames = name ; ** Alphanumeric DMU name on the input file ;
%let constant = 0 ; ** set to 1 for constant returns to scale problem ;
*-------------------------------------------------------------------------------------------------;
%macro DEA;

Title "DEA of &datause with inputs = &inputs";
Title2 "outputs = &outputs and returns &constant " ;

data &datause; set &datause;
c_n = _n_ ;
array xx &outputs &inputs ;
do over xx ; if xx >= 0 ; end ;
run;
*
Define the Proc IML routine as a macro named DEA
;
proc IML ;
*
Routine to perform the DEA estimation
;
constant = &constant ;
*
To read the data for the inputs and the outputs
;
USE &datause ;

READ ALL var{&outputs} INTO OPt [colname=op_names]; * Outputs ;
READ ALL var{&inputs } INTO INt [colname=in_names]; * Inputs ;
READ ALL var{&names} INTO ID1 [colname=id_names]; * DMU numbers ;
K = NCOL(INt); * Number of inputs ;
M = NCOL(OPt); * Number of outputs ;
*
Set the negative, zero and missing values to zero
;
ind1 = (((opt||int) <= 0)[,+]) > 0 ; * indicator vector for bad values;
op = opt-opt#ind1 ; in = int -int#ind1 ;
*
Run as the input min
;
cntl = j(8,1,.) ; cntl[1,] = 1 ; * Specifies a cost minimization ;
;
N = NROW(IN); * Number of DMUs ;

print N[colname={"#_of_DMUs"}] k[colname={"#_of_Inputs"}] m[colname={"#_of_Outputs"}];
*
Divide each column by its mean to scale the data
;
OP = OP/( J( N,1,1) * OP[:,] );
IN = IN/( J( N,1,1) * IN[:,] );
rel = j(m,1,'G')//j(k,1,'L') // 'E' ; * variable returns to scale ;
if constant then rel = j(m,1,'G')//j(k,1,'L') ; * constant returns ;
*
Construct a new input vector after the initial efficiency scores
are computed. 1st time through compute the regular results.
;
flag_a = j(n,1,1) ; * define inclusion flag ;
*
Cycle over the dmus for the DEA results
;
if constant = 0 then A = OP`// IN`// J(1,N,1) ; * variable returns ;
else A = OP`// IN` ; * constant returns ;
do ii = 1 to N ;

c = {1} || j(1,N,0) ;

if constant = 0 then aa = ( j(m,1,0) // -in[ii,]`//{0} ) || a ; * variable returns ;
else aa = ( j(m,1,0) // -in[ii,]` ) || a ; * constant returns ;

if constant = 0 then b = op[ii,]` // j(k,1,0) // { 1 } ; * variable returns ;
else b = op[ii,]` // j(k,1,0) ; * constant returns ;
u = j(1,n+1,1) ;

call lpsolve(rc, value ,activ, dual, reducost, c, aa, b, cntl, rel,, ) ;

if value = 0 then eff = . ; else eff = value ;

if rc > 0 then do; print "termination reason = " rc; eff = . ; end;

teff = teff // eff ;

act_tot = act_tot // (id1[ii,]||eff||(activ`)) ;

end;

teff = teff ||j(n,1,constant)|| id1 || opt || int ; * append the returns flag, IDs, and the raw data ;

names1 = {"eff&constant" "ifconstant"}||id_names || op_names || in_names ;
names2 = id_names ;
*
Write out the efficiency measures and the full set of estimated weights
;
create teff&constant from teff[colname=names1] ;
setout teff&constant; append from teff ;
create act_tot&constant from act_tot ;
setout act_tot&constant; append from act_tot ;

quit; run;
*
Merge the efficiency data with the original data set
;
data new&constant ; merge &datause teff&constant(in=i1) ; by &names ;
if i1 ; run;
*
Find the efficient generators
;
proc print data=new&constant(where=(eff&constant=1)) ; id &names ; run;
%mend DEA;
%DEA; * run the DEA macro ;

10. LAV and Quantile Regression.

From this tutorial you should have the following takeaways:

• To understand the operation of the LAV call in IML and how its estimates compare with those of OLS.
• To interpret the results of a quantile regression routine.
• To generate and interpret the quantile plots in a univariate model.
• To interpret the coefficient specific quantile plots.

1.
a. We read in the data for the electricity generating firms used in Tute9 so that we can model how
cost per unit of output varies with the input prices. Create new variables that are the logs of the input
prices, of output, and of the per-output cost, then regress the log of the per-output cost on the log prices
and log output by both OLS and LAV. Do we find much difference between the parameter estimates for the
median (LAV) and those for the mean (OLS)? Plot the residuals using a density plot; what differences do
you see?

b. Use the log per-unit cost as the dependent variable and the log of output as the single regressor,
and use the proc quantreg routine to map out how the intercept and the coefficient on log output vary by
quantile, and hence how the different quantiles influence the log of per-unit cost.

c. Using proc quantreg find how the coefficients for the regressors we used in part a vary by quantile
of the dependent variable. Plot out the coefficients by quantile for quantiles from .1 to .9 by .1.
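Underlying proc quantreg is the check (pinball) loss: for quantile τ the fitted constant minimizes Σ ρ_τ(y_i − c), and the minimizer is always attained at one of the observations. The sketch below, with made-up data and our own helper name `check_loss`, shows the τ = 0.5 fit reproducing the median and the tail quantiles moving toward the extremes.

```python
def check_loss(y, c, tau):
    """Koenker-Bassett check loss: tau*(y-c) if y >= c, else (tau-1)*(y-c)."""
    return sum((tau if yi >= c else tau - 1) * (yi - c) for yi in y)

y = [1, 2, 3, 4, 100]

# The minimizing constant can be found among the observations themselves.
fit = {tau: min(y, key=lambda c: check_loss(y, c, tau))
       for tau in (0.1, 0.5, 0.9)}
print(fit)
```

For τ = 0.5 the fitted constant is the median (3), while τ = 0.1 and τ = 0.9 pick out the lower and upper order statistics (1 and 100). proc quantreg solves the multivariate analogue, tracing each regression coefficient across τ, which is what the quantile plots in parts (b) and (c) display.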

The routine to perform the first question is listed below.

/* Tute10.sas

Read a comma-separated values file from the web
of the Electric Generators data.
*/
filename csvFile url
"https://www.online.fbe.unimelb.edu.au/t_drive/ECOM/ECOM90020/data/Elect_Gen.csv"
termstr=crlf;

proc import datafile=csvFile out=gen replace dbms=csv; run;
/*
Set up the sas data set with labels for each variable
*/
data gen1 ; set gen ;
dmu = _n_ ; * create a firm number ;
c_p_out = cost/output ;
lcost = log(cost) ;
llabp = log(labp) ;
lfuelp = log(fuelp) ;
lcapp = log(capp) ;
lc_p_out = log(c_p_out) ;
loutput = log(output) ;
if nmiss(of lcapp llabp lfuelp loutput lcost) = 0 ;
label
c_p_out = Cost per output
output = Output of Energy (megawatts per month)
capacity = Capacity input (megawatts per hour)
labour = Labour input (full time employees)
Fuel = Fuel input ($)
capp = price of capital ($)
labp = price of labour ($)
fuelp = price of fuel ($)
cost = Total costs ($)
;
run;
*
Adding the LAV call to the regression routine we can find both sets of results
;
%let depv= lc_p_out ;
%let regs= lcapp llabp lfuelp loutput ;
Title "LAV and OLS of &depv = &regs";
Proc IML ;

use gen1(where=(&depv^=0)) ; * Read from the gen1 file ;
read all var{&depv} into y ; * Read the dependent variable into y ;
read all var{&regs} into x [c=r_names] ; * Read the regressors into the X matrix ;
t = nrow(x) ; * Find the number of observations ;
x = j(t,1)||x; * Concatenate a vector of ones to the X matrix ;
r_names = {"intercept"}||r_names ;
k = ncol(x) ; * Find the number of columns ;
xpx = x`*x ; * Define the X`X matrix of cross products ;
ixpx = inv(xpx) ; * Invert the X`X matrix ;
b = ixpx*x`*y ; * Solve the least squares problem ;
sse = (y-x*b)`*(y-x*b) ; * Compute the sum of the squared errors ;
sig = sse/ (t-k) ; * Divide by the number of degrees of freedom ;
se = sqrt(sig*vecdiag(ixpx)); * Define the standard errors of the parameters;
t_stat = b / se ; * Compute the t-statistics for each parameter ;
print "OLS betas",(b||se||t_stat) [c={"Beta" "SE" "t-stat"} r= r_names] ; * Check Results ;
opt_lav = {1000 0 0 0 } ; /* Options for LAV */
call lav(rc, xr, x, y,,opt_lav) ;
if rc ^= 0 then xr[1,2] = . ;
if rc = 1 then xr[2,2] = . ;
print "LAV betas", (xr[1:2,]`||((xr[1,]/xr[2,])`) ) [c={"Beta" "SE" "t-stat"} r= r_names] ;
pred = x*b ; pred = (y-pred) || pred ; * OLS residuals and fitted values ;
create pr from pred [colname={"r_&depv" "p_&depv"}]; append from pred ;
pred = x*(xr[1,]`) ; pred = (y-pred) || pred ; * LAV residuals and fitted values ;
create lpr from pred [colname={"rl_&depv" "pl_&depv"}]; append from pred ;
quit;
run;

data new ; merge gen1 pr lpr; run;
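The IML block above forms the normal equations by hand: b = inv(x`*x)*x`*y. The same arithmetic can be mirrored in a small Python sketch on a tiny exact dataset (y = 1 + 2x, so the recovered coefficients are known in advance); the variable names here are ours, chosen only for the illustration.

```python
# Normal-equation OLS for one regressor plus an intercept, mirroring
# IML's  b = inv(x`*x) * x`*y  on data with an exact linear fit.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # y = 1 + 2*x exactly
n = len(xs)

# Entries of X'X and X'y for X = [1 | x]
sx = sum(xs)
sxx = sum(x * x for x in xs)
sy = sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))

det = n * sxx - sx * sx            # determinant of the 2x2 X'X
b0 = (sxx * sy - sx * sxy) / det   # intercept
b1 = (n * sxy - sx * sy) / det     # slope
print(b0, b1)
```

Because the data lie exactly on a line, the solution recovers the intercept 1 and slope 2, confirming the normal-equation steps that the IML code carries out with matrix operations.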
11. Inventory simulation.
From this tutorial you should have the following takeaways:
o To follow the flow of a simulation.
o To understand how one can use nested do-loops in a data statement to allow for changes in model
parameters.
o To use the rantbl() function to generate a discrete random variable.
o To follow the order of the process simulated in the model.
♦ If the day coincides with the order arrival time, the batch from an earlier order arrives. In that
case, the inventory increases by the order amount.
♦ Demand from customers is drawn from a discrete distribution (demand).
♦ If the demand is no greater than the inventory, the inventory is reduced by the amount of demand
(all demand is met). Otherwise, the inventory is reduced to zero (there is unmet demand).
♦ We also perform the same operation on a variable that captures the "inventory position" (inv_p).
The only difference is that, whenever the inventory position falls to or below a critical level (the
order point: r_p), an order is made and the inventory position increases by the order amount
immediately. This prevents multiple orders from being made on consecutive days when the
inventory level is very low.
♦ When an order is made, the lead time is drawn from a discrete distribution and the arrival day is
recorded (or_time).

o To employ the statistics and graphic routines in SAS to summarize the results of the simulation.


Consider the following inventory problem for a firm that sells TVs.

A firm sells on average 6 sets per day; however, the actual number can vary from day to day. It has
been determined that the number sold follows the distribution below:

# of Units 0 1 2 3 4 5 6 7 8 9 10
Prob .01 .02 .04 .06 .09 .14 .18 .22 .16 .06 .02

The firm buys the sets from a supplier that can deliver them with the following lead times:

Lead time (days) 3 4 5
Prob .2 .6 .2
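As a quick sanity check on the two tables, the mean of the demand distribution should match the stated average of about 6 sets per day, and the mean lead time works out to 4 days. A short Python check:

```python
# Expected values of the two discrete distributions above.
demand_p = [0.01, 0.02, 0.04, 0.06, 0.09, 0.14, 0.18, 0.22, 0.16, 0.06, 0.02]
lead_p = {3: 0.2, 4: 0.6, 5: 0.2}

mean_demand = sum(units * p for units, p in enumerate(demand_p))
mean_lead = sum(days * p for days, p in lead_p.items())
print(mean_demand, mean_lead)
```

Expected demand is 5.98 sets per day and expected lead time is 4 days, so roughly 24 sets are demanded while an order is in transit; this is presumably why the code below scans order points between 25 and 35.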

If we start with an inventory of 50 sets, we can now simulate our inventory and determine the trade-
off between the proportion of demand met and the inventory on hand depending on the order point we use.
The order point is defined as the level of the inventory on hand that triggers a reorder from the supplier.

a) Change the two lines in the code from:

do r_p = 25 to 35 ;
do it = 1 to 1000 ;

to:
do r_p = 38 to 38 ;
do it = 1 to 1 ;

Then overlay plots of the values of sales, the inventory position and the inventory against ii to see how the
program simulates the sales and inventory levels for one month, right after the inven data set is created.

b) Change the program back to the original simulation case. Add a count of the number of days the
firm has no inventory (is out of stock). Provide a set of side-by-side boxplots of the number of out-of-stock
days by order point. What order point is necessary to ensure that the percentage of out-of-stock days is
less than 5%?

c) Describe how you could use this model with the cost of holding an inventory to determine the
optimal order point.

*
* inventory_sim.sas

Program to determine the appropriate order point for an inventory problem
;
Title Order point determination ;

data inven;

do r_p = 25 to 35 ; * Use the span of values from 25 to 35 units as the order point ;
do it = 1 to 1000 ; * Rerun the model a number of times ;

nday = 30 ; * Number of days considered ;
inv = 50 ; * starting inventory ;
inv_p = 50 ; * Inventory position ;
n_o = 50 ; * # ordered ;


do ii = 1 to nday ;

b_inv = inv ; * Beginning of day inventory ;
if ii = or_time then inv = inv + 50 ; * if day the orders arrive from supplier ;
*
Define the demand from customers based on probability table
;
demand = rantbl(9897653,.01,.02,.04,.06,.09,.14,.18,.22,.16,.06,.02) - 1 ;
*
Sell the demanded items when there is sufficient inventory or all inventory
;
if demand <= inv then sales = demand; else sales = inv ;
*
Define the unmet sales as the absolute value of the difference between demand and sales
;
unmet_d = abs(sales - demand) ;
*
Remove the items from inventory
;
inv = inv - sales ;
*
Remove items from inventory position
;
inv_p = inv_p - sales ;
*
When inventory goes below reorder point determine when order will arrive.
Add the new items to the inventory position immediately.
Note the new items cannot be stocked for one day
;
if inv_p <= r_p then do;
or_time = ii + rantbl(9897653,.2,.6,.2) + 3 ;
inv_p = inv_p + 50 ;
end;


output ;
end; * end of daily loop ;

end; * end of iteration loop ;
end; * end of loop for a particular order point ;

label
r_p = Order Point
nday = Number of days
ii = Day
inv = Inventory at end of day
b_inv = Inventory at beginning of day
demand = # of Demand
sales = # sold
unmet_d = Unmet demand
inv_p = Inventory Position;

run;
*
Determine the total sales, demand and inventory unit-days for a 30 day period
by order point and iteration
;
proc summary data=inven; by r_p it ; var inv sales demand unmet_d ;
output out= inv_tot sum= ; run;
*
Compute the service level based on the total sales and demand for the 30 periods
;
data inv_tot; set inv_tot ;
serv_lev = sales / demand ;
p_inv = inv/(30*50) ;
label
serv_lev = Service level
p_inv = Proportion of total Inventory;
run;
*
plot out the distribution of the service levels by order point
;
proc univariate data=inv_tot noprint ; by r_p ; var serv_lev ;
output out=sum_sl median= med mean= mean max=max min=min p1=p1 p5=p5 p95=p95 p99=p99 ; run ;

proc print data=sum_sl ; id r_p ; run;

proc univariate data=inv_tot noprint ; by r_p ; var p_inv ;
output out=sum_inv median= i_med mean= i_mean ; run ;

data sum_tot ; merge sum_inv sum_sl ; by r_p ; run;

proc print data=sum_tot ; id r_p ; var i_med i_mean med mean ; run;

proc sgplot data=inv_tot ;
vbox serv_lev / category=r_p;
run;

proc sgplot data=inv_tot ;
vbox p_inv / category=r_p;
run;

