QBUS2310 Management Science
Nam Ho-Nguyen
Discipline of Business Analytics, The University of Sydney
Semester 1, 2023
What is Management Science?
Management: decision-making to control an organization/entity.
Management science refers to analytical/quantitative techniques
that are used in decision-making processes for organizations (including
businesses). Examples:
▶ mathematical optimization (a.k.a. mathematical programming)
▶ game theory
▶ forecasting
▶ statistics
▶ queueing theory.
The course will focus on mathematical optimization/programming,
arguably the most fundamental tool used for quantitative
decision-making.
What is mathematical optimization?
Mathematical optimization: a framework to model quantitative
decision-making problems, taking into account:
▶ data – the information that we have;
▶ constraints – which decisions are allowed;
▶ objectives – how good are different decisions.
Optimization: we want to make some quantity as large or as small as
possible, i.e., we optimize the quantity. Examples:
▶ profits
▶ losses
▶ wasted resources
▶ utilities
▶ clashes.
Three types of analytics
Mathematical optimization falls within prescriptive analytics.
Case study: Birchbox
▶ The company: a monthly subscription service for beauty products.
Each month, subscribers get a custom box of product samples
tailored to their preferences.
▶ The problem: how to tailor boxes for customers according to
individual preferences?
▶ Complexity: millions of customers each month, 100s of different
possible samples, and 5–7 samples in each box.
▶ Naïvely, there are ≈ 10^6 × (100 choose 5) ≈ 7.5 × 10^13 possible combinations.
▶ Impossible to solve by hand.
Case study: Birchbox
The solution (part 1):
▶ Group similar customers; from 1 million to ≈ 4000 groups.
▶ Still ≈ 4000 × (100 choose 5) ≈ 3 × 10^11 possible combinations.
▶ More complexity: not every box combination is possible.
▶ value, weight and volume constraints for each box. Product
selections must obey these (e.g., can’t be too heavy/light).
▶ minimum/maximum number of products in each box.
▶ certain pairs of products must go together (e.g., face scrub + face
cream); certain pairs of products can’t go together.
▶ certain products can’t be sent to certain groups; certain products
must be shipped to certain groups.
Case study: Birchbox
The solution (part 2): use mathematical optimization to solve Birchbox’s
assortment problem.
▶ This means we need to define the decision variables, objective and
constraints mathematically. This is called building an
optimization model.
▶ For example, let xpb = 1 if product p is part of configuration b, and
xpb = 0 otherwise; this is a decision variable.
▶ Let P be the set of all products. Then ∑p∈P xpb ≤ 5 is a constraint.
▶ Once the model is built, we can solve this using software, such as
Gurobi.
▶ Note: there may be more than one way to build the model. Some
approaches are more advantageous.
When you should use optimization models
▶ There is a large number of variables with complex interdependencies.
▶ The interdependencies between variables and objectives are relatively
well understood.
▶ The problem can be framed quantitatively.
What you will learn
▶ The components of an optimization model.
▶ How to build linear optimization models for common
decision-making problems.
▶ How to use state-of-the-art software to solve optimization models.
▶ How to model uncertainty and risk in decision-making problems.
Note: the course will focus on linear optimization models and their
variants. It will only briefly touch on algorithms and skip over nonlinear
models.
How you will learn
Minimum:
▶ Attend lectures and tutorials.
To do well:
▶ Actively engage with lectures, and review coding supplement.
▶ Attempt problems before tutorials, and practice more on your own.
▶ Consult suggested readings for further clarification (if needed).
Assessment
▶ Three assignments (20% each).
▶ Final exam (40%).
Collaboration policy: For assignments, you are only allowed to verbally
discuss high-level ideas with classmates. All detailed model workings and
coding should be done individually. You are not permitted to consult
resources (e.g., internet forums, websites) except for provided lecture and
tutorial content. For exams, there is strictly no collaboration of any kind.
WARNING: Breaches of this policy will be reported as academic
dishonesty to the University, and may carry severe consequences; see
more information on Canvas.
Prerequisites and software
Mathematics:
▶ Linear algebra: vector and matrix multiplication.
▶ Comfort with mathematical reasoning.
▶ Comfort with programming using Python.
Software to be used:
▶ Python – freely available.
▶ Gurobi – this is commercial-grade software (e.g., used in the
Birchbox case study), but freely available to students. Instructions
on Canvas.
Textbooks
There is no official textbook for this course. The slides are
self-contained (except for advanced mathematical proofs, which you
won’t need) and there are plenty of Jupyter demonstrations to
supplement these.
However, the following books may be useful as additional references:
▶ Operations Research: Applications and Algorithms by Wayne L.
Winston.
▶ Applied Mathematical Programming by Stephen P. Bradley, Arnoldo
C. Hax and Thomas L. Magnanti.
▶ Model Building in Mathematical Programming by H. Paul Williams.
▶ Introduction to Linear Optimization by Dimitris Bertsimas and John
N. Tsitsiklis.
▶ Linear Programming by Robert J. Vanderbei.
Warning: the difficulty level, notation, and focus of these books vary,
and will differ from our unit.
Unit outline
Fundamentals of optimization models
Building linear programming models
What can be modelled with linear programming?
Linear programming technical details
Linear programming duality
Network flow models
Integer programming models
Optimization models under uncertainty
1. Fundamentals of optimization models
Common notation
▶ A set S is a collection of objects (numbers, vectors, functions). We
write s ∈ S to indicate that s is from S. For example, S = {0, 1} is
the set consisting of two numbers 0 and 1. We write 0 ∈ S, 1 ∈ S.
▶ Real numbers R. Non-negative real numbers R+ = {r ∈ R : r ≥ 0}.
▶ Integers Z = {…, −2, −1, 0, 1, 2, …}. Non-negative integers
Z+ = {0, 1, 2, …}. Write 3 ∈ Z, −1 ∉ Z+, 1/2 ∉ Z.
▶ Given a positive integer n, we will frequently use the shorthand
[n] := {1, 2, . . . , n} to denote the set of all integers from 1 to n,
inclusive.
▶ Let k be a positive integer, and S be a set. Then Sk denotes the set
of length-k vectors x = (x1, . . . , xk) such that each entry xi ∈ S. So
R5 is the set of length-5 vectors with real-valued entries, and Zm is
the set of length-m vectors with integer-valued entries.
Decisions, constraints and objectives
Every decision-making problem has three components:
▶ Decisions: parameters/variables that we need to determine (e.g.,
the amount to produce, when to schedule, what to do).
▶ Constraints: restrictions that we need to obey (e.g., limits on how
much we spend, limits on how much we produce, demand that we
have to meet).
▶ Objective: what we are trying to achieve (e.g., maximize profits,
minimize costs).
In quantitative decision-making problems these three components
must be framed mathematically/quantitatively:
▶ decisions are variables xj that we have to assign a numerical value.
▶ constraints are inequalities fi(x) ≤ bi built from functions of the
variables, i.e., x must be chosen so that fi(x) ≤ bi.
▶ the objective is a function f (x) that we want to make large or small.
Mathematical optimization model
An optimization model specifies the task that we wish to execute in
terms of decisions, constraint functions and objective function:
“choose variables x to make f (x) as small/large as possible
under the restrictions that fi(x) ≤ bi for all i ∈ [m]”
Written succinctly:
min_x / max_x  f(x)
s.t.  fi(x) ≤ bi,  i ∈ [m].
Note that “fi(x) ≤ bi , i ∈ [m]” means that we need
f1(x) ≤ b1, f2(x) ≤ b2, . . . , fm(x) ≤ bm all to hold.
Mathematical optimization model
We can write an optimization model in one of two ways:
min_x / max_x  f(x)
s.t.  fi(x) ≤ bi,  i ∈ [m]
or
min_x / max_x  { f(x) : fi(x) ≤ bi, i ∈ [m] }.
We will mostly use the first form, but occasionally use the second one.
Remark: mathematical optimization has another common name,
mathematical programming. This is distinct from computer
programming (although we use it in solving the models). Think of
“programming ≈ planning”.
Least squares
You may have seen optimization models before.
Least squares:
min_{x∈Rn}  ∥Ax − b∥2².
▶ Decisions: vectors x ∈ Rn to multiply A by.
▶ Constraints: none! Any vector may be chosen.
▶ Objective: the squared norm distance between Ax and b,
∥Ax − b∥2² = ∑i∈[m] (a⊤i x − bi)²,
where the a⊤i are the rows of A.
This course: we study optimization models where constraints and
objectives are linear. (The least squares objective is non-linear.)
Definitions
Vectors: decision variables are vectors x ∈ Rn, written
x = [x1 x2 ⋯ xn]⊤ or x = (x1, …, xn).
Note: when we write (x1, …, xn) we interpret this as a column vector.
If we write [r1 r2 ⋯ rk] then we are referring to a row vector (but
we will rarely use row vectors).
Each component xi is also called a decision variable; we are overloading
the terminology.
Vectors are also used to store data about our problem: these are fixed
parameters, e.g., b = (b1, . . . , bm).
Definitions
Matrices are typically used to store data. An m × n matrix A is written
A = [a11 a12 ⋯ a1n; a21 a22 ⋯ a2n; ⋮ ⋮ ⋱ ⋮; am1 am2 ⋯ amn] = [a⊤1; ⋮; a⊤m] = [ã1 ⋯ ãn]
where
▶ entries are double-indexed, e.g., a12.
▶ rows are ai = (ai1, …, ain) ∈ Rn. Note that these are column
vectors which are transposed into rows when expressed in A.
▶ columns are ãj = (a1j, …, amj) ∈ Rm.
Some optimization problems arrange decision variables in a matrix.
Operations
You should be comfortable with the following operations:
▶ scalar multiplication αx , αA.
▶ addition x + y , A+ B (dimensions should be the same).
▶ transpose x⊤, A⊤.
▶ matrix-vector product
Ax = ∑j∈[n] ãj xj = (a⊤1 x, …, a⊤m x).
▶ matrix-matrix product AB.
▶ inner product of vectors x⊤y.
Vector norms
Recall the ℓ2-norm:
∥x∥2 = √(x1² + … + xn²) = √(∑j∈[n] xj²) = √(x⊤x).
We will consider two other types of norms, the ℓ1-norm and the ℓ∞-norm:
∥x∥1 = ∑j∈[n] |xj|,  ∥x∥∞ = maxj∈[n] |xj|.
These are just different ways to measure magnitude and distance.
Properties:
∥αx∥ = |α| · ∥x∥
∥x∥ ≥ 0
∥x∥ = 0 ⇐⇒ x = 0
∥x + y∥ ≤ ∥x∥+ ∥y∥.
Linear functions
We will consider functions f : Rn → R, which means inputs to f are
n-dimensional vectors x ∈ Rn, and outputs f (x) are scalars.
A linear function f (x) is one for which
f (αx + βy) = αf (x) + βf (y)
holds for any two vectors x , y and scalars α, β.
A function f (x) is linear if and only if there exists a fixed vector a for
which
f (x) = a⊤x .
▶ f (x) = x1 + x2 − 3x3 is linear.
▶ f (x) = x1x2 + x3² is nonlinear.
▶ f (x) = max{x1, x2} is nonlinear (but “nice”; we will examine this
more later).
▶ f (x) = exp(x1 + x2) is nonlinear.
Mathematical optimization definitions
p∗ = min_{x∈Rn}  f(x)
s.t.  fi(x) ≤ bi,  i ∈ [m].
▶ fi(x) are constraint functions, bi is known as the right-hand side
of constraint i .
▶ f (x) is the objective function.
▶ Any vector x ∈ Rn which satisfies fi(x) ≤ bi for all i ∈ [m] is called
a feasible solution. If fi(x) > bi for any i , then x is infeasible.
▶ p∗ is known as the optimal value.
▶ Any feasible solution x∗ for which f (x∗) = p∗ is called an optimal
solution. Note: f (x∗) ≤ f (x) for any other feasible solution x .
▶ If there is no feasible solution, the problem is infeasible; by
convention we set p∗ = +∞.
▶ If, for any real number r , there exists a feasible solution xr for which
f (xr ) ≤ r , then the problem is unbounded below, and p∗ = −∞.
Maximization problems
By convention, we like to write optimization models as minimization.
However, if we have a maximization problem, then we can easily convert
it to a minimization problem:
p∗ = max_{x∈Rn} {f(x) : fi(x) ≤ bi, i ∈ [m]}
   = − min_{x∈Rn} {−f(x) : fi(x) ≤ bi, i ∈ [m]}.
In other words, we can equivalently consider the minimization problem
where the objective function is −f (x).
The main difference is that we say that the problem is unbounded
above when p∗ = +∞ and infeasible when p∗ = −∞.
When context (max or min) is clear, we sometimes lazily say unbounded
instead of unbounded above/unbounded below.
2. Building linear programming models
Linear programming models
A generic optimization model:
min_{x∈Rn}  f(x)
s.t.  fi(x) ≤ bi,  i ∈ [m].
We say this is a linear programming model if f(x) and the fi(x) are linear
functions. Typically, we denote f(x) = c⊤x, fi(x) = a⊤i x. The model
becomes
min_{x∈Rn}  c⊤x
s.t.  a⊤i x ≤ bi,  i ∈ [m].
▶ It does not matter if we have ≥ or ≤ constraints:
a⊤i x ≥ bi ⇐⇒ −a⊤i x ≤ −bi .
▶ We can also have equality constraints:
a⊤i x = bi ⇐⇒ a⊤i x ≤ bi & a⊤i x ≥ bi .
Building LP models
The key thing we want to learn is how to build LP models for real-world
problems. There are three things to specify when building LP models.
▶ Decisions: list the decision variables, and describe what they mean
in the context of the problem.
▶ Constraints: list all restrictions we need to obey (e.g., cannot
produce more than this amount, cannot spend more than this
amount).
▶ Objective: specify what we are trying to achieve (e.g., maximize
profit, minimize cost).
These should be described mathematically.
Example 1: production planning
A furniture company manufactures two types of products: chairs and
tables. Each product requires certain amounts of raw materials: wood,
metal and plastic. Specifically:
▶ one chair requires 2 units of wood, 0.5 units of metal, and 0.1 units
of plastic.
▶ one table requires 3 units of wood, 0.8 units of metal, and 0.4 units
of plastic.
▶ there are 86 units of wood available, 30 units of metal available, and
9 units of plastic available.
▶ a chair can be sold for $70 profit, and a table can be sold for $110
profit.
Example 1: production planning
Summary of the data:
requirements chair table availability
wood 2 3 86
metal 0.5 0.8 30
plastic 0.1 0.4 9
profits 70 110
Example 1: production planning
We specify the three things.
▶ Decisions: xchair, xtable the number of chairs and tables
(respectively) to manufacture.
▶ Constraints:
▶ we need xchair ≥ 0 and xtable ≥ 0.
▶ wood availability constraint 2xchair + 3xtable ≤ 86.
▶ metal availability constraint 0.5xchair + 0.8xtable ≤ 30.
▶ plastic availability constraint 0.1xchair + 0.4xtable ≤ 9.
▶ Objective: maximize profit 70xchair + 110xtable.
Example 1: production planning
The full linear programming model is
max_x  70xchair + 110xtable
s.t. xchair ≥ 0, xtable ≥ 0
2xchair + 3xtable ≤ 86
0.5xchair + 0.8xtable ≤ 30
0.1xchair + 0.4xtable ≤ 9.
Example 1: production planning
The optimal solution is x∗chair = 14.8, x∗table = 18.8. See the
supplementary Jupyter notebook.
▶ The notebook shows two ways to implement our model, one bad
and one good.
▶ Bad: very specific to the data.
▶ Good: separates the data and the algebraic structure of the model.
▶ define data first in separate structures (e.g., dictionaries, numpy
arrays).
▶ understand the algebraic structure of the model (we will see this
next).
▶ use for loops (or other things) to implement the model with the
algebraic structure.
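For illustration, a minimal gurobipy sketch of the "good", data-separated style (the dictionary and variable names are illustrative, not taken from the unit's notebook):

import gurobipy as gp
from gurobipy import GRB

# Data, kept separate from the algebraic structure of the model.
products = ["chair", "table"]
materials = ["wood", "metal", "plastic"]
profit = {"chair": 70, "table": 110}
avail = {"wood": 86, "metal": 30, "plastic": 9}
req = {("wood", "chair"): 2, ("wood", "table"): 3,
       ("metal", "chair"): 0.5, ("metal", "table"): 0.8,
       ("plastic", "chair"): 0.1, ("plastic", "table"): 0.4}

m = gp.Model("production")
x = m.addVars(products, lb=0.0, name="x")  # x_j >= 0

# One availability constraint per raw material.
m.addConstrs((gp.quicksum(req[i, j] * x[j] for j in products) <= avail[i]
              for i in materials), name="avail")

m.setObjective(gp.quicksum(profit[j] * x[j] for j in products), GRB.MAXIMIZE)
m.optimize()
print({j: x[j].X for j in products})  # expected: chair 14.8, table 18.8

Changing the problem then only means editing the data structures; the loops that build the constraints stay the same.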
Example 1: production planning
Here is a generic statement of a production problem:
▶ Suppose there are n different products enumerated j = 1, . . . , n.
Each product, when sold, makes profit cj ≥ 0.
▶ Each product requires certain amounts of raw materials to produce.
There are m different raw materials, enumerated i = 1, . . . ,m. The
company has on hand bi amounts of raw material i .
▶ Product j requires aij ≥ 0 amount of raw material i to produce.
These amounts are given to us.
Given this information, how much of each product should the company
produce?
Example 1: production planning
▶ Decisions: how much of each product to produce, let xj be the
amount for product j .
▶ Constraints: cannot produce negative amounts:
xj ≥ 0.
Cannot use more than bi of raw material i:
∑j∈[n] aij xj ≤ bi.
▶ Objective: the company wishes to maximize profit:
∑j∈[n] cj xj.
Example 1: production planning
The LP model is written as
max_x  ∑j∈[n] cj xj
s.t.  ∑j∈[n] aij xj ≤ bi  ∀i ∈ [m]
      xj ≥ 0  ∀j ∈ [n].
We can write this more succinctly by defining ai = (ai1, …, ain) for each
i ∈ [m] and c = (c1, …, cn). Then the LP is
max_x  c⊤x
s.t.  a⊤i x ≤ bi  ∀i ∈ [m]
      xj ≥ 0  ∀j ∈ [n].
Example 2: production planning again
Top Brass Trophy Company makes large championship trophies for youth
sports leagues. At the moment, they are planning production for
semester 1 sports: basketball and rugby.
▶ Each basketball trophy has a wood base and an engraved plaque,
and returns $12 in profit.
▶ Rugby trophies are similar except the unit profit is only $9.
▶ The basketball base requires 40cm of wood; the rugby base requires
only 20cm of wood.
▶ At the moment the company has 1750 plaques, and 480m of wood.
What trophies should be produced from these supplies to maximize total
profit assuming that all that are made can be sold?
Example 2: production planning again
Go through the same steps to get this model
max_x  12xbasketball + 9xrugby
s.t.  xbasketball ≥ 0, xrugby ≥ 0
      xbasketball + xrugby ≤ 1750
      40xbasketball + 20xrugby ≤ 48000   (480 m of wood = 48,000 cm).
But this has the exact same structure as Example 1! If we write our code
cleverly, we can simply change the data. See the supplementary Jupyter
notebook.
The optimal solution is x∗basketball = 650, x∗rugby = 1100.
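With code organised this way, Example 2 needs only new data; a sketch (names again illustrative):

products = ["basketball", "rugby"]
materials = ["plaques", "wood_cm"]
profit = {"basketball": 12, "rugby": 9}
avail = {"plaques": 1750, "wood_cm": 48000}  # 480 m of wood, in cm
req = {("plaques", "basketball"): 1, ("plaques", "rugby"): 1,
       ("wood_cm", "basketball"): 40, ("wood_cm", "rugby"): 20}
# The model-building code from Example 1 is reused unchanged.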
Remarks
▶ As currently specified, linear programming models allow fractional
values for the decision variables. In Example 2, we got an integer
solution. But this is coincidental; Example 1 has a fractional
solution.
▶ If we want to restrict to integer solutions, we need to specify this as
a constraint. Such models are called integer linear programs, or
integer programs (IPs). We will study these in more detail later
in the unit.
▶ LPs are easier to solve than IPs, but the algorithms for IPs involve
solving a series of LPs.
Example 3: inventory valuation
Previous example: fractional values are allowed for LPs. Sometimes this
does not make sense. But there are many problems where fractional
values are fine.
Consider the same production company that can make n products from
m raw materials.
▶ Each product sells for profit cj ≥ 0, j ∈ [n].
▶ Product j requires aij ≥ 0 amount of raw material i to produce.
▶ The company has on hand bi amounts of raw material i .
How much is the raw material inventory worth to the company?
Example 3: inventory valuation
▶ Decisions: the marginal value λi of each unit of material i.
▶ Constraints: valuations should be at least 0:
λi ≥ 0.
For each product j, the cumulative value of the raw materials required
should be at least the profit cj:
∑i∈[m] aij λi ≥ cj.
▶ Objective: the total value of the inventory is
∑i∈[m] bi λi.
The company wishes to be conservative, thus we try to minimize
this while obeying the constraints.
Example 3: inventory valuation
The LP model is written as
min_λ  ∑i∈[m] bi λi
s.t.  ∑i∈[m] aij λi ≥ cj  ∀j ∈ [n]
      λi ≥ 0  ∀i ∈ [m].
We can write this more succinctly by defining ãj = (a1j, …, amj) for each
j ∈ [n] and b = (b1, …, bm). Then the LP is
min_λ  b⊤λ
s.t.  ã⊤j λ ≥ cj  ∀j ∈ [n]
      λi ≥ 0  ∀i ∈ [m].
3. What can be modelled with linear programming?
What can be modelled with LP?
Recall the general optimization model
p∗ = min_{x∈Rn}  f(x)
s.t.  fi(x) ≤ bi,  i ∈ [m].
▶ Linear programs require f and the fi to be linear.
▶ Certain (very useful) non-linear models can be converted into an
LP. For example:
min_{x1,x2,x3}  max{3x1 − x2, 2max{x2, 4x2 − 3x1} − x1} − 4x1 + x2
s.t.  3x3 ≤ min{x1, 4 − |x2 − x3|}
      x1 + x2 + x3 = 1
      x1 ≤ 0, x2 free, x3 ≥ 0.
LP modelling Rule 1a: max-type constraints
Suppose we have functions f1, …, fℓ, and a constraint of the form
maxk∈[ℓ] fk(x) ≤ b.
This is equivalent to multiple constraints
fk(x) ≤ b  ∀k ∈ [ℓ].
Note: the maximum must be on the “less-than” side!
LP modelling Rule 1b: min-type constraints
Suppose we have functions f1, …, fℓ, and a constraint of the form
mink∈[ℓ] fk(x) ≥ b.
This is equivalent to multiple constraints
fk(x) ≥ b  ∀k ∈ [ℓ].
Note: the minimum must be on the “greater-than” side!
LP modelling Rule 2a: sums on the “less-than” side
Suppose we have functions f1, …, fℓ, and a constraint of the form
∑k∈[ℓ] fk(x) ≤ b.
This is equivalent to introducing variables t1, …, tℓ and the constraints
∑k∈[ℓ] tk ≤ b
fk(x) ≤ tk  ∀k ∈ [ℓ].
LP modelling Rule 2b: sums on the “greater-than” side
Suppose we have functions f1, …, fℓ, and a constraint of the form
∑k∈[ℓ] fk(x) ≥ b.
This is equivalent to introducing variables t1, …, tℓ and the constraints
∑k∈[ℓ] tk ≥ b
fk(x) ≥ tk  ∀k ∈ [ℓ].
LP modelling Rule 3a: minimizing objectives
Suppose we wish to minimize a nonlinear f(x):
min_x {f(x) : x ∈ X}.
This is equivalent to introducing a variable t and solving
min_{x,t} {t : f(x) ≤ t, x ∈ X}.
We treat f (x)− t ≤ 0 as a constraint and apply the previous rules.
LP modelling Rule 3b: maximizing objectives
Suppose we wish to maximize a nonlinear f(x):
max_x {f(x) : x ∈ X}.
This is equivalent to introducing a variable t and solving
max_{x,t} {t : f(x) ≥ t, x ∈ X}.
We treat f (x)− t ≥ 0 as a constraint and apply the previous rules.
Common functions used in LP
▶ Absolute value: |r| = max{−r, r}, and |r| ≤ s ⇐⇒ −s ≤ r ≤ s.
▶ ℓ1-norm: ∥x∥1 = ∑j∈[n] |xj| is a sum of absolute values. Can
represent
∥x∥1 ≤ t ⇐⇒ t1 + … + tn ≤ t, −tj ≤ xj ≤ tj ∀j ∈ [n].
▶ ℓ∞-norm: ∥x∥∞ = maxj∈[n] |xj| is a maximum of absolute values.
Can represent
∥x∥∞ ≤ t ⇐⇒ tj ≤ t, −tj ≤ xj ≤ tj ∀j ∈ [n].
Example 1 from before
Represent the following program as an LP:
min_{x1,x2,x3}  max{3x1 − x2, 2max{x2, 4x2 − 3x1} − x1} − 4x1 + x2
s.t.  3x3 ≤ min{x1, 4 − |x2 − x3|}
      x1 + x2 + x3 = 1
      x1 ≤ 0, x2 free, x3 ≥ 0.
Example 1 from before
Process the nonlinear constraint:
3x3 ≤ min{x1, 4− |x2 − x3|}.
▶ Apply Rule 1b to the minimum:
3x3 ≤ min{x1, 4 − |x2 − x3|}  ⇐⇒  3x3 ≤ x1 and 3x3 ≤ 4 − |x2 − x3|.
▶ Rearrange the 2nd inequality: |x2 − x3| ≤ 4 − 3x3.
▶ Use the definition of the absolute value:
|x2 − x3| = max{x2 − x3, x3 − x2} ≤ 4 − 3x3.
▶ Apply Rule 1a to the maximum:
max{x2 − x3, x3 − x2} ≤ 4 − 3x3  ⇐⇒  x2 − x3 ≤ 4 − 3x3 and x3 − x2 ≤ 4 − 3x3.
Example 1 from before
Process the nonlinear constraint:
▶ Put everything together:
3x3 ≤ min{x1, 4 − |x2 − x3|}  ⇐⇒  3x3 ≤ x1, x2 + 2x3 ≤ 4, 4x3 − x2 ≤ 4.
▶ Replace the constraint in the program:
min_{x1,x2,x3}  max{3x1 − x2, 2max{x2, 4x2 − 3x1} − x1} − 4x1 + x2
s.t.  3x3 ≤ x1
      x2 + 2x3 ≤ 4
      4x3 − x2 ≤ 4
      x1 + x2 + x3 = 1
      x1 ≤ 0, x2 free, x3 ≥ 0.
Example 1 from before
Process the nonlinear minimizing objective:
▶ Apply Rule 3a to move it to a constraint:
min_{x1,x2,x3,t}  t
s.t.  max{3x1 − x2, 2max{x2, 4x2 − 3x1} − x1} − 4x1 + x2 ≤ t
      3x3 ≤ x1
      x2 + 2x3 ≤ 4
      4x3 − x2 ≤ 4
      x1 + x2 + x3 = 1
      x1 ≤ 0, x2 free, x3 ≥ 0.
Example 1 from before
Process the nonlinear minimizing objective: converted to a constraint
max{3x1 − x2, 2max{x2, 4x2 − 3x1} − x1} − 4x1 + x2 ≤ t.
▶ Apply Rule 1a to the maximum:
3x1 − x2 − 4x1 + x2 ≤ t
2max{x2, 4x2 − 3x1} − x1 − 4x1 + x2 ≤ t
▶ The 1st constraint simplifies to −x1 ≤ t. For the 2nd constraint,
push the coefficient 2 inside the max:
−x1 ≤ t
max{2x2, 8x2 − 6x1} − x1 − 4x1 + x2 ≤ t
▶ Apply Rule 1a again to the maximum:
−x1 ≤ t
2x2 − x1 − 4x1 + x2 ≤ t
8x2 − 6x1 − x1 − 4x1 + x2 ≤ t
Example 1 from before
Process the nonlinear minimizing objective:
▶ Replace the constraint in the program
min_{x1,x2,x3,t}  t
s.t.  −x1 ≤ t
      3x2 − 5x1 ≤ t
      9x2 − 11x1 ≤ t
      3x3 ≤ x1
      x2 + 2x3 ≤ 4
      4x3 − x2 ≤ 4
      x1 + x2 + x3 = 1
      x1 ≤ 0, x2 free, x3 ≥ 0.
Now we’re done, because everything is linear!
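As a sanity check, this final LP can be typed into Gurobi directly; a minimal sketch (model name illustrative):

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("lp_example")
x1 = m.addVar(lb=-GRB.INFINITY, ub=0.0, name="x1")  # x1 <= 0
x2 = m.addVar(lb=-GRB.INFINITY, name="x2")          # x2 free
x3 = m.addVar(lb=0.0, name="x3")                    # x3 >= 0
t = m.addVar(lb=-GRB.INFINITY, name="t")

m.addConstr(-x1 <= t)
m.addConstr(3*x2 - 5*x1 <= t)
m.addConstr(9*x2 - 11*x1 <= t)
m.addConstr(3*x3 <= x1)
m.addConstr(x2 + 2*x3 <= 4)
m.addConstr(4*x3 - x2 <= 4)
m.addConstr(x1 + x2 + x3 == 1)

m.setObjective(t, GRB.MINIMIZE)
m.optimize()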
Example 2: variants of regression
▶ Chebyshev approximation: represent the following as an LP:
min_x ∥Ax − b∥∞.
▶ ℓ1-approximation: represent the following as an LP:
min_x ∥Ax − b∥1.
(A gurobipy sketch of the Chebyshev case follows below.)
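A minimal gurobipy sketch of the Chebyshev case, applying Rule 3a and the ℓ∞ representation above (the data A, b are hypothetical):

import numpy as np
import gurobipy as gp
from gurobipy import GRB

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # hypothetical data
b = np.array([1.0, 2.0, 3.0])
rows, n = A.shape

m = gp.Model("chebyshev")
x = m.addVars(n, lb=-GRB.INFINITY, name="x")
t = m.addVar(lb=0.0, name="t")  # t equals ||Ax - b||_inf at optimality

# -t <= a_i^T x - b_i <= t for every row i.
for i in range(rows):
    resid = gp.quicksum(A[i, j] * x[j] for j in range(n)) - b[i]
    m.addConstr(resid <= t)
    m.addConstr(-t <= resid)

m.setObjective(t, GRB.MINIMIZE)
m.optimize()

The ℓ1 case is analogous: one auxiliary variable tj per residual, with objective ∑j tj.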
Example 3: production planning with holding costs
▶ m resources i ∈ [m], n products j ∈ [n].
▶ Each product j consumes aij units of resource i . Total bi units of
resource i .
▶ Each unit of product j makes cj profit.
▶ Can only sell up to dj units of product j due to demand: if the
company produces too many, it pays a holding cost hj ≥ 0 for each
unsold unit; if it produces too little, it pays a penalty pj ≥ 0 for each
unit of unfulfilled demand.
Example 3: production planning with holding costs
▶ Decisions: how much of each product to produce, let xj be the
amount for product j .
▶ Constraints: cannot produce negative amounts
xj ≥ 0.
Cannot use more than bi of raw material i:
∑j∈[n] aij xj ≤ bi.
Also denoted a⊤i x ≤ bi, where ai ∈ Rn consists of the requirements aij
of resource i for all products j ∈ [n].
Example 3: production planning with holding costs
▶ Objective: the company wishes to maximize profits. Let gj(xj) be
the profit from manufacturing xj units of product j. We wish to
maximize f(x) = ∑j∈[n] gj(xj).
How do we formulate this as a linearly representable function?
If xj ≥ dj then gj(xj) = cjdj − hj(xj − dj).
If xj ≤ dj then gj(xj) = cjxj − pj(dj − xj).
▶ When xj ≥ dj, notice that cjdj − hj(xj − dj) ≤ cjxj − pj(dj − xj), so
gj(xj) = min{cjdj − hj(xj − dj), cjxj − pj(dj − xj)}.
▶ When xj ≤ dj, notice that cjdj − hj(xj − dj) ≥ cjxj − pj(dj − xj), so again
gj(xj) = min{cjdj − hj(xj − dj), cjxj − pj(dj − xj)}.
▶ In either case,
gj(xj) = min{cjdj − hj(xj − dj), cjxj − pj(dj − xj)}.
Example 3: production planning with holding costs
Modelling the objective f(x) = ∑j∈[n] gj(xj). This is a maximizing
objective, so we add the constraint f(x) ≥ t and then maximize t.
▶ Apply Rule 2b to the sum: add auxiliary variables t1, …, tn and
constraints
∑j∈[n] tj ≥ t
gj(xj) ≥ tj  ∀j ∈ [n]
▶ Recall gj(xj) = min{cjdj − hj(xj − dj), cjxj − pj(dj − xj)}. Now apply
Rule 1b to each of the minima:
∑j∈[n] tj ≥ t
cjdj − hj(xj − dj) ≥ tj  ∀j ∈ [n]
cjxj − pj(dj − xj) ≥ tj  ∀j ∈ [n].
Example 3: production planning with holding costs
Put everything together:
max_{x,t,t1,…,tn}  t
s.t.  a⊤i x ≤ bi,  i ∈ [m]
      x ≥ 0
      t ≤ ∑j∈[n] tj
      tj ≤ cjdj − hj(xj − dj),  j ∈ [n]
      tj ≤ cjxj − pj(dj − xj),  j ∈ [n].
Variables tj can be interpreted as the profit (taking into account holding
costs and penalties) of each product j .
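A minimal gurobipy sketch with hypothetical data; here the scalar t is dropped and ∑j tj is maximized directly, which is equivalent:

import gurobipy as gp
from gurobipy import GRB

c = [5.0, 4.0]    # unit profits (hypothetical data)
d = [10.0, 8.0]   # demands
h = [1.0, 0.5]    # holding costs
p = [2.0, 3.0]    # unfulfilled-demand penalties
a = [[2.0, 1.0]]  # a[i][j]: units of resource i per unit of product j
b = [30.0]        # resource availabilities
n, m_res = len(c), len(b)

model = gp.Model("holding")
x = model.addVars(n, lb=0.0, name="x")
t = model.addVars(n, lb=-GRB.INFINITY, name="t")  # per-product profit

for i in range(m_res):
    model.addConstr(gp.quicksum(a[i][j] * x[j] for j in range(n)) <= b[i])
for j in range(n):
    model.addConstr(t[j] <= c[j]*d[j] - h[j]*(x[j] - d[j]))
    model.addConstr(t[j] <= c[j]*x[j] - p[j]*(d[j] - x[j]))

model.setObjective(t.sum(), GRB.MAXIMIZE)
model.optimize()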
Why do these rules work?
Convince yourself of these simple facts:
▶ For a, b, c ∈ R, we have max{a, b} ≤ c ⇐⇒ a ≤ c and b ≤ c.
▶ For a, b, c ∈ R, we have min{a, b} ≥ c ⇐⇒ a ≥ c and b ≥ c.
▶ For a, b ∈ R and c ≥ 0, we have c·min{a, b} = min{ac, bc} and c·max{a, b} = max{ac, bc}.
▶ For a, b ∈ R, we have −min{a, b} = max{−a, −b} and −max{a, b} = min{−a, −b}.
▶ For a1, …, an, b ∈ R, if a1 + … + an ≤ b then we can find t1, …, tn (e.g., tj = aj) such that
tj ≥ aj ∀j ∈ [n] and t1 + … + tn ≤ b.
▶ For a1, …, an, b ∈ R, if a1 + … + an ≥ b then we can find t1, …, tn such that
tj ≤ aj ∀j ∈ [n] and t1 + … + tn ≥ b.
Technical details: linearly representable functions
Definition
The epigraph of a function f(x) is the set
epi(f) := {(x, t) : f(x) ≤ t} ⊆ R^{n+1}.
Definition
A function f(x) is linearly representable if the epigraph can be written
with linear inequalities:
epi(f) = {(x, t) : f(x) ≤ t}
       = {(x, t) : ∃y s.t. c⊤k x + d⊤k y + gk t ≤ hk ∀k ∈ [ℓ]}.
Here the ck, dk are vectors and the gk, hk are scalars. The variables y are
also vectors; they are called auxiliary variables. We want (x, y, t) to satisfy
the linear inequalities, but (x, t) are the variables that we use.
Technical details: linearly representable functions
Theorem
If an optimization model min_x {f(x) : fi(x) ≤ bi, i ∈ [m]} contains only
linearly representable functions, then it can be recast as an LP.
Proof. The constraint f(x) ≤ b can be written as
f(x) ≤ b ⇐⇒ f(x) ≤ t, t = b ⇐⇒ c⊤k x + d⊤k y + gk b ≤ hk ∀k ∈ [ℓ].
A minimizing objective f(x) can be transformed as
min_x {f(x) : Ax ≤ b} = min_{x,t} {t : f(x) ≤ t, Ax ≤ b}
                      = min_{x,y,t} {t : c⊤k x + d⊤k y + gk t ≤ hk ∀k ∈ [ℓ], Ax ≤ b}.
Constraints f (x) ≥ b and maximizing objectives f (x) can be modelled as
long as −f (x) is linearly representable.
Technical details: linearly representable functions
Which functions are linearly representable?
Theorem
Let f1(x), . . . , fℓ(x) be linearly representable functions. Then
▶ linear functions c⊤x are linearly representable.
▶ adding a constant fk(x) + b is linearly representable.
▶ for any non-negative scalars α1, …, αℓ ≥ 0, the function
f(x) = ∑k∈[ℓ] αk fk(x)
is linearly representable.
▶ the maximum of the functions,
f(x) = maxk∈[ℓ] fk(x),
is linearly representable.
Therefore, we can recognise linearly representable functions through these
operations.
4. Linear programming technical details
Matrix form
Consider the constraints a⊤i x ≤ bi for i ∈ [m]. Let A = [a⊤1; ⋮; a⊤m] be
the matrix with rows a⊤i and let b = (b1, …, bm) be the (column) vector
with entries bi. Then the a⊤i x are exactly the entries of Ax.
Notation: Ax ≤ b means every entry of Ax is ≤ the corresponding entry
of b, i.e., a⊤i x ≤ bi for each i ∈ [m]. We define Ax ≥ b, Ax = b
analogously.
▶ More generally, when r and s are vectors of the same size, r ≥ s
states that every entry of r is ≥ the corresponding entry of s.
▶ E.g., x ≥ 0 states that every entry of x must be non-negative.
The LP can be written succinctly as
min_x  c⊤x
s.t.  Ax ≤ b.
Standard form
Three types of linear constraints:
▶ a⊤i x ≤ bi
▶ a⊤i x = bi
▶ a⊤i x ≥ bi .
The ≤ and ≥ types can be converted into = by adding a non-negative
slack variable:
a⊤i x ≤ bi ⇐⇒ a⊤i x + si = bi , si ≥ 0
a⊤i x ≥ bi ⇐⇒ a⊤i x − si = bi , si ≥ 0.
Any free variable xj can be converted into two non-negative variables
with an equality constraint:
xj = x+j − x−j , x+j ≥ 0, x−j ≥ 0.
Standard form
Therefore any linear program
min_x  c⊤x
s.t.  a⊤i x ≤ bi, i ∈ L
      a⊤i x = bi, i ∈ E
      a⊤i x ≥ bi, i ∈ G
can be converted into a standard form
min_{x+,x−,s}  c⊤(x+ − x−)
s.t.  a⊤i (x+ − x−) + si = bi, i ∈ L
      a⊤i (x+ − x−) = bi, i ∈ E
      a⊤i (x+ − x−) − si = bi, i ∈ G
      x+, x−, sL, sG ≥ 0.
Standard form in matrix form
In matrix form:
min_{x+,x−,s}  c⊤(x+ − x−)
s.t.  [ AL  −AL   I    0 ] [ x+ ]   [ bL ]
      [ AE  −AE   0    0 ] [ x− ] = [ bE ]
      [ AG  −AG   0   −I ] [ sL ]   [ bG ]
                           [ sG ]
      x+, x−, sL, sG ≥ 0.
Standard form
If we redefine things appropriately, we can write this as
min_x̄  c̄⊤x̄
s.t.  Āx̄ = b̄
      x̄ ≥ 0.
Standard form is primarily for mathematical convenience to discuss
theoretical ideas, such as duality.
▶ When modelling a problem and implementing in Gurobi, we do not
need to convert it.
▶ The point is that any LP can be recast into standard form (if
needed).
Feasibility problems
Suppose we want to solve the following:
find x such that Ax = b, x ≥ 0.
This is a special case of an LP where c = 0:
p∗ = min_x  0
s.t.  Ax = b
      x ≥ 0.
▶ if there exists such a vector x, then p∗ = 0, and solving the LP will
find such an x.
▶ if there does not exist such an x, then p∗ = +∞.
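A minimal gurobipy sketch of such a feasibility problem (the data A, b are hypothetical):

import gurobipy as gp
from gurobipy import GRB

A = [[1.0, 1.0], [1.0, -1.0]]  # hypothetical data
b = [2.0, 0.0]

m = gp.Model("feasibility")
x = m.addVars(2, lb=0.0, name="x")
m.addConstrs((gp.quicksum(A[i][j] * x[j] for j in range(2)) == b[i]
              for i in range(2)), name="eq")
m.setObjective(0, GRB.MINIMIZE)  # c = 0: we only care about feasibility
m.optimize()
# m.Status == GRB.OPTIMAL    => a feasible x exists (p* = 0)
# m.Status == GRB.INFEASIBLE => no feasible x exists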
Geometric interpretation
Let’s consider
min_x  c⊤x
s.t.  a⊤i x ≤ bi,  i ∈ [m]
What does a linear program “look” like?
▶ Let Hi := {x : a⊤i x ≤ bi} be the set of all vectors x which satisfy the
constraint a⊤i x ≤ bi; Hi is called a halfspace.
[Figure: a halfspace in the (x1, x2) plane.]
Geometric interpretation
The set of all feasible solutions of an LP is then an intersection of
halfspaces, called a polyhedron. Two examples, in 2 and 3 dimensions:
[Figure: polyhedra P in two and three dimensions, with constraint normals a1, …, a5.]
Geometric interpretation
Where are the optimal solutions in the polyhedron? Let’s look at a
concrete example in two dimensions x = (x1, x2):
max_x  c⊤x
s.t.  2x1 + 3x2 ≤ 1
      9x1 + 5x2 ≤ 3
      x1 ≥ 0, x2 ≥ 0.
[Figure: the feasible polyhedron in the (x1, x2) plane, with level lines c⊤x = 0.1, 0.3, 0.5.]
Geometric interpretation
Where are the optimal solutions in the polyhedron? Let’s look at a
concrete example in two dimensions x = (x1, x2):
max_x  c⊤x = x1 + x2
s.t.  2x1 + 3x2 ≤ 1
      9x1 + 5x2 ≤ 3
      x1 ≥ 0, x2 ≥ 0.
[Figure: the optimal value c⊤x = 7/17 ≈ 0.41 is attained at the vertex x∗ = (4/17, 3/17).]
Geometric interpretation
Where are the optimal solutions in the polyhedron? Different objectives
make the optimal solution switch to a different vertex:
max_x  c⊤x = x1 + 4x2
s.t.  2x1 + 3x2 ≤ 1
      9x1 + 5x2 ≤ 3
      x1 ≥ 0, x2 ≥ 0.
[Figure: the optimal value c⊤x = 4/3 ≈ 1.33 is now attained at the vertex x∗ = (0, 1/3).]
Geometric interpretation
Where are the optimal solutions in the polyhedron? Multiple optimal
solutions are also possible:
max_x  c⊤x = 2x1 + 3x2
s.t.  2x1 + 3x2 ≤ 1
      9x1 + 5x2 ≤ 3
      x1 ≥ 0, x2 ≥ 0.
[Figure: the level line c⊤x = 1 coincides with the face 2x1 + 3x2 = 1, so every point on that face is optimal.]
Geometric interpretation
Where are the optimal solutions in the polyhedron? Infeasible
problems happen when we impose constraints that don’t intersect:
max_x  c⊤x
s.t.  2x1 + 3x2 ≤ 1
      9x1 + 5x2 ≤ 3
      x1 ≥ 0, x2 ≥ 0
      5x1 + 15x2 ≥ 6.
[Figure: the constraint 5x1 + 15x2 ≥ 6 does not intersect the region defined by the other constraints, so there are no feasible points.]
Algorithms
The main algorithm to solve LPs is the simplex method.
▶ The idea is to walk from vertex to vertex of the feasible region, always
improving the objective. Stop when no improvement is possible.
▶ Can be computed by hand using basic linear algebra for small size
problems.
▶ Software automates this to solve large problems quickly.
▶ Another class of algorithms commonly used are called interior point
methods.
In this unit we do not study algorithms to solve LPs.
▶ Software is already highly developed.
▶ Knowledge of LP algorithms does not help with modelling (the
majority of the time).
5. Linear programming duality
Upper bounds for an LP
p∗ := min_x  4x1 + x2 + 3x3
s.t.  2x1 + 4x2 = 1
      3x1 − x2 + x3 = 4
      x1, x2, x3 ≥ 0.
How can we derive upper bounds for p∗?
▶ Simple: find any feasible (x1, x2, x3), then p∗ ≤ 4x1 + x2 + 3x3.
Lower bounds for an LP
What about lower bounds?
▶ Multiply 2x1 + 4x2 = 1 by λ1 and 3x1 − x2 + x3 = 4 by λ2, sum and
collect the xj terms together:
λ1 + 4λ2 = λ1(2x1 + 4x2) + λ2(3x1 − x2 + x3)
= (2λ1 + 3λ2)x1 + (4λ1 − λ2)x2 + λ2x3.
▶ Choose λ1, λ2 to satisfy
2λ1 + 3λ2 ≤ 4 = c1, 4λ1 − λ2 ≤ 1 = c2, λ2 ≤ 3 = c3.
▶ Then since x1, x2, x3 ≥ 0, for any feasible (x1, x2, x3) we have
(2λ1 + 3λ2)x1 ≤ 4x1, (4λ1 − λ2)x2 ≤ x2, λ2x3 ≤ 3x3,
hence summing these
(2λ1 + 3λ2)x1 + (4λ1 − λ2)x2 + λ2x3 ≤ 4x1 + x2 + 3x3.
Lower bounds for an LP
▶ Therefore 2x1 + 4x2 = 13x1 − x2 + x3 = 4
x1, x2, x3 ≥ 0
and
2λ1 + 3λ2 ≤ 4
4λ1 − λ2 ≤ 1
λ2 ≤ 3

=⇒ λ1 + 4λ2 ≤ 4x1 + x2 + 3x3.
▶ This holds for any feasible x , so λ1 + 4λ2 ≤ p∗.
Upper and lower bounds
▶ Upper bounds
any x s.t.
2x1 + 4x2 = 1
3x1 − x2 + x3 = 4
x1, x2, x3 ≥ 0
=⇒ p∗ ≤ 4x1 + x2 + 3x3.
▶ Lower bounds
any λ s.t.
2λ1 + 3λ2 ≤ 4
4λ1 − λ2 ≤ 1
λ2 ≤ 3
=⇒ p∗ ≥ λ1 + 4λ2.
▶ Choose (x1, x2, x3) = (0, 1/4, 17/4) to get p∗ ≤ 13.
▶ Choose (λ1, λ2) = (0, 3/4) to get p∗ ≥ 3.
▶ Therefore we deduce 3 ≤ p∗ ≤ 13.
Upper and lower bounds
▶ Upper bounds
any x s.t.
2x1 + 4x2 = 1
3x1 − x2 + x3 = 4
x1, x2, x3 ≥ 0
=⇒ p∗ ≤ 4x1 + x2 + 3x3.
▶ Lower bounds
any λ s.t.
2λ1 + 3λ2 ≤ 4
4λ1 − λ2 ≤ 1
λ2 ≤ 3
=⇒ p∗ ≥ λ1 + 4λ2.
▶ Choose (x1, x2, x3) = (1/2, 0, 5/2) to get p∗ ≤ 19/2.
▶ Choose (λ1, λ2) = (−5/2, 3) to get p∗ ≥ 19/2.
▶ Therefore we can conclude p∗ = 19/2.
Best lower bounds
What is the best lower bound we can generate?
d∗ := max_λ  λ1 + 4λ2
s.t.  2λ1 + 3λ2 ≤ 4
      4λ1 − λ2 ≤ 1
      λ2 ≤ 3.
This is also an LP! We call this the dual LP, and we call the original the
primal LP.
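In gurobipy, solving an LP also produces an optimal dual solution: each constraint object exposes its dual variable through the Pi attribute. A sketch for the primal above:

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("primal")
x = m.addVars(3, lb=0.0, name="x")
c1 = m.addConstr(2*x[0] + 4*x[1] == 1)
c2 = m.addConstr(3*x[0] - x[1] + x[2] == 4)
m.setObjective(4*x[0] + x[1] + 3*x[2], GRB.MINIMIZE)
m.optimize()

print(m.ObjVal)      # should report 9.5 = 19/2
print(c1.Pi, c2.Pi)  # an optimal (lambda1, lambda2)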
General LPs
Suppose we have a completely general LP:
min_x  c⊤x
s.t.  a⊤i x ≤ bi, i ∈ ML
      a⊤i x = bi, i ∈ ME
      a⊤i x ≥ bi, i ∈ MG
      xj ≤ 0, j ∈ NL
      xj free, j ∈ NF
      xj ≥ 0, j ∈ NG
where ML,ME ,MG ⊂ [m] and NL,NF ,NG ⊂ [n] are index partitions. How
do we formulate the dual?
▶ We use a procedure called constraint aggregation.
Deriving the dual via constraint aggregation
Step 1: assign dual variables to each constraint, except sign constraints.
min_x  c⊤x
s.t.  a⊤i x ≤ bi, i ∈ ML   (λi ≤ 0)
      a⊤i x = bi, i ∈ ME   (λi free)
      a⊤i x ≥ bi, i ∈ MG   (λi ≥ 0)
      xj ≤ 0, j ∈ NL
      xj free, j ∈ NF
      xj ≥ 0, j ∈ NG
Signs of dual variables should be the same as inequality directions.
Primal equality constraints correspond to free dual variables.
Deriving the dual via constraint aggregation
Step 2a: aggregate constraints via dual variables, i.e., multiply and sum.
Remember to respect the sign.
λi a⊤i x ≥ biλi, i ∈ ML   (λi ≤ 0)
λi a⊤i x = biλi, i ∈ ME   (λi free)
λi a⊤i x ≥ biλi, i ∈ MG   (λi ≥ 0).
(Notice that the signs of the λi were chosen so that all the inequalities point in
the same direction.) Summing,
∑i∈ML biλi + ∑i∈ME biλi + ∑i∈MG biλi ≤ ∑i∈ML λi a⊤i x + ∑i∈ME λi a⊤i x + ∑i∈MG λi a⊤i x
= ( ∑i∈ML λi ai + ∑i∈ME λi ai + ∑i∈MG λi ai )⊤ x
Deriving the dual via constraint aggregation
Step 2b: for convenience we’ll simplify notation (not always necessary).
Let A be the m × n matrix with rows a⊤1, …, a⊤m (each ai is a row of A),
and let b be the vector with entries b1, …, bm. The aggregated inequality
∑i∈ML biλi + ∑i∈ME biλi + ∑i∈MG biλi ≤ ( ∑i∈ML λi ai + ∑i∈ME λi ai + ∑i∈MG λi ai )⊤ x
then becomes
b⊤λ ≤ (A⊤λ)⊤x,
since ∑i∈ML λi ai + ∑i∈ME λi ai + ∑i∈MG λi ai = A⊤λ.
Deriving the dual via constraint aggregation
Step 3a: match with the primal objective vector. We have
b⊤λ ≤ (A⊤λ)⊤x, but we want b⊤λ ≤ (A⊤λ)⊤x ≤ c⊤x.
To guarantee this, the entries of A⊤λ must be matched with the entries of c.
▶ What are the entries of A⊤λ? Let the columns of A be denoted
ã1, …, ãn. Then entry j of A⊤λ is ã⊤j λ.
▶ Therefore
(A⊤λ)⊤x = ∑j∈[n] (ã⊤j λ)xj = ∑j∈NL (ã⊤j λ)xj + ∑j∈NF (ã⊤j λ)xj + ∑j∈NG (ã⊤j λ)xj
Deriving the dual via constraint aggregation
Step 3b: match with the primal objective vector. We match (inequality
opposite to the sign of xj):
∀j ∈ NL: ã⊤j λ ≥ cj, xj ≤ 0 =⇒ (ã⊤j λ)xj ≤ cjxj
∀j ∈ NF: ã⊤j λ = cj, xj free =⇒ (ã⊤j λ)xj = cjxj
∀j ∈ NG: ã⊤j λ ≤ cj, xj ≥ 0 =⇒ (ã⊤j λ)xj ≤ cjxj.
This implies
(A⊤λ)⊤x = ∑j∈NL (ã⊤j λ)xj + ∑j∈NF (ã⊤j λ)xj + ∑j∈NG (ã⊤j λ)xj
        ≤ ∑j∈NL cjxj + ∑j∈NF cjxj + ∑j∈NG cjxj
        = c⊤x.
Deriving the dual via constraint aggregation
Step 4: summarize the dual information.
Dual:
d∗ = max_λ  b⊤λ
s.t.  λi ≤ 0, i ∈ ML
      λi free, i ∈ ME
      λi ≥ 0, i ∈ MG
      ã⊤j λ ≥ cj, j ∈ NL
      ã⊤j λ = cj, j ∈ NF
      ã⊤j λ ≤ cj, j ∈ NG
Primal:
p∗ = min_x  c⊤x
s.t.  a⊤i x ≤ bi, i ∈ ML
      a⊤i x = bi, i ∈ ME
      a⊤i x ≥ bi, i ∈ MG
      xj ≤ 0, j ∈ NL
      xj free, j ∈ NF
      xj ≥ 0, j ∈ NG.
▶ A constraint of the primal corresponds to a variable in the dual.
▶ A variable in the primal corresponds to a constraint of the dual.
Exercise
Derive the duals for the following LPs:
1. min_x  x1 + 4x2
   s.t.  2x1 + 3x2 ≤ 1
         9x1 + 5x2 = 3
         x1 ≥ 0, x2 ≤ 0
2. min_x  x1 + 4x2
   s.t.  2x1 + 3x2 ≥ 1
         9x1 + 5x2 ≤ 3
         x1 ≤ 0, x2 ≤ 0
3. max_x  x1 + 4x2
   s.t.  2x1 + 3x2 ≤ 1
         9x1 + 5x2 ≤ 3
         x1 ≥ 0, x2 ≥ 0
How do we take the dual of the last maximizing program?
1. Reverse the signs of the dual variables, e.g., 2x1 + 3x2 ≤ 1 gets
assigned λ1 ≥ 0 instead. Then, create an upper (instead of lower)
bound for the objective.
The dual of a maximizing program is a minimizing program.
Dual of a maximizing program
Dual:
d∗ = min_λ  b⊤λ
s.t.  λi ≥ 0, i ∈ ML
      λi free, i ∈ ME
      λi ≤ 0, i ∈ MG
      ã⊤j λ ≤ cj, j ∈ NL
      ã⊤j λ = cj, j ∈ NF
      ã⊤j λ ≥ cj, j ∈ NG
Primal:
p∗ = max_x  c⊤x
s.t.  a⊤i x ≤ bi, i ∈ ML
      a⊤i x = bi, i ∈ ME
      a⊤i x ≥ bi, i ∈ MG
      xj ≤ 0, j ∈ NL
      xj free, j ∈ NF
      xj ≥ 0, j ∈ NG.
▶ A constraint of the primal corresponds to a variable in the dual.
▶ A variable in the primal corresponds to a constraint of the dual.
Matrix algebra exercise
Define
A = [AL; AE; AG] = [ÃL ÃF ÃG],  b = (bL, bE, bG),  c = (cL, cF, cG),  x = (xL, xF, xG)
where
▶ AL is the matrix with rows a⊤i for i ∈ ML, AE is the matrix with
rows a⊤i for i ∈ ME, AG is the matrix with rows a⊤i for i ∈ MG;
and analogous definitions for b.
▶ ÃL is the matrix with columns ãj for j ∈ NL, ÃF is the matrix with
columns ãj for j ∈ NF, ÃG is the matrix with columns ãj for j ∈ NG;
and analogous definitions for c and x.
Write the primal general LP in terms of these definitions, and derive its
dual.
Matrix algebra exercise
Step 1: assign dual variables to each constraint, except sign constraints.
Write the primal as
min_x  c⊤L xL + c⊤F xF + c⊤G xG
s.t.  ALx ≤ bL   (λL ≤ 0)
      AEx = bE   (λE free)
      AGx ≥ bG   (λG ≥ 0)
      xL ≤ 0,
      xF free,
      xG ≥ 0.
Note that xL, xF , xG and λL, λE , λG are vectors.
Matrix algebra exercise
Step 2: aggregate constraints via dual variables, i.e., multiply and sum.
(A⊤L λL)⊤x ≥ b⊤L λL,  (A⊤E λE)⊤x = b⊤E λE,  (A⊤G λG)⊤x ≥ b⊤G λG
=⇒ (A⊤L λL + A⊤E λE + A⊤G λG)⊤ x ≥ b⊤L λL + b⊤E λE + b⊤G λG.
Let λ = (λL, λE, λG). Then
A⊤λ = A⊤L λL + A⊤E λE + A⊤G λG  and  b⊤L λL + b⊤E λE + b⊤G λG = b⊤λ.
We thus have
(A⊤λ)⊤x ≥ b⊤λ.
Matrix algebra exercise
Step 3: match with the primal objective vector. Notice that
(A⊤λ)⊤x = λ⊤Ax = λ⊤[ÃL ÃF ÃG](xL, xF, xG) = λ⊤(ÃLxL + ÃFxF + ÃGxG)
        = (Ã⊤L λ)⊤xL + (Ã⊤F λ)⊤xF + (Ã⊤G λ)⊤xG
We match
Ã⊤L λ ≥ cL,  Ã⊤F λ = cF,  Ã⊤G λ ≤ cG
to ensure
(A⊤λ)⊤x = (Ã⊤L λ)⊤xL + (Ã⊤F λ)⊤xF + (Ã⊤G λ)⊤xG ≤ c⊤L xL + c⊤F xF + c⊤G xG.
Matrix algebra exercise
Step 4: summarize the dual information.
Dual:
d∗ = max_λ  b⊤L λL + b⊤E λE + b⊤G λG
s.t.  λL ≤ 0,
      λE free,
      λG ≥ 0,
      Ã⊤L λ ≥ cL,
      Ã⊤F λ = cF,
      Ã⊤G λ ≤ cG
Primal:
p∗ = min_x  c⊤L xL + c⊤F xF + c⊤G xG
s.t.  ALx ≤ bL,
      AEx = bE,
      AGx ≥ bG,
      xL ≤ 0,
      xF free,
      xG ≥ 0.
Exercise
Derive the duals for the following LPs:
min_x  {c⊤x : Ax = b, x ≥ 0}
min_x  {c⊤x : Ax ≥ b, x ≥ 0}
max_{x,z}  {c⊤x + d⊤z : Ax + Bz ≤ b, x ≥ 0, z free}
Hint: rather than simply applying the formula from the previous exercise,
try and derive it systematically following the same steps.
Duality theorems
In the results below we use the convention that the primal is the
“minimization” problem, and the dual is the “maximization” (but in
general these are interchangeable).
Theorem (Weak duality)
Let p∗ and d∗ be the primal and dual optimal values. Then p∗ ≥ d∗.
Consequently:
▶ if the primal is unbounded below, then p∗ = d∗ = −∞, and the dual
is infeasible.
▶ if the dual is unbounded above, then p∗ = d∗ =∞, and the primal
is infeasible.
Also, for any primal feasible x and dual feasible λ, we have c⊤x ≥ b⊤λ.
Theorem (Strong duality)
Let p∗ and d∗ be the primal and dual optimal values. If one of the primal
or dual problems is feasible, then p∗ = d∗.
Complementary slackness
Recall: strong duality says that if the primal and dual are feasible, then
p∗ = d∗, i.e., there exist (primal and dual optimal) x and λ such that
c⊤x = b⊤λ.
From our derivations, we have
b⊤λ = ∑i∈ML biλi + ∑i∈ME biλi + ∑i∈MG biλi
    ≤ ∑i∈ML (a⊤i x)λi + ∑i∈ME (a⊤i x)λi + ∑i∈MG (a⊤i x)λi
    = ∑j∈NL (ã⊤j λ)xj + ∑j∈NF (ã⊤j λ)xj + ∑j∈NG (ã⊤j λ)xj
    ≤ ∑j∈NL cjxj + ∑j∈NF cjxj + ∑j∈NG cjxj
    = c⊤x.
For x , λ such that c⊤x = b⊤λ, the two inequalities must hold with
equality. What can we infer from this?
Complementary slackness
The first equality,
∑i∈ML biλi + ∑i∈ME biλi + ∑i∈MG biλi = ∑i∈ML (a⊤i x)λi + ∑i∈ME (a⊤i x)λi + ∑i∈MG (a⊤i x)λi,
implies
0 = ∑i∈ML (a⊤i x − bi)λi + ∑i∈ME (a⊤i x − bi)λi + ∑i∈MG (a⊤i x − bi)λi,
where in the first sum a⊤i x ≤ bi and λi ≤ 0, in the second a⊤i x = bi, and in
the third a⊤i x ≥ bi and λi ≥ 0.
Every term (a⊤i x − bi)λi ≥ 0 just from the primal and dual constraints. Since
we know they all sum to 0, each term must be 0.
(α ≥ 0, β ≥ 0, α + β = 0 =⇒ α = β = 0.)
Complementary slackness
We deduce the following useful result.
Theorem (Complementary slackness)
Let x be primal optimal and λ be dual optimal. Then:
▶ for each primal constraint, λi(a⊤i x − bi) = 0, i.e.,
  ▶ if a⊤i x ≠ bi, then λi = 0.
  ▶ if λi ≠ 0, then a⊤i x = bi.
▶ for each dual constraint, (ã⊤j λ − cj)xj = 0, i.e.,
  ▶ if ã⊤j λ ≠ cj, then xj = 0.
  ▶ if xj ≠ 0, then ã⊤j λ = cj.
In other words:
▶ at optimality, primal constraint i and dual variable i cannot both be
slack (a⊤i x ≠ bi and λi ≠ 0 cannot happen simultaneously).
▶ at optimality, primal variable j and dual constraint j cannot both be
slack (xj ≠ 0 and ã⊤j λ ≠ cj cannot happen simultaneously).
Using complementary slackness
Consider a (maximizing) primal LP:
max_x  x1 − x2
s.t.  −2x1 + x2 ≤ 2
      x1 − 2x2 ≤ 2
      x1 + x2 ≤ 5
      x ≥ 0
and its (minimizing) dual:
min_λ  2λ1 + 2λ2 + 5λ3
s.t.  −2λ1 + λ2 + λ3 ≥ 1
      λ1 − 2λ2 + λ3 ≥ −1
      λ ≥ 0.
Is x1 = 1, x2 = 4 an optimal primal solution?
Using complementary slackness
If x1 = 1, x2 = 4 is optimal, then there should be a corresponding dual
solution satisfying complementary slackness.
1. Check each primal constraint:
2 ≥ −2x1 + x2 = 2
2 > x1 − 2x2 = −7
5 ≥ x1 + x2 = 5.
Therefore (x1 = 1, x2 = 4) is feasible. Also, if there is a dual
solution, it must satisfy λ2 = 0.
2. We have λ2 = 0. Also, x1 = 1 > 0, x2 = 4 > 0 so both constraints
of the dual hold with equality. We solve for the dual solution:
−2λ1 + λ3 = 1
λ1 + λ3 = −1.
This solves as λ1 = −2/3 and λ3 = −1/3, which is not feasible (the dual requires λ ≥ 0).
Therefore there is no corresponding dual solution that satisfies
complementary slackness, hence (x1 = 1, x2 = 4) is not optimal.
Using complementary slackness
max_x  x1 − x2
s.t.  −2x1 + x2 ≤ 2
      x1 − 2x2 ≤ 2
      x1 + x2 ≤ 5
      x ≥ 0
Exercise: use complementary slackness to check whether x1 = 4, x2 = 1
is optimal or not.
6. Network flow models
Network models
Many real-world problems are intimately related to networks.
▶ Transportation and logistics networks.
▶ Electricity networks.
▶ Social networks.
▶ Financial networks.
Some LP models can be represented graphically via a network.
Conversely, problems on networks can be modelled as LPs.
What is a network?
Definition (Network)
A network is a directed graph (N ,A) where N is a set of nodes, and
A ⊂ N ×N are arcs, consisting of ordered pairs of nodes.
▶ Nodes represent entities (location, people, computers).
▶ Arcs represent directed connections between entities (roads
between two locations, whether a person is following another on
social media, connections between two computers).
What is a network?
N = {1, 2, 3, 4, 5, 6}
A = {(1, 3), (2, 1), (2, 4), (3, 2), (3, 5), (4, 3), (4, 6), (5, 6), (6, 4)}
Arc (4, 6) is different from arc (6, 4); in the diagram they are drawn with a
single bidirectional arrow, which represents two arcs.
Logistics example
Suppose we wish to transport goods from warehouses to customer
locations. Some customer locations are hubs, meaning that they can
transship goods that they receive.
▶ Warehouses: Phoenix, Austin, Gainesville.
▶ Hubs: Dallas, Atlanta
▶ Customer locations: Chicago, Los Angeles, New York.
Note that warehouses and hubs do not necessarily ship to all cities (due
to availability of transport routes). The links can be represented via a
network.
Logistics example
[Figure: network visualization of the warehouses, hubs and customer locations.]
Logistics example
Costs for transporting one unit from “row city” to “column city”:
cij Dallas Atlanta Chicago Los Angeles New York
Phoenix 3 7 6 3
Austin 2 5 7
Gainesville 6 4 2
Dallas 2 4 5 6
Atlanta 2 4 5
Supplies (total = 1100):
Phoenix Austin Gainesville
700 200 200
Demands (total = 1100):
Dallas Atlanta Chicago Los Angeles New York
300 150 200 200 250
Logistics example
Build the model: let N be cities, A be arcs in the network.
▶ Decisions: xij = number of units to send from i to j , for each
(i , j) ∈ A.
▶ Constraints: xij ≥ 0, and we need to meet supply/demand. For each
city i ∈ N, ∑k:(k,i)∈A xki is the total amount shipped in and
∑j:(i,j)∈A xij is the total amount shipped out. Then we need
∑j:(i,j)∈A xij − ∑k:(k,i)∈A xki = bi  ∀i ∈ N,
where bi is positive supply or negative demand.
▶ Objective: minimize the total cost of all shipments,
∑(i,j)∈A cij xij,
where the cij are the costs given in the table. (A gurobipy sketch of
this model follows below.)
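A minimal gurobipy sketch of this model as a reusable function; cost maps each arc (i, j) ∈ A to cij and b maps each city to its supply (positive) or demand (negative). The function name and data layout are illustrative:

import gurobipy as gp
from gurobipy import GRB

def min_cost_flow(cost, b):
    # cost: {(i, j): c_ij} for each arc; b: {i: +supply or -demand}.
    m = gp.Model("flow")
    x = m.addVars(cost.keys(), lb=0.0, name="x")
    for i in b:
        out_i = gp.quicksum(x[a] for a in cost if a[0] == i)  # shipped out
        in_i = gp.quicksum(x[a] for a in cost if a[1] == i)   # shipped in
        m.addConstr(out_i - in_i == b[i], name="balance_%s" % i)
    m.setObjective(x.prod(cost), GRB.MINIMIZE)  # total shipping cost
    m.optimize()
    return {a: x[a].X for a in cost}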
Logistics example
Solution with objective value $4100:
xij Dallas Atlanta Chicago Los Angeles New York
Phoenix 300 0 200 200
Austin 200 0 0
Gainesville 0 0 200
Dallas 150 0 0 50
Atlanta 0 0 0
Logistics example
[Figure: network visualization of the optimal shipment plan.]
Observations
▶ We saw in the logistics example that the solution x only had integer
entries.
▶ This is not a coincidence, it is one of the key properties of network
flow models: if supply/demand are integers then solution will be
integer.
▶ If we can formulate our problem as a network flow model, that is
good news: we don't need to worry about integer constraints as long
as our data are integer.
▶ Other advantages: special network flow models admit faster
algorithms than plain simplex. (We will not cover these.)
Minimum cost flow model
The most general network flow model is the minimum cost flow model.
Think of “flow” as some quantity (water, electricity, money, product)
that needs to be routed around the network. Given a network (N ,A):
▶ Decisions: variables xij for each (i , j) ∈ A tell us how much flow
travels across arc (i , j).
▶ Flow bound constraints: ℓ ≤ x ≤ u, i.e., ℓij ≤ xij ≤ uij for each
(i , j) ∈ A. Note that uij could be +∞, which means xij is
unbounded. Typically, ℓij = 0.
▶ Objective: directing flow from i to j costs cij; we
minimize ∑(i,j)∈A cij xij.
Minimum cost flow model
Given a network (N, A), let x be a vector of flow variables on the arcs.
Define the shorthand, for node i ∈ N and flow x,
outflow(i; x) := ∑j∈N:(i,j)∈A xij
inflow(i; x) := ∑k∈N:(k,i)∈A xki.
▶ Flow balance constraints: each node i ∈ N is either a supply or
demand node, given by a parameter bi . If bi > 0, i is a supply
node; if bi < 0, i is a demand node; if bi = 0 it is a
transshipment node.
bi = outflow(i; x) − inflow(i; x) = ∑j:(i,j)∈A xij − ∑k:(k,i)∈A xki.
Minimum cost flow model
The model is
min_x  ∑(i,j)∈A cij xij
s.t.  outflow(i; x) − inflow(i; x) = bi,  i ∈ N
      ℓ ≤ x ≤ u.
Minimum cost flow in matrix notation
The key difference between network flow models and a general LP is the
flow balance constraints:
outflow(i; x) − inflow(i; x) = bi.
Given a network (N ,A), the node-arc incidence matrix A has a row for
every node i ∈ N and a column for every arc (j , k) ∈ A.
▶ the entry in row i and column (i, j) is +1 for every arc (i, j) ∈ A leaving i.
▶ the entry in row i and column (j, i) is −1 for every arc (j, i) ∈ A entering i.
▶ every other entry is 0.
Minimum cost flow in matrix notation
N = {1, 2, 3, 4, 5, 6}
A = {(1, 3), (2, 1), (2, 4), (3, 2), (3, 5), (4, 3), (4, 6), (5, 6), (6, 4)}
A (1, 3) (2, 1) (2, 4) (3, 2) (3, 5) (4, 3) (4, 6) (5, 6) (6, 4)
1 1 −1
2 1 1 −1
3 −1 1 1 −1
4 −1 1 1 −1
5 −1 1
6 −1 −1 1
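The incidence matrix is easy to build in code; a short numpy sketch for this network:

import numpy as np

nodes = [1, 2, 3, 4, 5, 6]
arcs = [(1, 3), (2, 1), (2, 4), (3, 2), (3, 5),
        (4, 3), (4, 6), (5, 6), (6, 4)]

# +1 where the arc leaves the node, -1 where it enters, 0 elsewhere.
A = np.zeros((len(nodes), len(arcs)), dtype=int)
for col, (i, j) in enumerate(arcs):
    A[nodes.index(i), col] = 1
    A[nodes.index(j), col] = -1
print(A)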
Minimum cost flow in matrix notation
Let b = (bi)i∈N be the vector with entries bi, arranged in the same order
as the rows of A. Then the flow balance constraints can be written as
Ax = b.
The matrix A has a lot of structure, which is why network flow models
possess such nice properties.
▶ Entries are −1, 0,+1 only.
▶ Each column only has two nonzero entries, +1 and −1.
Assume ∑i∈N bi = 0. If not, Ax = b is infeasible: every column of A sums
to 0, so 1⊤Ax = 0 for all x, while 1⊤b = ∑i∈N bi ≠ 0.
Theorem (Integer solutions)
Let A be the node-arc incidence matrix of a network. Consider any LP
with constraints Ax = b, ℓ ≤ x ≤ u. Suppose that b, ℓ and u only have
integer entries. Then there exists an optimal solution x∗ to the LP that
only has integer entries.
Nam Ho-Nguyen QBUS2310 Management Science
127/214
Comments
Why don’t we just use LP?
▶ We could, but knowing our problem is representable as a network
flow model is very useful.
▶ We guarantee that integer data =⇒ integer solutions.
▶ Important for problems with discrete quantities (e.g., number of
goods).
▶ Faster algorithms exist for special network flow models, important
for large-scale problems (beyond the scope of this course).
Nam Ho-Nguyen QBUS2310 Management Science
128/214
Special network flow models
▶ Transportation problem.
▶ Assignment problem.
▶ Shortest path problem.
▶ Maximum flow problem.
Nam Ho-Nguyen QBUS2310 Management Science
129/214
Transportation problem
▶ Supply and demand nodes only N = S ∪ D (no transshipment),
bi > 0 for i ∈ S, bi < 0 for i ∈ D.
▶ Arcs only go from supply to demand nodes. Such a network is called
bipartite.
▶ Pairs of supply-demand nodes (i , j) have different costs cij .
▶ No arc capacities: uij = +∞ for every arc (i , j) ∈ A.
Nam Ho-Nguyen QBUS2310 Management Science
130/214
Example: transportation problem
Two bread factories, F1 and F2 make the daily bread in a city. The bread
is delivered to the three bakeries of the city: B1, B2 and B3. The supplies,
demands and unit costs of transportation are given in the tables below.
                   B1     B2     B3
demands           1500   2000   1000
F1 (supply 2000)     8      6     10
F2 (supply 2500)    10      4      9

(table entries are the unit costs of transportation)
▶ Model this as a transportation problem.
▶ Suppose the supply for F1 was 3000 instead. Model this as a
transportation problem.
▶ Suppose the demand for B3 was 1500 instead. Model this as a
transportation problem.
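For the first bullet (supply 2000 + 2500 exactly matches demand 1500 + 2000 + 1000), a gurobipy sketch might look like the following; for the unbalanced variants in the other two bullets, the classical trick is to add a dummy factory or dummy bakery with zero-cost arcs so that supply and demand balance again.

import gurobipy as gp

supply = {"F1": 2000, "F2": 2500}
demand = {"B1": 1500, "B2": 2000, "B3": 1000}
cost = {("F1", "B1"): 8, ("F1", "B2"): 6, ("F1", "B3"): 10,
        ("F2", "B1"): 10, ("F2", "B2"): 4, ("F2", "B3"): 9}

m = gp.Model("bread")
x = m.addVars(cost.keys(), obj=cost, name="x")        # x >= 0 and minimize by default
m.addConstrs((x.sum(i, "*") == supply[i] for i in supply), name="supply")
m.addConstrs((x.sum("*", j) == demand[j] for j in demand), name="demand")
m.optimize()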
Nam Ho-Nguyen QBUS2310 Management Science
131/214
Assignment problem
▶ Exactly the same as a transportation problem, except that bi = 1 for
all i ∈ S and bi = −1 for all i ∈ D.
▶ This means that each supply node is matched to exactly one
demand node. Implicitly, we need |S| = |D|.
▶ The problem is also known as (weighted) bipartite matching.
▶ Typical application: supply nodes are people, and demand nodes are
projects. Arc costs are a measure of “mismatch” for assigning a
particular person to a particular project.
Nam Ho-Nguyen QBUS2310 Management Science
132/214
Shortest path problem
▶ One supply node s with bs = 1, one demand node t with bt = −1,
the rest bi = 0.
▶ Costs are all positive cij > 0; think of these as lengths. Upper
bounds are uij = 1.
▶ Problem will find the shortest length path from s to t.
▶ Typical applications: Google Maps, maintenance scheduling.
Nam Ho-Nguyen QBUS2310 Management Science
133/214
Example: shortest path problem
You are going on vacation from 1st December to 31st January and wish
to sublet your apartment for short-term rentals. Potential renters can
submit bids consisting of:
▶ an arrival date
▶ a departure date
▶ a price they are willing to pay for the apartment during that period.
You can choose which bids to accept and reject. However, you must
ensure that no two renters are in the apartment at the same time, except
possibly if one departs and the other arrives on the same day.
Formulate a shortest path model that selects bids in order to maximize
your profits.
Nam Ho-Nguyen QBUS2310 Management Science
134/214
Example: shortest path problem
▶ Nodes: one for each day d between 1st Dec and 31st Jan.
▶ Arcs: each bid is of form (d1, d2, p), where d1 is arrival date, d2 is
departure date and p is the price. For each bid:
▶ add an arc d1 → d2 with cost −p. Why is the cost negative?
▶ Then, for each day d up to 30th Jan, add an arc d → d + 1 with
cost 0.
▶ Why do we do this for all days instead of just days without incoming
arcs?
▶ Shortest path: why does a path from 1st Dec to 31st Jan give a
feasible solution to the problem?
Nam Ho-Nguyen QBUS2310 Management Science
135/214
Maximum flow problem
▶ One source node s, one sink node t connected by arc (t, s). Set
bi = 0 for all nodes (including s and t).
▶ Costs cij = 0, upper bounds positive uij > 0, except cts = −1,
uts = +∞
▶ Problem finds the maximum amount of flow that can be
generated from s and sent to t (must respect arc capacities).
▶ We wish to minimize ctsxts = −xts , i.e., maximize xts . But since
bt = 0, the flow sent from t to s along (t, s) must come from the
rest of the network, hence s.
▶ Typical applications: elimination of teams in sporting tournaments,
scheduling, image segmentation.
Nam Ho-Nguyen QBUS2310 Management Science
136/214
Example: maximum flow problem
Suppose as much as 500MB of data per second can be sent between any
two of the computers 1, 2, 3, and 4. Set up a maximum flow model that
can be used to determine the maximum amount of data that can be sent
in two seconds from computer 1 to computer 4.
▶ What are the nodes? What are the source and sink nodes?
▶ We have a node (n, τ) for each computer n = 1, 2, 3, 4 and time step
τ = 0, 1, 2. This is because flow from (n, τ)→ (n′, τ ′) represents a
transfer of data from n to n′, starting at time τ and finishing at time
τ ′.
▶ Are all nodes necessary?
▶ Source node (1, 0), sink node (4, 2).
▶ What are the arcs and capacities?
▶ Add arc for each (n, τ)→ (n′, τ + 1). Are all arcs necessary?
▶ Capacities are 500 for each arc.
Nam Ho-Nguyen QBUS2310 Management Science
137/214
Example: maximum flow model
[Figure: the time-expanded network, with source node (1, 0), the time-1 nodes (1, 1), (2, 1), (3, 1), (4, 1), and sink node (4, 2).]
Nam Ho-Nguyen QBUS2310 Management Science
138/214
Example: maximum flow model
▶ What does the vertical arc (1, 0)→ (1, 1) represent?
▶ How would you change the model if each computer could send a
maximum of 1000MB each second?
▶ How would you change the model if instead there were 6 computers,
each can send a maximum of 1000MB per second, and we are
interested in how much computers 1 and 2 combined can send to
computers 5 and 6 over 10 seconds?
Nam Ho-Nguyen QBUS2310 Management Science
139/214
Modelling technique: outflow capacities
Given a network (N ,A), suppose that we have a network flow model,
except certain nodes i have outflow capacities outflow(i ; x) ≤ κi . Let
K ⊂ N be the set of nodes with capacities. The problem is
min_x c⊤x
s.t. Ax = b
0 ≤ x ≤ u
outflow(i ; x) ≤ κi , i ∈ K.
How do we model this as a network flow problem?
Nam Ho-Nguyen QBUS2310 Management Science
140/214
Modelling technique: outflow capacities
Consider one such node i with capacity κi . Suppose for simplicity it has
arcs (j1, i), (j2, i), (i , j3), (i , j4). We need the constraint xi,j3 + xi,j4 ≤ κi .
Nam Ho-Nguyen QBUS2310 Management Science
141/214
Modelling technique: outflow capacities
Key idea:
▶ split node i into two nodes, iin and iout
▶ add edge (iin, iout) with capacity uiin,iout = κi
▶ Make the supplies biin = bi , biout = 0
▶ replace incoming arcs to i with arcs to iin, i.e., arcs (j1, iin), (j2, iin)
▶ replace outgoing arcs from i with arcs from iout, i.e., arcs
(iout, j3), (iout, j4).
Then implicitly we know xiout,j3 + xiout,j4 = xiin,iout ≤ uiin,iout = κi .
Do this for every capacitated node.
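A sketch of the node-splitting transformation as code; the data structures (dicts for supplies and capacities) are our own choice for illustration, not from the slides.

def split_capacitated_nodes(arcs, b, u, kappa):
    """arcs: list of (i, j); b: node -> supply; u: arc -> capacity;
    kappa: node -> outflow capacity. Returns the transformed network."""
    def tail(i):
        return (i, "out") if i in kappa else i   # outgoing arcs now leave i_out
    def head(j):
        return (j, "in") if j in kappa else j    # incoming arcs now enter j_in
    new_b = {}
    for i, s in b.items():
        if i in kappa:
            new_b[(i, "in")], new_b[(i, "out")] = s, 0  # supply sits on i_in
        else:
            new_b[i] = s
    new_u = {(tail(i), head(j)): u[(i, j)] for (i, j) in arcs}
    for i, cap in kappa.items():
        new_u[((i, "in"), (i, "out"))] = cap            # splitting arc carries all outflow
    return list(new_u.keys()), new_b, new_u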
Nam Ho-Nguyen QBUS2310 Management Science
142/214
Modelling technique: inflow capacities
How would you model inflow capacities on nodes
inflow(i ; x) ≤ κi?
Same as before, except set biin = 0, biout = bi . Why does this change
make a difference?
Nam Ho-Nguyen QBUS2310 Management Science
143/214
7. Integer programming models
Nam Ho-Nguyen QBUS2310 Management Science
144/214
Integer programming models
An integer linear program (or simply integer program (IP)) is an LP
with additional constraints that some variables must be integer:
min_x c⊤x
s.t. Ax ≤ b
xj ∈ Z, j ∈ N
where N ⊆ [n] is an index set for variables required to be integer.
▶ When all variables are required to be integer (N = [n]), then we say
it is a pure integer program (pure IP).
▶ When some variables are integer and others are continuous, we
say it is a mixed integer program (MIP).
▶ Note that xj ∈ Z and 0 ≤ xj ≤ 1 means that xj ∈ {0, 1}. When all
integer variables are in {0, 1}, we say this is a (pure or mixed) 0-1
program.
▶ The model without the integrality constraints is called the LP
relaxation of the IP.
Nam Ho-Nguyen QBUS2310 Management Science
145/214
Example 1: importance of integer programming
Question: can we just solve the LP relaxation and round?
Example: a nutritionist advises an individual suffering from iron and
vitamin B deficiencies to increase their monthly intake to
▶ 2400mg of iron
▶ 2100mg of vitamin B1
▶ 1500mg of vitamin B2.
There are two available pills to supplement the individual’s diets:
cost iron vitamin B1 vitamin B2
Pill 1 67c 30mg 10mg 5mg
Pill 2 100c ($1) 10mg 15mg 15mg
How many pills of each should the individual buy to minimize the
individual’s costs while fulfilling the dietary needs?
Nam Ho-Nguyen QBUS2310 Management Science
146/214
Example 1: importance of integer programming
IP formulation:
min_{x1,x2} 67x1 + 100x2
s.t. 30x1 + 10x2 ≥ 2400
10x1 + 15x2 ≥ 2100
5x1 + 15x2 ≥ 1500
x1 ≥ 0, x2 ≥ 0
x1, x2 ∈ Z.
Note that we have x1, x2 ∈ Z since pills are too small to divide. Solution:
          LP relaxation   integer solution
Pill 1        42.857             45
Pill 2       111.429            110
cost         $140.14          $140.15
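A quick gurobipy check of this example (a sketch; integer variables are declared via vtype=GRB.INTEGER):

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("pills")
x1 = m.addVar(vtype=GRB.INTEGER, name="pill1")
x2 = m.addVar(vtype=GRB.INTEGER, name="pill2")
m.setObjective(67 * x1 + 100 * x2, GRB.MINIMIZE)      # cost in cents
m.addConstr(30 * x1 + 10 * x2 >= 2400, "iron")
m.addConstr(10 * x1 + 15 * x2 >= 2100, "vitaminB1")
m.addConstr(5 * x1 + 15 * x2 >= 1500, "vitaminB2")
m.optimize()
print(x1.X, x2.X, m.ObjVal)          # expect 45, 110 and 14015 cents
# m.relax() returns the LP relaxation, which can be optimized separately.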
Nam Ho-Nguyen QBUS2310 Management Science
147/214
Example 1: importance of integer programming
What about the four points around the solution of the LP relaxation?
       point 1              point 2          point 3        point 4
x1       42                   42               43             43
x2      111                  112              111            112
      not enough iron, B1  not enough iron  not enough B1  cost $140.81
▶ Rounding the LP relaxation solution doesn’t give an optimal
solution.
▶ In fact, it can happen that rounding may not even give a feasible
solution!
▶ Even if it does, finding the right way to round is itself another hard
problem, in fact a 0-1 program (with n integer variables, there are 2n
ways to round).
▶ What about “fast heuristics” for rounding? Sometimes works for
special problems, but not in general.
▶ Rounding a 0-1 problem naïvely may produce very suboptimal
solutions.
Nam Ho-Nguyen QBUS2310 Management Science
148/214
Hardness of solving IP models
▶ Integer programming models take more time to solve in practice
than LP models.
▶ However, the techniques have developed sufficiently that IP is
practically useful for a wide range of problems (we can solve IPs
with tens of thousands of variables/constraints).
▶ Two main techniques are used to solve general IP models (mostly
used in combination):
▶ Branch-and-bound (divide and conquer). E.g., for the nutrition
problem, solve two separate LPs with constraints x1 ≤ 42, x1 ≥ 43
respectively. If these still don’t give integer solutions, divide further
as necessary, while using past information to eliminate certain LPs
from contention (conquering).
▶ Actually, this was developed by an Australian woman in 1960, Alison
Harcourt née Doig, together with Ailsa Land from the UK.
▶ Cutting plane methods. Generate constraints in a clever way to
force integrality (also requires repeatedly solving LPs).
These techniques are outside the scope of this course (but very
interesting/important).
Nam Ho-Nguyen QBUS2310 Management Science
149/214
When should we use IP?
A natural setting to use IP is when we need whole number quantities.
▶ Note: even when we need whole numbers, we might be able to get
away with LP only! E.g., what is the difference between producing
3572 chairs versus 3572.3 chairs? A very small profit, relatively
speaking. This is a situational judgement call though, try both!
A very important setting where LP comes up short is when we have
categorical decisions.
▶ E.g., choice amongst a finite number of alternatives (yes/no, set of
locations, set of actions).
▶ Note that the alternatives need not even be quantitative.
▶ Non-quantitative decision: which brand should I stock?
▶ Quantitative decision: how much of each brand should I stock?
▶ We still need to be able to quantify the effects of the choices.
▶ If I stock this brand, then I will make on average $2000 more profit.
▶ If I stock this other brand, I have to use two more shelves of storage
space.
Nam Ho-Nguyen QBUS2310 Management Science
150/214
Modelling technique: one-hot encoding for categorical
selection
Suppose we have to select between n different alternatives.
▶ Define n variables x1, . . . , xn for each alternative.
▶ Require xj ∈ {0, 1}.
▶ Add the constraint ∑j∈[n] xj = 1.
Then
xj =
{
1, we choose alternative j
0, we don’t choose alternative j ,
and we must have exactly one xj = 1.
Variations:
▶ ∑j∈[n] xj = k means we select exactly k of the alternatives.
▶ ∑j∈[n] xj ≤ k means we select at most k of the alternatives.
▶ ∑j∈[n] xj ≥ k means we select at least k of the alternatives.
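In gurobipy this pattern is one line per ingredient (a sketch; it assumes a model m and an alternative count n already exist):

from gurobipy import GRB

x = m.addVars(n, vtype=GRB.BINARY, name="x")   # one binary per alternative
m.addConstr(x.sum() == 1, name="choose_one")   # or == k, <= k, >= k for the variants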
Nam Ho-Nguyen QBUS2310 Management Science
151/214
Example 2: facility location
Suppose that we have n different suburbs. We want to select some of
these to be locations for fire stations. The data we have are:
▶ the expected number of yearly fire calls for each suburb ej .
▶ the travel costs between pairs of suburbs cij .
▶ the recurring yearly costs of maintaining a fire station in a suburb fj .
How can we decide which suburbs to place fire stations in to minimize
the yearly cost?
Categorical decisions: xj for each j ∈ [n] binary variables,
xj =
{
1, a fire station is located in suburb j
0, no fire station in suburb j .
Constraints: no constraints on the number of fire stations (for this
particular problem).
Nam Ho-Nguyen QBUS2310 Management Science
152/214
Example 2: facility location
Objective: if there is a fire station at i which serves suburb j , and a fire
at j , then it costs cij to travel there and put it out. Yearly, the cost is
ejcij . We wish to minimize the total travel cost plus the maintenance
costs of the fire stations:
minimize ∑j∈[n] ( ej ∑i∈[n] cij · 1(station at i serves j) + fjxj ) .
▶ For each suburb, we select a fire station to serve it, thus introduce
binary variables
yij =
{
1, fire station at suburb i serves suburb j
0, otherwise.
Nam Ho-Nguyen QBUS2310 Management Science
153/214
Example 2: facility location
Constraints to model the objective: One fire station is selected for
each suburb:

∑i∈[n] yij = 1, j ∈ [n].

Any selected i must actually have a fire station (yij = 1 =⇒ xi = 1):

yij ≤ xi , i , j ∈ [n].

Objective: the cost of serving suburb j is ∑i∈[n] cijyij .
The total travel cost, weighted by the expected number of calls, is

∑j∈[n] ∑i∈[n] ejcijyij .
Nam Ho-Nguyen QBUS2310 Management Science
154/214
Example 2: facility location
The complete model is
min_{x,y} ∑j∈[n] ( fjxj + ∑i∈[n] ejcijyij )
s.t. x ∈ {0, 1}n, y ∈ {0, 1}n×n
     ∑i∈[n] yij = 1, j ∈ [n]
     yij ≤ xi , i , j ∈ [n].
In fact, we don’t need to impose binary constraints on y , having
0 ≤ y ≤ 1 is enough. Why?
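A gurobipy sketch of this model on made-up data; here y is declared continuous in [0, 1], relying on the observation above.

import gurobipy as gp
from gurobipy import GRB

n = 4
e = [10, 20, 15, 5]                                    # made-up yearly call counts
f = [100.0, 120.0, 90.0, 110.0]                        # made-up station costs
c = [[0, 3, 5, 7], [3, 0, 4, 6], [5, 4, 0, 2], [7, 6, 2, 0]]  # made-up travel costs

m = gp.Model("fire_stations")
x = m.addVars(n, vtype=GRB.BINARY, name="x")
y = m.addVars(n, n, lb=0.0, ub=1.0, name="y")          # 0 <= y <= 1 suffices here
m.setObjective(
    gp.quicksum(f[j] * x[j] for j in range(n))
    + gp.quicksum(e[j] * c[i][j] * y[i, j] for i in range(n) for j in range(n)),
    GRB.MINIMIZE)
m.addConstrs((y.sum("*", j) == 1 for j in range(n)), name="serve")   # one server per suburb
m.addConstrs((y[i, j] <= x[i] for i in range(n) for j in range(n)), name="open")
m.optimize()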
Nam Ho-Nguyen QBUS2310 Management Science
155/214
Modelling with binary variables
Much of the applicability of integer programming comes from using
binary variables to model logical relationships between different sets of
variables/constraints. There are some standard techniques for doing this,
which we go through:
▶ products of binary variables.
▶ no-good constraints.
▶ implication relationships.
▶ big-M technique.
▶ disjunctive constraints.
Nam Ho-Nguyen QBUS2310 Management Science
156/214
Products of binary variables
Let x = (x1, . . . , xn) be a vector of binary variables, and let y be a
binary variable.
▶ We wish to capture the constraint y = x1 × x2 × · · · × xn.
▶ In other words, y = 1 if and only if xj = 1 for all j ∈ [n].
Equivalently, y = 0 if and only if there exists at least one xj = 0.
▶ Constraints y ≤ xj for all j ∈ [n] force y = 0 when any xj = 0. But
we still can have y = 0 while xj = 1 for all j ∈ [n].
▶ How can we force y = 1 when all xj = 1? This constraint does it:
y ≥∑j∈[n] xj − (n − 1). If each xj = 1, the sum is n, so the RHS is
1. If not, the sum is ≤ n − 1 so the RHS is ≤ 0, which means the
constraint is redundant.
In summary, for binary x1, . . . , xn, y ,
y = x1 × · · · × xn ⇐⇒ y ≤ xj ∀j ∈ [n] and y ≥ ∑j∈[n] xj − (n − 1).
Nam Ho-Nguyen QBUS2310 Management Science
157/214
No-good constraints
Let a = (a1, . . . , an) be a fixed binary vector, and x = (x1, . . . , xn) be a
vector of binary variables.
▶ We want to prevent x from being equal to a (in other words, a is
“no good”).
▶ This constraint does it: ∑j∈[n] (ajxj + (1− aj)(1− xj)) ≤ n − 1.
▶ Reason: Notice that ajxj + (1− aj)(1− xj) = 1 ⇐⇒ aj = xj . If
they are all equal then the sum is n. The constraint ensures that
they won’t be equal for at least one j .
Exercise: assuming that at least one aj = 1, formulate the constraint
which prevents x ≥ a (i.e., we want to ensure there is at least one j such
that xj = 0, aj = 1).
Nam Ho-Nguyen QBUS2310 Management Science
158/214
Implication relationships
Suppose we have two binary variables x , y ∈ {0, 1}. How can we model
the implication relationship
x = 1 =⇒ y = 1?
▶ Another name for this is a forcing constraint, since whenever x = 1
we need to force y = 1.
▶ In the facility location example, we saw that if yij = 1 (a station at i
is selected by j) then we need to force xi = 1 (a station is located at
i).
We capture this with the linear constraint
x ≤ y .
Exercise: model the following implications
x = 0 =⇒ y = 1
x = 1 =⇒ y = 0
x = 0 =⇒ y = 0.
Nam Ho-Nguyen QBUS2310 Management Science
159/214
Big-M technique for switching constraints on/off
Suppose we have a linear constraint a⊤x ≤ b.
▶ We have a model where a⊤x ≤ b is imposed only under certain
conditions. How can we model “switching” the constraint on and
off?
▶ We can add a binary variable y ∈ {0, 1} and attempt to model the
logical implications
y = 1 =⇒ a⊤x ≤ b
y = 0 =⇒ no constraint imposed.
▶ We also need to relate y to the conditions which switch the
constraint on/off. For example, we may have to pay a cost c for
removing the constraint, so we add cy to the objective. But in
general, capturing the conditions is a separate issue.
Nam Ho-Nguyen QBUS2310 Management Science
160/214
Big-M technique for switching constraints on/off
▶ How can we linearize the two implications
y = 1 =⇒ a⊤x ≤ b
y = 0 =⇒ no constraint imposed?
▶ Suppose we (somehow) knew that a⊤x will never be larger than
b +M (M is a big number). Then we add constraint
a⊤x ≤ b +M(1− y).
The idea is that when y = 0, the constraint we impose is
a⊤x ≤ b +M, which is redundant.
▶ In most models, a sufficiently large M can be found. However, solver
performance is better when M is as small as possible.
Nam Ho-Nguyen QBUS2310 Management Science
161/214
Example 3: transportation with ramp-up costs
Suppose we have a set of production plants S, each of which can produce up
to si units of a certain good. We have a set of retailers D, which demand dj
units of the good. Routes between plants and retailers are given by
a bipartite network (S ∪ D,A). Our problem is to transport the goods
from plants to retailers in order to meet demand.
▶ Transporting one unit from i to j costs cij .
▶ Deciding to produce goods at plant i incurs some ramp-up costs ri
(warming up machines, starting power, etc). On the other hand if no
goods are produced at plant i , then no ramp-up costs are incurred.
Nam Ho-Nguyen QBUS2310 Management Science
162/214
Example 3: transportation with ramp-up costs
▶ Decisions:
xij := amount of goods transported from plant i to retailer j
yi :=
{
1, if plant i produces goods
0, otherwise.
Here xij ≥ 0, yi ∈ {0, 1}.
▶ Objective: minimize the ramp-up plus transportation costs

∑i∈S riyi + ∑(i,j)∈A cijxij .
▶ Constraints: the usual supply and demand constraints (remember
that the network is bipartite):

∑j:(i,j)∈A xij ≤ si , i ∈ S
∑i:(i,j)∈A xij ≥ dj , j ∈ D.
Nam Ho-Nguyen QBUS2310 Management Science
163/214
Example 3: transportation with ramp-up costs
▶ Constraints: we need to consider how yi and xij interact, which is
through the implications
yi = 0 =⇒ xij = 0
yi = 1 =⇒ no constraint imposed.
We can do this with xij ≤ Myi , but what should M be? Recall that
∑j:(i,j)∈A xij ≤ si and xij ≥ 0, so we know xij ≤ si . Therefore we
can use M = si :

xij ≤ siyi , ∀j s.t. (i , j) ∈ A.

A way to impose all constraints for i simultaneously is

∑j:(i,j)∈A xij ≤ siyi .
Why does this work?
Nam Ho-Nguyen QBUS2310 Management Science
164/214
Example 3: transportation with ramp-up costs
The full model:
min_{x,y} ∑i∈S riyi + ∑(i,j)∈A cijxij
s.t. x ≥ 0, y ∈ {0, 1}S
     ∑i:(i,j)∈A xij ≥ dj , j ∈ D
     ∑j:(i,j)∈A xij ≤ siyi , i ∈ S.
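A sketch of this model in gurobipy on made-up data; note how the single constraint ∑j xij ≤ siyi plays both the supply and the big-M switching role.

import gurobipy as gp
from gurobipy import GRB

s = {"P1": 100, "P2": 80}                              # made-up supplies
d = {"R1": 60, "R2": 70}                               # made-up demands
r = {"P1": 500, "P2": 300}                             # made-up ramp-up costs
c = {("P1", "R1"): 4, ("P1", "R2"): 6,
     ("P2", "R1"): 5, ("P2", "R2"): 3}                 # made-up transport costs

m = gp.Model("rampup_transport")
x = m.addVars(c.keys(), name="x")
y = m.addVars(s.keys(), vtype=GRB.BINARY, name="y")
m.setObjective(gp.quicksum(r[i] * y[i] for i in s)
               + gp.quicksum(c[a] * x[a] for a in c), GRB.MINIMIZE)
m.addConstrs((x.sum("*", j) >= d[j] for j in d), name="demand")
m.addConstrs((x.sum(i, "*") <= s[i] * y[i] for i in s), name="supply_switch")
m.optimize()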
Nam Ho-Nguyen QBUS2310 Management Science
165/214
Disjunctive constraints
We wish to switch between two different constraints:
a⊤x ≤ b, c⊤x ≤ d .
Since we switch between two constraints, we know that at least one
must be satisfied, thus we can write this as
a⊤x ≤ b or c⊤x ≤ d .
The “or” relationship is known as a disjunction. How can we model this
with linear constraints?
Nam Ho-Nguyen QBUS2310 Management Science
166/214
Disjunctive constraints
We wish to model the disjunction
a⊤x ≤ b or c⊤x ≤ d .
Define a binary variable y ∈ {0, 1}, and model the implications
y = 0 =⇒ a⊤x ≤ b
y = 1 =⇒ c⊤x ≤ d .
Suppose we know a⊤x never exceeds b +Ma, and c⊤x never exceeds
d +Mc . Then add the linear constraints
a⊤x ≤ b +May
c⊤x ≤ d +Mc(1− y).
Nam Ho-Nguyen QBUS2310 Management Science
167/214
Disjunctive constraints
Note that the statement “p or q” is logically equivalent to the statement
¬p =⇒ q (where ¬p is the logical negation, i.e., the logical opposite
statement of p). Therefore
a⊤x ≤ b +May
c⊤x ≤ d +Mc(1− y)
is equivalent to
a⊤x > b =⇒ c⊤x ≤ d .
Reason: we have a⊤x − b ≤ May , therefore
a⊤x − b > 0 forces y = 1, which forces c⊤x ≤ d .
Nam Ho-Nguyen QBUS2310 Management Science
168/214
Disjunctive constraints
What if we have groups of constraints:
Ax ≤ b or Cx ≤ d?
Just use the same binary variable y for every constraint in each group:
a⊤i x ≤ bi +MAy , i ∈ [mA]
c⊤i x ≤ di +MC (1− y), i ∈ [mC ]
y ∈ {0, 1}.
This is equivalent to [∃i ∈ [mA] s.t. a⊤i x > bi ] =⇒ Cx ≤ d
or in words: if at least one constraint from the first group does not
hold, then all constraints from the second group must hold.
Nam Ho-Nguyen QBUS2310 Management Science
169/214
Example 4: production with economic feasibility
Dorian Auto is considering manufacturing three types of autos: compact,
midsize and large. The resources required for, and profits yielded by, each
type of car are shown in the table.
car type
resource compact midsize large available
Steel 1.5 tons 3 tons 5 tons 6,000 tons
Labour 30 hours 25 hours 40 hours 60,000 hours
Profit yielded ($) 2,000 3,000 4,000
Due to setup costs, if Dorian chooses to produce a certain type of car, it
is only economically feasible to produce at least 1,000 cars of that type.
Formulate an IP to maximize Dorian’s profit.
Nam Ho-Nguyen QBUS2310 Management Science
170/214
Example 4: production with economic feasibility
Decisions: x1, x2, x3 the number of cars produced of each type. (Does it
make sense to have this as integer variables, or can they be approximated
by continuous variables?)
Resource constraints: we multiply the steel constraint by 2, and divide
the labour constraint by 5 for scaling:
3x1 + 6x2 + 10x3 ≤ 12000
6x1 + 5x2 + 8x3 ≤ 12000
x1, x2, x3 ≥ 0.
Objective: working in 1000s
maximize 2x1 + 3x2 + 4x3.
Nam Ho-Nguyen QBUS2310 Management Science
171/214
Example 4: production with economic feasibility
Economic feasibility constraints: if we produce a nonzero quantity of
type j , then we need to produce at least 1000, i.e., we have the
implication
xj > 0 =⇒ xj ≥ 1000.
This means xj ≤ 0 or xj ≥ 1000. Introduce binary variable yj ∈ {0, 1},
and constraints
xj ≤ Mj,1yj
−xj ≤ −1000 +Mj,2(1− yj).
▶ Choice of big-M. What should Mj,1 be? How large can each xj be?
From 3x1 + 6x2 + 10x3 ≤ 12000, we know that x1 ≤ 4000,
x2 ≤ 2000, x3 ≤ 1200. From 6x1 +5x2 +8x3 ≤ 12000, we know that
x1 ≤ 2000, x2 ≤ 2400, x3 ≤ 1500. Therefore M1,1 = 2000,
M2,1 = 2000, M3,1 = 1200.
What should Mj,2 be? How large can −xj be? We know xj ≥ 0, so
−xj ≤ 0. Therefore set Mj,2 = 1000, hence the constraint is
xj ≥ 1000yj .
Nam Ho-Nguyen QBUS2310 Management Science
172/214
Example 4: production with economic feasibility
Full model:
max_{x,y} 2x1 + 3x2 + 4x3
s.t. x ≥ 0, y ∈ {0, 1}3
     3x1 + 6x2 + 10x3 ≤ 12000
     6x1 + 5x2 + 8x3 ≤ 12000
     1000y1 ≤ x1 ≤ 2000y1
     1000y2 ≤ x2 ≤ 2000y2
     1000y3 ≤ x3 ≤ 1200y3
possibly x1, x2, x3 ∈ Z.
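A gurobipy sketch of the Dorian model (x continuous here, as the slides suggest may be acceptable):

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("dorian")
x = m.addVars(3, name="x")                              # cars of each type
y = m.addVars(3, vtype=GRB.BINARY, name="y")
ub = [2000, 2000, 1200]                                 # the big-M values derived above
m.setObjective(2 * x[0] + 3 * x[1] + 4 * x[2], GRB.MAXIMIZE)   # profit in $1000s
m.addConstr(3 * x[0] + 6 * x[1] + 10 * x[2] <= 12000, "steel")
m.addConstr(6 * x[0] + 5 * x[1] + 8 * x[2] <= 12000, "labour")
m.addConstrs((x[j] >= 1000 * y[j] for j in range(3)), name="min_batch")
m.addConstrs((x[j] <= ub[j] * y[j] for j in range(3)), name="switch")
m.optimize()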
Nam Ho-Nguyen QBUS2310 Management Science
173/214
Modelling technique: letting the objective/constraint do
the work
We actually saw this twice in the previous example:
1. Want to minimize maxi∈[m] fi(x). This is nonlinear, but can recast
min_x max_{i∈[m]} fi(x) ⇐⇒ min_{x,t} { t : t ≥ fi(x), i ∈ [m] } .
Note that t > maxi∈[m] fi(x) is feasible but not optimal so the
objective automatically drives t down to maxi∈[m] fi(x).
2. Want to model the constraint b ≥ mini∈M(x) di where M(x) ⊂ [m] is
an index set depending on other decision variables x .
▶ Add binary variables yi for each i ∈ [m].
▶ Add constraints for x and y that ensures that yi = 0 whenever
i ̸∈ M(x) (this needs knowledge of what exactly M(x) is, a different
problem).
▶ Add constraints ∑i∈[m] yi = 1 and b ≥ ∑i∈[m] diyi .
We are still allowed to choose yi = 1 where di > mini′∈M(x) di′ . But
if we do, we still have b ≥ di > mini′∈M(x) di′ so the constraint is
still satisfied.
Nam Ho-Nguyen QBUS2310 Management Science
174/214
Example 5: facility location revisited
Suppose that we have n different suburbs. We want to select k of these
to be locations for fire stations.
Categorical decisions: xj for each j ∈ [n] binary variables,
xj =
{
1, a fire station is located in suburb j
0, no fire station in suburb j .
Constraints: choose exactly k suburbs: ∑j∈[n] xj = k.
Objective: minimize the largest distance between a suburb and its
closest fire station (where dij is the distance between suburbs i and j):
minimize max_{j∈[n]} min_{i:xi=1} dij ,

where the inner min_{i:xi=1} dij is the distance between suburb j and its
closest fire station.
Nam Ho-Nguyen QBUS2310 Management Science
175/214
Example 5: facility location revisited
Objective: We can rewrite
max_{j∈[n]} min_{i:xi=1} dij = min_t { t : t ≥ min_{i:xi=1} dij , ∀j ∈ [n] } .
We can minimize this by adding t as a variable and t ≥ mini :xi=1 dij as
constraints. But are these constraints linear?
▶ Actually they’re not. But the trick is to think of mini :xi=1 dij as
another selection: for each j , we select the station with smallest
distance.
Nam Ho-Nguyen QBUS2310 Management Science
176/214
Example 5: facility location revisited
Objective:
▶ Add binary variables yij ∈ {0, 1} to model this selection: for each
i , j ∈ [n],
yij =
{
1, a fire station in suburb i is selected by suburb j
0, no fire station in i is selected by j .
▶ Constraints to select only one station for each suburb:
∑i∈[n] yij = 1, j ∈ [n].
▶ Constraints to ensure only suburbs with fire stations are selected:
xi = 0 =⇒ yij = 0 for all i , j , which can be linearly captured as
yij ≤ xi .
▶ The objective is now
max_{j∈[n]} ∑i∈[n] dijyij = min_t { t : t ≥ ∑i∈[n] dijyij , j ∈ [n] } .
Nam Ho-Nguyen QBUS2310 Management Science
177/214
Example 5: facility location revisited
Model:
min_{x,y,t} t
s.t. x ∈ {0, 1}n, y ∈ {0, 1}n×n, t ∈ R
     ∑j∈[n] xj = k
     ∑i∈[n] yij = 1, j ∈ [n]
     yij ≤ xi , i , j ∈ [n]
     t ≥ ∑i∈[n] dijyij , j ∈ [n].
▶ We are not explicitly enforcing that yij = 1 when i is the smallest
distance dij amongst those with xi = 1. But we don’t need to! The
minimization objective will select the “best” one for us.
Nam Ho-Nguyen QBUS2310 Management Science
178/214
8. Optimization models under uncertainty
Nam Ho-Nguyen QBUS2310 Management Science
179/214
Optimization under uncertainty
Consider the optimization model
min_x f (x)
s.t. fi(x) ≤ bi , i ∈ [m].
In this section, we examine the data that goes into optimization models.
▶ For linear/integer programs these are the objective coefficients c,
constraint coefficients A and right-hand side vector b.
▶ We will call the data ξ. Think of this as a vector or matrix that
stores problem information (costs, resource amounts, prices,
demands, etc).
In many situations, some or all of the data are uncertain, meaning we
have estimates for the values but do not know them exactly. We will
study how to model such problems.
Nam Ho-Nguyen QBUS2310 Management Science
180/214
Example: newsvendor problem
We consider a newsvendor.
▶ The newsvendor must choose a quantity x ∈ Z+ of newspapers to
print on a particular day, at a cost of c > 0 each.
▶ Newspapers are sold at price p > c.
▶ Unsold newspapers must be discarded at cost h > 0.
The demand for a newspaper on a particular day is ξ ∈ Z+, but is
unknown to the newsvendor at the time it must decide on x . How do we
model such a problem?
▶ The objective of the newsvendor is to maximize profit. The profit is
f (x , ξ) = p min{x , ξ} − cx − h max{0, x − ξ}.
▶ How can we maximize profit without knowing demand ξ?
Nam Ho-Nguyen QBUS2310 Management Science
181/214
Scenarios
Consider again the generic optimization model with data made explicit
min
x
f (x , ξ)
s.t. fi(x , ξ) ≤ bi , i ∈ [m].
▶ While we don’t know ξ exactly, we still need to have some
knowledge of ξ.
▶ Assume that we have access to a finite set of scenarios ξ1, . . . , ξN .
These are possible outcomes for ξ.
▶ Scenario ξi may come with a probability pi ≥ 0 of occurrence (where
∑i∈[N] pi = 1), or may not.
Nam Ho-Nguyen QBUS2310 Management Science
182/214
Scenarios
Where do scenarios come from?
▶ Typically, historical data.
▶ For the newsvendor problem, it may be past observations of
demands.
▶ Note that sometimes there is a probabilistic interpretation of the
scenarios, but not always.
Why do we only consider finitely many scenarios? Probability
distributions such as normal, uniform, exponential, have infinitely many
outcomes.
▶ Historical data is typically finite.
▶ Even if the uncertainty ξ is drawn from a continuous distribution, it
is hard to process this in an optimization model. Therefore a
classical approach is simulation: generate a sample ξ1, . . . , ξN from
the continuous distribution, and use the empirical distribution as an
approximation.
Nam Ho-Nguyen QBUS2310 Management Science
183/214
Optimization models with data uncertainty
Consider again the optimization model
min_x f (x , ξ)
s.t. fi(x , ξ) ≤ bi , i ∈ [m].
Assume that we have finitely many scenarios ξ1, . . . , ξN for ξ.
▶ Think of the decision-making process as occurring in two steps:
1. We build a proxy optimization model using the scenarios ξ1, . . . , ξN
to help us make a decision xˆN . (In other words, use historical
demand data to choose how many newspapers to print.)
2. The “true” data ξ is revealed, but we must implement our decision
xˆN , and accept its consequences. (The actual demand ξ for that day
is revealed, and we make pmin{xˆN , ξ} − cxˆN − hmax{0, xˆN − ξ}
profit.) Note that ξ may or may not be part of the scenario list.
▶ This is different to solving N copies of the original optimization
model with ξ = ξi plugged in, to get x1, . . . , xN different decisions,
then aggregating these decisions somehow. Why? What are the
pros/cons of each?
Nam Ho-Nguyen QBUS2310 Management Science
184/214
The key issue
For simplicity, we will consider a model with a single uncertain
constraint (extensions to multiple uncertain constraints and uncertain
objectives follow in a straightforward manner from these ideas)
min_x f (x)
s.t. g(x , ξ) ≤ 0
     x ∈ X
where X is a set that contains the exact constraints.
▶ Given fixed x and scenarios ξ1, . . . , ξN , is the constraint g(x , ξ) ≤ 0
satisfied?
▶ We have N different values g(x , ξ1), . . . , g(x , ξN). The constraint
may be satisfied for some, but not for others.
How can we decide on a single x based on N different values
g(x , ξ1), . . . , g(x , ξN)?
Nam Ho-Nguyen QBUS2310 Management Science
185/214
Risk measures
Higher values of g(x , ξ) are “more risky” than lower values, i.e., they
come closer to violating the constraint g(x , ξ) ≤ 0 or may already violate
it.
When given scenarios, some values of g(x , ξi) are more risky than others.
Thus, to meaningfully compare decisions x , x ′, we must
aggregate/measure risk across all scenarios (g(x , ξ1), . . . , g(x , ξN)) in
some way, and then do the same for (g(x ′, ξ1), . . . , g(x ′, ξN)).
A risk measure is simply a formal framework for thinking about this
idea; we can easily think of several ways of aggregating
(g(x , ξ1), . . . , g(x , ξN)) in order to make our decision, which are all
technically risk measures.
Nam Ho-Nguyen QBUS2310 Management Science
186/214
Risk measures
Formally, a risk measure ρ : RN → R maps a vector of different possible
outcomes (g(x , ξ1), . . . , g(x , ξN)) to a single number
ρ(g(x , ξ1), . . . , g(x , ξN)) ∈ R.
We will use the shorthand
ρg(x ; {ξi}i∈[N]) := ρ(g(x , ξ1), . . . , g(x , ξN)).
We then find xˆN by solving
min_x f (x)
s.t. ρg(x ; {ξi}i∈[N]) ≤ 0
     x ∈ X .
Which risk measures should we use?
Nam Ho-Nguyen QBUS2310 Management Science
187/214
Risk measures
A basic monotonicity property:
g(x , ξi) ≤ g(x ′, ξi) ∀i ∈ [N] =⇒ ρg(x ; {ξi}i∈[N]) ≤ ρg(x ′; {ξi}i∈[N]).
In other words, if every scenario is riskier, then the aggregate risk
should be greater.
▶ There are other reasonable properties that we can think of, but we
won’t explore these in detail.
▶ Instead, we will focus on specific risk measures that can be modelled
using linear constraints.
Nam Ho-Nguyen QBUS2310 Management Science
188/214
Risk measure 1: expectation
Assume there exists a distribution over the scenarios, i.e., weights
p1, . . . , pN ≥ 0 such that ∑i∈[N] pi = 1, capturing the likelihood of each
scenario. Define a distribution PˆN as the one that satisfies

Pξ∼PˆN [ξ = ξi ] = pi , i ∈ [N].
The expectation risk measure is simply the expectation of g(x , ξ) over
the distribution PˆN :
ρg(x ; {ξi}i∈[N]) = Eξ∼PˆN [g(x , ξ)] = ∑i∈[N] pig(x , ξi).
▶ If we aren’t given p1, . . . , pN a reasonable choice is pi = 1/N.
▶ If g(x , ξi) is linearly representable for each i , then ρg
(
x ; {ξi}i∈[N]
)
is also. Why?
Nam Ho-Nguyen QBUS2310 Management Science
189/214
Risk measure 2: worst-case value (robust optimization)
The worst-case value risk measure is defined by
ρg(x ; {ξi}i∈[N]) := max_{i∈[N]} g(x , ξi).

There is no need for likelihood weights pi . If each g(x , ξi) is linearly
representable, then ρg(x ; {ξi}i∈[N]) is linearly representable, with
ρg(x ; {ξi}i∈[N]) ≤ 0 equivalent to

g(x , ξi) ≤ 0, i ∈ [N].
Nam Ho-Nguyen QBUS2310 Management Science
190/214
Modelling technique: lifting
Suppose we wish to solve
min_x f (x)
s.t. g(x) ≤ 0
     x ∈ X

where

g(x) = min_y {h(x , y) : y ∈ Y } .
Then this is equivalent to lifting the y-variables into the model
min_{x,y} f (x)
s.t. h(x , y) ≤ 0
     x ∈ X , y ∈ Y .
Nam Ho-Nguyen QBUS2310 Management Science
191/214
Problems with expectation and worst-case value
Expectation and worst-case are useful, but have some shortcomings.
▶ Consider two scenarios, and two possible decisions:
i 1 2
pi 0.5 0.5
g(x , ξi) -1 1
g(x ′, ξi) -100 100
Then Eξ∼PˆN [g(x , ξ)] = Eξ∼PˆN [g(x ′, ξ)] = 0. But shouldn't there be
a difference? Seeing a value of 100 is much worse than seeing 1 (higher
values are bad).
▶ Considering worst-case is sometimes too conservative. If we only
have 0.001% chance of a bad value, then maybe it’s not that bad a
decision overall. The impact is that by considering worst-case, we
exclude more decisions, hence may not achieve the best objective.
Is there a systematic way to better account for risky values, but is less
conservative than worst-case value? Yes!
Nam Ho-Nguyen QBUS2310 Management Science
192/214
VaR and CVaR: intuition
Suppose we have eight scenarios with equal probability. Consider a
decision x with the following outcomes:
i 1 2 3 4 5 6 7 8
pi 1/8 1/8 1/8 1/8 1/8 1/8 1/8 1/8
g(x , ξi) 100 7 5 4 3 1 0 -2
How can we judge how “risky” the decision is?
▶ Expectation: 14.75. Worst-case: 100.
▶ However, 7 out of 8 scenarios have value ≤ 7. Idea: look at the
2nd highest value instead?
▶ Consider an alternative decision x ′:
i 1 2 3 4 5 6 7 8
pi 1/8 1/8 1/8 1/8 1/8 1/8 1/8 1/8
g(x ′, ξi) 20 7 5 4 3 1 0 -2
The second-highest value is also 7. But shouldn’t there be a
difference between x and x ′? Idea: what if we look at the average
of the two highest values, which is 53.5 for x and 13.5 for x ′?
Nam Ho-Nguyen QBUS2310 Management Science
193/214
VaR and CVaR: intuition
The following two risk measures essentially capture these ideas.
▶ Value-at-risk: Look at the kth largest value amongst
g(x , ξ1), . . . , g(x , ξN).
▶ Conditional value-at-risk: Look at the average of the top k
values amongst g(x , ξ1), . . . , g(x , ξN).
The formal definitions on the next few slides generalize this to settings
where pi are not necessarily equal 1/N probabilities.
Nam Ho-Nguyen QBUS2310 Management Science
194/214
Risk measure 3: value-at-risk (chance constraints)
Assume we have likelihood weights pi for each scenario ξi , and the
distribution PˆN is defined as before. Fix some α ∈ (0, 1). The
value-at-risk at level α, or α-VaR is the risk measure defined as
ρg(x ; {ξi}i∈[N]) = VaRα(g(x , ξ); PˆN) := min_t { t : Pξ∼PˆN [g(x , ξ) ≤ t] > α } .
This is also referred to as the α-quantile value of PˆN .
When pi = 1/N and α = 1− k/N, then VaRα(g(x , ξ); PˆN) is exactly the
kth largest value amongst g(x , ξ1), . . . , g(x , ξN).
Nam Ho-Nguyen QBUS2310 Management Science
195/214
Risk measure 3: value-at-risk (chance constraints)
VaRα(g(x , ξ); PˆN) := min_t { t : Pξ∼PˆN [g(x , ξ) ≤ t] > α }
Suppose we have
i 1 2 3 4 5 6
pi 0.05 0.15 0.1 0.4 0.2 0.1
g(x , ξi) 10 8 6 3 2 -2
Pξ∼PˆN [g(x , ξ) ≤ g(x , ξi)] 1 0.95 0.8 0.7 0.3 0.1
Then
▶ VaR0.98(g(x , ξ); PˆN) = 10
▶ VaR0.95(g(x , ξ); PˆN) = 10
▶ VaR0.85(g(x , ξ); PˆN) = 8
▶ VaR0.8(g(x , ξ); PˆN) = 8
▶ VaR0.7(g(x , ξ); PˆN) = 6
▶ VaR0.6(g(x , ξ); PˆN) = 3.
Nam Ho-Nguyen QBUS2310 Management Science
196/214
Risk measure 3: value-at-risk (chance constraints)
Assume that we can model any constraint of the form g(x , ξi) ≤ bi .
Let’s try and model VaRα(g(x , ξ); PˆN) ≤ 0. By the lifting technique,
VaRα(g(x , ξ); PˆN) ≤ 0 is equivalent to two constraints
t ≤ 0
Pξ∼PˆN [g(x , ξ) ≤ t] > α.
Without loss of generality, we can take t = 0 and just consider
Pξ∼PˆN [g(x , ξ) ≤ 0] > α.
Why? Because Pξ∼PˆN [g(x , ξ) ≤ t] never decreases as t increases.
Nam Ho-Nguyen QBUS2310 Management Science
197/214
Risk measure 3: value-at-risk (chance constraints)
Therefore VaRα(g(x , ξ); PˆN) ≤ 0 is equivalent to a chance constraint
Pξ∼PˆN [g(x , ξ) ≤ 0] > α ⇐⇒ Pξ∼PˆN [g(x , ξ) > 0] ≤ 1− α.
To model this, introduce binary variables zi ∈ {0, 1} for each scenario:
zi =
{
1, g(x , ξi) > 0
0, otherwise.
This can be modelled with the big-M technique:
g(x , ξi) ≤ Mzi , i ∈ [N].
We need to ensure the probability that g(x , ξ) > 0 is no greater than 1− α:

∑i∈[N] pizi ≤ 1− α.
Note: the big-M constraint doesn’t explicitly force zi = 0 when
g(x , ξi) ≤ 0, but the optimization model will implicitly handle this.
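As a sketch, suppose the uncertain constraint is linear, g(x , ξi) = ai⊤x − bi , with uniform weights pi = 1/N; the scenario data, big-M value and objective below are all made up for illustration.

import random
import gurobipy as gp
from gurobipy import GRB

random.seed(0)
N, alpha, M = 50, 0.9, 1e4                      # made-up scenario count, level, big-M
a = [[random.uniform(0, 1), random.uniform(0, 1)] for _ in range(N)]
b = [random.uniform(5, 10) for _ in range(N)]

m = gp.Model("chance_constraint")
x = m.addVars(2, lb=0.0, ub=10.0, name="x")
z = m.addVars(N, vtype=GRB.BINARY, name="z")    # z_i = 1 lets scenario i be violated
m.setObjective(x[0] + x[1], GRB.MAXIMIZE)       # made-up objective
m.addConstrs((a[i][0] * x[0] + a[i][1] * x[1] - b[i] <= M * z[i] for i in range(N)),
             name="bigM")
m.addConstr(z.sum() <= (1 - alpha) * N, name="budget")   # i.e., sum p_i z_i <= 1 - alpha
m.optimize()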
Nam Ho-Nguyen QBUS2310 Management Science
198/214
Risk measure 4: conditional value-at-risk
Assume we have likelihood weights pi for each scenario ξi , and the
distribution PˆN is defined as before. Fix some α ∈ (0, 1). The
conditional value-at-risk at level α, or α-CVaR is the risk measure
defined as
CVaRα(g(x , ξ); PˆN) := max_z (1/(1− α)) ∑i∈[N] g(x , ξi)zi
s.t. 0 ≤ zi ≤ pi , i ∈ [N]
     ∑i∈[N] zi = 1− α.

Even though it's an LP, the solution can be computed using a greedy
algorithm: starting from the largest value g(x , ξi), "fill" each zi until the
budget ∑i∈[N] zi = 1− α is met.
Nam Ho-Nguyen QBUS2310 Management Science
199/214
Risk measure 4: conditional value-at-risk
Suppose we have

i            1     2     3     4     5     6
g(x , ξi)   10     8     6     3     2    -2
p(i)       0.05  0.15  0.1   0.4   0.2   0.1

Then:
▶ CVaR0.98(g(x , ξ); PˆN) = (10× 0.02)/0.02 = 10
▶ CVaR0.95(g(x , ξ); PˆN) = (10× 0.05)/0.05 = 10
▶ CVaR0.85(g(x , ξ); PˆN) = (10× 0.05 + 8× 0.1)/0.15 = 8.6˙
▶ CVaR0.8(g(x , ξ); PˆN) = (10× 0.05 + 8× 0.15)/0.2 = 8.5
▶ CVaR0.7(g(x , ξ); PˆN) = (10× 0.05 + 8× 0.15 + 6× 0.1)/0.3 = 7.6˙
▶ CVaR0.6(g(x , ξ); PˆN) = (10× 0.05 + 8× 0.15 + 6× 0.1 + 3× 0.1)/0.4 = 6.5
▶ CVaR0.5(g(x , ξ); PˆN) = (10× 0.05 + 8× 0.15 + 6× 0.1 + 3× 0.2)/0.5 = 5.8
Nam Ho-Nguyen QBUS2310 Management Science
200/214
Risk measure 4: conditional value-at-risk
Formally, the greedy algorithm is described as follows.
▶ Order the scenarios so that
g(x , ξ(1)) ≥ · · · ≥ g(x , ξ(N)).
(The order depends on the decision x .)
▶ Find the largest k for which ∑_{i=1}^{k} p(i) ≤ 1− α. Note that
∑_{i=1}^{k} p(i) + p(k+1) > 1− α by definition.
▶ Compute

∑_{i=1}^{k} p(i)g(x , ξ(i)) + (1− α− ∑_{i=1}^{k} p(i)) g(x , ξ(k+1)).

▶ Divide the quantity by 1− α.
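The greedy procedure is a few lines of plain Python; this sketch reproduces the worked example above.

def cvar(values, probs, alpha):
    """Discrete CVaR at level alpha via the greedy filling procedure."""
    order = sorted(range(len(values)), key=lambda i: -values[i])  # largest first
    budget, total = 1.0 - alpha, 0.0
    for i in order:
        z = min(probs[i], budget)        # fill z_i up to p_i or the remaining budget
        total += z * values[i]
        budget -= z
        if budget <= 1e-12:
            break
    return total / (1.0 - alpha)

g = [10, 8, 6, 3, 2, -2]
p = [0.05, 0.15, 0.1, 0.4, 0.2, 0.1]
print(cvar(g, p, 0.85))   # 8.666...
print(cvar(g, p, 0.8))    # 8.5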
Nam Ho-Nguyen QBUS2310 Management Science
201/214
Risk measure 4: conditional value-at-risk
Why is it called "conditional" value-at-risk? Because if
Pξ∼PˆN [g(x , ξ) > VaRα(g(x , ξ); PˆN)] = 1− α, then

CVaRα(g(x , ξ); PˆN) = Eξ∼PˆN [g(x , ξ) | g(x , ξ) > VaRα(g(x , ξ); PˆN)] .

However, we may have Pξ∼PˆN [g(x , ξ) > VaRα(g(x , ξ); PˆN)] < 1− α, so
this does not always hold.
Nam Ho-Nguyen QBUS2310 Management Science
202/214
Risk measure 4: conditional value-at-risk
To model CVaR, we take the dual of the LP:
CVaRα(g(x , ξ); PˆN) := min_{r,t} t + (1/(1− α)) ∑i∈[N] pi ri
s.t. r ≥ 0
     t + ri ≥ g(x , ξi), i ∈ [N].
Exercise: verify this and explain why strong duality holds.
Nam Ho-Nguyen QBUS2310 Management Science
203/214
Risk measure 4: conditional value-at-risk
By the lifting technique, CVaRα(g(x , ξ); PˆN) ≤ 0 is equivalent to the
constraints
t + (1/(1− α)) ∑i∈[N] pi ri ≤ 0
r ≥ 0
t + ri ≥ g(x , ξi), i ∈ [N].
If each g(x , ξi) is linearly representable, then CVaRα(g(x , ξ); PˆN) ≤ 0 is
linearly representable.
Nam Ho-Nguyen QBUS2310 Management Science
204/214
Which risk measure to use?
Modelling aspects:
▶ VaR and CVaR can adjust the degree of risk aversion by adjusting
the α parameter; larger means more risk aversion, i.e., more
emphasis on making large values of g(x , ξi) small.
▶ As α→ 1, both VaRα(g(x , ξ); PˆN)→ maxi∈[N] g(x , ξi) and
CVaRα(g(x , ξ); PˆN)→ maxi∈[N] g(x , ξi). As α→ 0,
VaRα(g(x , ξ); PˆN)→ mini∈[N] g(x , ξi),
CVaRα(g(x , ξ); PˆN)→ Eξ∼PˆN [g(x , ξ)].
▶ CVaRα(g(x , ξ); PˆN) ≥ VaRα(g(x , ξ); PˆN) always. CVaR takes into
account the “tail risk” whereas VaR does not. Important for some
applications, but leads to more conservative solutions.
▶ Guideline: CVaR is used when we have uncertain objective (or when
VaR too computationally demanding), VaR is used when we have
uncertain constraints.
Nam Ho-Nguyen QBUS2310 Management Science
205/214
Which risk measure to use?
Computational aspects:
▶ Expectation and worst-case value are relatively simple; expectation
requires modelling a new function that is the sum of g(x , ξi), while
worst-case value requires N constraints g(x , ξi) ≤ 0.
▶ CVaR requires adding new variables t, r1, . . . , rN all continuous, and
2N + 1 new constraints.
▶ VaR requires adding new binary variables z1, . . . , zN , N + 1
constraints.
▶ Using VaR and CVaR requires choosing the parameter α, an
additional choice.
▶ Using expectation, VaR and CVaR requires knowledge of p1, . . . , pN .
But when these are not given, a reasonable choice is uniform
weights pi = 1/N. However, worst-case value avoids needing the
weights at all.
Nam Ho-Nguyen QBUS2310 Management Science
206/214
Example: newsvendor problem
Recall the newsvendor problem:
▶ The newsvendor must choose a quantity x ∈ Z+ of newspapers to
print on a particular day, at a cost of c > 0 each.
▶ Newspapers are sold at price p > c.
▶ Unsold newspapers must be discarded at cost h > 0.
▶ The objective of the newsvendor is to maximize profit. If demand is
ξ, the profit is
f (x , ξ) = p min{x , ξ} − cx − h max{0, x − ξ}.
Future demand is unknown, but we have historical data ξ1, . . . , ξN that
can be treated as scenarios.
Exercise: formulate the newsvendor problem with the different risk
measures.
Nam Ho-Nguyen QBUS2310 Management Science
207/214
Example: newsvendor problem
Exercise: formulate the newsvendor problem with the different risk
measures. First observe that we can write
f (x , ξ) = p min{x , ξ} − cx − h max{0, x − ξ}
         = min {pξ − cx − h(x − ξ), (p − c)x} .
Why?
▶ When x ≤ ξ, f (x , ξ) = (p − c)x . But also
(p − c)x ≤ (p − c)x − h(x − ξ) ≤ pξ − cx − h(x − ξ).
▶ When x ≥ ξ, f (x , ξ) = pξ − cx − h(x − ξ). But also
(p − c)x ≥ (p − c)x − h(x − ξ) ≥ pξ − cx − h(x − ξ).
Nam Ho-Nguyen QBUS2310 Management Science
208/214
Example: newsvendor problem
Suppose demand scenarios are ξ1, . . . , ξN .
▶ Expectation:

max_{x,t} { (1/N) ∑i∈[N] ti :  pξi − cx + h(ξi − x) ≥ ti , i ∈ [N]
                               (p − c)x ≥ ti , i ∈ [N] } .

▶ Worst-case:

max_{x,t,v} { v :  pξi − cx + h(ξi − x) ≥ ti , i ∈ [N]
                   (p − c)x ≥ ti , i ∈ [N]
                   ti ≥ v , i ∈ [N] } .
Nam Ho-Nguyen QBUS2310 Management Science
209/214
Example: newsvendor problem
Suppose demand scenarios are ξ1, . . . , ξN . Fix α ∈ (0, 1):
▶ α-VaR: maximizing f (x , ξ) can be written as a constraint
f (x , ξ) ≥ t, where we then maximize t. Then the α-VaR version of
f (x , ξ) ≥ t is equivalent to
Pξ∼PˆN [f (x , ξ) < t] ≤ 1− α.
We then wish to maximize t. With uniform weights pi = 1/N, this is
modelled as

max_{x,t,z} { t :  t − (pξi − cx + h(ξi − x)) ≤ Mzi , i ∈ [N]
                   t − (p − c)x ≤ Mzi , i ∈ [N]
                   (1/N) ∑i∈[N] zi ≤ 1− α
                   z ∈ {0, 1}N } .
Nam Ho-Nguyen QBUS2310 Management Science
210/214
Example: newsvendor problem
Suppose demand scenarios are ξ1, . . . , ξN . Fix α ∈ (0, 1):
▶ α-CVaR: maximizing f (x , ξ) can be written as a constraint
f (x , ξ) ≥ t, where we then maximize t. Then the α-CVaR version of
f (x , ξ) ≥ t is modelled by first modelling t − f (x , ξi) ≤ si :
t − (pξi − cx + h(ξi − x)) ≤ si , i ∈ [N]
t − (p − c)x ≤ si , i ∈ [N].
We then model the CVaR for the outcome vector (s1, . . . , sN):
u + (1/((1− α)N)) ∑i∈[N] ri ≤ 0
r ≥ 0
u + ri ≥ si , i ∈ [N].
Nam Ho-Nguyen QBUS2310 Management Science
211/214
Example: newsvendor problem
Suppose demand scenarios are ξ1, . . . , ξN . Fix α ∈ (0, 1):
▶ α-CVaR: the full model is
max_{x,t,r,s,u} t
s.t. t − (pξi − cx + h(ξi − x)) ≤ si , i ∈ [N]
     t − (p − c)x ≤ si , i ∈ [N]
     u + (1/((1− α)N)) ∑i∈[N] ri ≤ 0
     r ≥ 0
     u + ri ≥ si , i ∈ [N].
Nam Ho-Nguyen QBUS2310 Management Science
212/214
Example: newsvendor problem
Notice however that for fixed x ,
f (x , ξ) = min {pξ − cx − h(x − ξ), (p − c)x}
increases as ξ increases. How can we exploit this fact to simplify the
worst-case, VaR and CVaR models? Hint: under this property, which
is the “worst” scenario?
(This works for this specific newsvendor problem, but it may not work in
general.)
Nam Ho-Nguyen QBUS2310 Management Science
213/214
Summary
Nam Ho-Nguyen QBUS2310 Management Science
214/214
Main ideas
▶ Linear programming: modelling, linear representability, duality.
▶ Networks: integer solutions with integer data.
▶ Integer programming: modelling categorical decisions, modelling
techniques.
▶ Optimization under uncertainty: scenarios, LP formulations of risk
measures.
Nam Ho-Nguyen QBUS2310 Management Science