7. Control Theory
In control theory, we study problems of the form
\[
\max_{x,u} \int_{t_0}^{t_1} f(t, x, u)\, dt,
\]
where $x = x(t)$ is called the state and $u = u(t)$ is called the control, subject to
\[
\dot{x} = g(t, x, u), \quad u(t) \in U \subset \mathbb{R}, \quad x(t_0) = x_0,
\]
and
\[
\text{(a) } x(t_1) = x_1, \quad \text{(b) } x(t_1) \ge x_1, \quad \text{(c) } x(t_1) \text{ free}, \quad \text{or} \quad \text{(d) } x(t_1) \le x_1.
\]
The set $U$ is called the control region.
Definition 7.1. A pair $(x, u)$ that satisfies the conditions
\[
\dot{x} = g(t, x, u), \quad u(t) \in U \subset \mathbb{R}, \quad x(t_0) = x_0,
\]
and, depending on the problem,
\[
\text{(a) } x(t_1) = x_1, \quad \text{(b) } x(t_1) \ge x_1, \quad \text{(c) } x(t_1) \text{ free}, \quad \text{or} \quad \text{(d) } x(t_1) \le x_1
\]
is called an admissible pair. An optimal pair is an admissible pair that maximizes the integral
\[
\int_{t_0}^{t_1} f(t, x, u)\, dt
\]
among all admissible pairs.
As a motivating example, let's consider an economy evolving over time, where
$k = k(t)$ represents the capital stock,
$f = f(k)$ represents production, and
$s = s(t)$ represents the fraction of production set aside for investment.
Then,
$(1 - s(t))\, f(k(t))$ represents consumption per unit time.
If we wanted to maximize consumption over a given period, we would have the
control problem
\[
\max_{k,s} \int_0^T (1 - s) f(k)\, dt,
\]
where $k = k(t)$ is our state and $s = s(t)$ is our control, subject to
\[
\dot{k} = s f(k), \quad s(t) \in [0, 1], \quad k(0) = k_0, \quad \text{and} \quad k(T) \ge k_T.
\]
Here, $k_0$ represents the amount of stock we start with, and $k_T$ represents the amount
of stock we want to ensure exists at time $T$.
Like a Lagrangian, we introduce an auxiliary function called a Hamiltonian to
help us solve control theory problems, also known as optimal control problems.
Definition 7.2. For $p_0 \in \mathbb{R}$ and $p = p(t)$, we define the Hamiltonian
\[
H(t, x, u, p) = p_0 f(t, x, u) + p(t) g(t, x, u).
\]
The function $p = p(t)$ is called the adjoint function.
Remark 7.3. We can assume that $p_0$ is either equal to $0$ or $1$. Indeed, if $p_0 \neq 0$,
then divide everything by $p_0$ and redefine the adjoint function as $p/p_0$.
Theorem 7.4 (The Maximum Principle). Let $(x^*, u^*)$ be an optimal pair. Then
there is a continuous function $p^*$ and a number $p_0 \in \{0, 1\}$, i.e., $p_0 = 0$ or $p_0 = 1$,
such that
1. $(p_0, p^*(t)) \neq (0, 0)$ for all $t \in [t_0, t_1]$.
2. $u^*$ maximizes $H$ with respect to $u$, i.e.,
\[
H(t, x^*(t), u, p^*(t)) \le H(t, x^*(t), u^*(t), p^*(t)) \quad \text{for all } u \in U.
\]
3. $\dot{p}^* = -\partial_x H(t, x^*(t), u^*(t), p^*(t))$.
4. $p^*$ satisfies the following transversality condition at $t = t_1$, depending on
the problem:
(a) nothing, (b) $p^*(t_1) \ge 0$, and $p^*(t_1) = 0$ if $x^*(t_1) > x_1$,
(c) $p^*(t_1) = 0$, (d) $p^*(t_1) \le 0$, and $p^*(t_1) = 0$ if $x^*(t_1) < x_1$.
Theorem 7.5 (Mangasarian). Let $(x^*, u^*)$ be an admissible pair. Suppose that 1.
through 4. of the maximum principle are satisfied for some $p^*$ with $p_0 = 1$. If $U$, the
control region, is convex and $H(t, x, u, p^*(t))$ is concave in $(x, u)$ for all $t \in [t_0, t_1]$,
then $(x^*, u^*)$ is an optimal pair.
Given these two theorems, how do we solve an optimal control problem?
Step 0: Make sure the problem is in standard form; typically this just involves
turning a minimization problem into a maximization problem.
Step 1: (a) Identify the Hamiltonian, and set $p_0 = 1$. (b) For each triplet $(t, x, p)$,
maximize $H(t, x, u, p)$ with respect to $u$ over $U$, and set $\hat{u} = \hat{u}(t, x, p)$ to be the
maximizer.
Step 2: Find particular solutions to the ODEs, for $x = x(t)$ and $p = p(t)$,
\[
\dot{x} = g(t, x, \hat{u}(t, x, p))
\]
and
\[
\dot{p} = -\partial_x H(t, x, \hat{u}(t, x, p), p),
\]
using (1) the boundary conditions and (2) the transversality condition to go from
the general solution to a particular solution. Call these particular solutions $x^*$ and
$p^*$. Then, set $u^* = u^*(t) = \hat{u}(t, x^*(t), p^*(t))$.
Step 3: Check that the sufficient conditions from Mangasarian hold, i.e., (1) $U$ is
convex and (2) $H(t, x, u, p^*(t))$ is concave in $(x, u)$ for all $t \in [t_0, t_1]$. If these two
things hold, $(x^*, u^*)$ is an optimal pair.
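As a small illustration of Step 1(b), the sketch below (our own, using sympy; the symbol names are assumptions, not the notes' notation) finds the maximizer $\hat{u}$ by solving $\partial_u H = 0$. This is valid when $H$ is concave in $u$ and $U = \mathbb{R}$, as in the example that follows.

```python
# Sketch of Step 1 with sympy: form H = f + p*g with p0 = 1 and solve
# dH/du = 0 for the maximizer u-hat. This only applies when H is concave
# in u and the control region U is all of R.
import sympy as sp

t, x, u, p, a = sp.symbols('t x u p a', real=True)

f = -u**2        # integrand of the example below (in standard form)
g = u + a*x      # right-hand side of the state ODE

H = f + p*g                              # Hamiltonian with p0 = 1
u_hat = sp.solve(sp.diff(H, u), u)[0]    # stationary point in u
print(u_hat)                             # prints p/2, as in the worked example
```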
For example, let’s consider the optimal control problem
\[
\min_{x,u} \int_0^1 u^2\, dt
\]
subject to
\[
\dot{x} = u + ax, \quad x(0) = 1, \quad x(1) \text{ free}, \quad \text{and} \quad u \in U = \mathbb{R}.
\]
Here $a \in \mathbb{R}$ is a fixed but arbitrary constant.
Step 0: In standard form, our problem is
\[
\max_{x,u} \int_0^1 -u^2\, dt
\]
subject to
\[
\dot{x} = u + ax, \quad x(0) = 1, \quad x(1) \text{ free}, \quad \text{and} \quad u \in U = \mathbb{R}.
\]
Step 1: The Hamiltonian with $p_0 = 1$ is
\[
H(t, x, u, p) = -u^2 + p(u + ax).
\]
To maximize $H(t, x, u, p)$ with respect to $u \in U = \mathbb{R}$, we note that $H(t, x, u, p)$ is
concave in $u$. So stationary points with respect to $u$ are maxima. Therefore, we
compute
\[
0 = \partial_u H = -2u + p,
\]
and we find that
\[
\hat{u} = \frac{p}{2}.
\]
Step 2: Our ODEs are
\[
\dot{x} = \frac{p}{2} + ax \quad \text{and} \quad \dot{p} = -ap.
\]
So $p = C_1 e^{-at}$. Using the transversality condition $p(1) = 0$ (since $x(1)$ is free), we
find that $C_1 = 0$. Thus,
\[
p^*(t) = 0 \quad \text{for all } t.
\]
Plugging this into the ODE for $x$, we get the simpler ODE
\[
\dot{x} = ax,
\]
which is solved by $x = C_2 e^{at}$. Using the boundary condition $x(0) = 1$, we see that
$C_2 = 1$. And so,
\[
x^*(t) = e^{at}.
\]
In turn,
\[
u^*(t) = \hat{u}(t, x^*(t), p^*(t)) = \frac{p^*(t)}{2} = 0 \quad \text{for all } t.
\]
So our candidate solution is
\[
(x^*, u^*) = (e^{at}, 0).
\]
Step 3: Since $\mathbb{R}$ is convex, $U = \mathbb{R}$ is convex. In addition, $H(t, x, u, p^*(t)) = -u^2$,
which is concave in $(x, u)$ for all $t \in [t_0, t_1]$. So the hypotheses of Mangasarian's theorem are
satisfied, which implies that our candidate solution is an optimal solution, i.e.,
\[
(x^*, u^*) = (e^{at}, 0)
\]
is an optimal (maximizing) pair.
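As a quick check of Step 2 (a sketch only, using sympy, which is not part of the notes), we can solve the adjoint and state ODEs with the transversality and initial conditions and confirm that $p^* = 0$ and $x^* = e^{at}$:

```python
# Symbolic check of Step 2: the adjoint ODE with p(1) = 0 forces p = 0,
# and then the state ODE with x(0) = 1 gives x = exp(a*t).
import sympy as sp

t = sp.symbols('t')
a = sp.symbols('a', real=True)
p = sp.Function('p')
x = sp.Function('x')

p_sol = sp.dsolve(sp.Eq(p(t).diff(t), -a*p(t)), p(t), ics={p(1): 0})
print(p_sol)    # p(t) = 0, so u* = p*/2 = 0

x_sol = sp.dsolve(sp.Eq(x(t).diff(t), a*x(t)), x(t), ics={x(0): 1})
print(x_sol)    # x(t) = exp(a*t)
```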
Now, let's redo the same problem, but with a different terminal boundary
condition. Let's consider the optimal control problem
\[
\max_{x,u} \int_0^1 -u^2\, dt
\]
subject to
\[
\dot{x} = u + ax, \quad x(0) = 1, \quad x(1) = 0, \quad \text{and} \quad u \in U = \mathbb{R}.
\]
Here, again, $a \in \mathbb{R}$ is a fixed but arbitrary constant.
Step 0: Since the problem is already in standard form, there is nothing to do.
Step 1: Notice that Step 1 doesn't consider the boundary conditions. So Step 1
is the same as before, and we have that $\hat{u} = p/2$, again.
Step 2: Here is the first place things change. While the ODEs are the same, the
particular solutions we get will change because our boundary conditions and, thus,
our transversality condition have changed. Recall that our ODEs are
\[
\dot{x} = \frac{p}{2} + ax \quad \text{and} \quad \dot{p} = -ap.
\]
So
\[
p = C_1 e^{-at}.
\]
Using this $p$, we compute that
\[
x(t) = C_2 + \frac{C_1}{2}\, t \quad \text{if } a = 0
\]
and
\[
x(t) = C_2 e^{at} - \frac{C_1}{4a}\, e^{-at} \quad \text{if } a \neq 0.
\]
Now we use our boundary conditions $x(0) = 1$ and $x(1) = 0$. Since we have a
prescribed terminal value, there is no transversality condition. From the boundary
conditions, we find that
\[
C_2 = 1 \text{ and } C_1 = -2 \quad \text{if } a = 0
\]
and
\[
C_1 = -\frac{2a e^{a}}{\sinh(a)} \text{ and } C_2 = -\frac{e^{-a}}{2\sinh(a)} \quad \text{if } a \neq 0.
\]
Recall $\sinh(a) = (e^a - e^{-a})/2$. So
\[
p^*(t) = -2 \text{ if } a = 0 \quad \text{and} \quad p^*(t) = -\frac{2a e^{a(1-t)}}{\sinh(a)} \text{ if } a \neq 0.
\]
Moreover,
\[
x^*(t) = 1 - t \text{ if } a = 0 \quad \text{and} \quad x^*(t) = \frac{\sinh(a(1-t))}{\sinh(a)} \text{ if } a \neq 0.
\]
Finally,
\[
u^*(t) = -1 \text{ if } a = 0 \quad \text{and} \quad u^*(t) = -\frac{a e^{a(1-t)}}{\sinh(a)} \text{ if } a \neq 0.
\]
Step 3: Since $p^*$ has changed, we need to check that $H(t, x, u, p^*(t))$ is still
concave in $(x, u)$ for all $t \in [t_0, t_1]$. When $a = 0$,
\[
H(t, x, u, p^*(t)) = -u^2 - 2u.
\]
When $a \neq 0$,
\[
H(t, x, u, p^*(t)) = -u^2 - \frac{2a e^{a(1-t)}}{\sinh(a)}\,(u + ax).
\]
In $(x, u)$, each of these functions is the sum of a downward-opening parabola in $u$ and a
linear function, and so is concave. Thus, the hypotheses of Mangasarian's theorem are satisfied,
which implies that our candidate pair is an optimal pair.
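As a sanity check of the $a \neq 0$ formulas (a sketch only, using sympy, which is not part of the notes), we can confirm that $x^*$, $p^*$, and $u^*$ satisfy the state ODE, the adjoint ODE, and both boundary conditions:

```python
# Symbolic check: x' = u + a*x, p' = -a*p, u = p/2, x(0) = 1, x(1) = 0.
import sympy as sp

t = sp.symbols('t')
a = sp.symbols('a', nonzero=True)

x_star = sp.sinh(a*(1 - t)) / sp.sinh(a)
p_star = -2*a*sp.exp(a*(1 - t)) / sp.sinh(a)
u_star = p_star / 2

def is_zero(expr):
    # Rewrite hyperbolic functions as exponentials before simplifying.
    return sp.simplify(expr.rewrite(sp.exp)) == 0

print(is_zero(sp.diff(x_star, t) - (u_star + a*x_star)))  # state ODE:   True
print(is_zero(sp.diff(p_star, t) + a*p_star))             # adjoint ODE: True
print(is_zero(x_star.subs(t, 0) - 1))                     # x(0) = 1:    True
print(is_zero(x_star.subs(t, 1)))                         # x(1) = 0:    True
```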
What do we do if the Hamiltonian is not concave in $(x, u)$? Remember, in
Mangasarian's theorem, one of the conditions to conclude that a candidate pair is an
optimal pair is that $H(t, x, u, p^*(t))$ is concave in $(x, u)$ for every $t \in [t_0, t_1]$. Well,
Arrow discovered that we can conclude that a candidate pair is an optimal pair
if something called the maximized Hamiltonian is concave in $x$. The maximized
Hamiltonian is
\[
\hat{H}(t, x, p) = \max_{u \in U} H(t, x, u, p) = H(t, x, \hat{u}, p),
\]
assuming it exists, where, recall, $\hat{u} = \hat{u}(t, x, p)$ is the point in $U$ where $H(t, x, u, p)$
achieves its maximum when $(t, x, p)$ are fixed.
Theorem 7.6 (Arrow). Let $(x^*, u^*)$ be an admissible pair. Suppose that 1. through
4. of the maximum principle are satisfied for some $p^*$ with $p_0 = 1$. If the maximized
Hamiltonian $\hat{H}(t, x, p^*(t))$ is concave in $x$ for every $t \in [t_0, t_1]$, then $(x^*, u^*)$ is an
optimal pair.
For example, let’s consider a specific economic growth problem:
\[
\max_{s,k} \int_0^{10} (1 - s) k^{1/2}\, dt
\]
subject to
\[
\dot{k} = s k^{1/2} \ (\text{with } k > 0), \quad k(0) = 1, \quad k(10) \text{ free}, \quad \text{and} \quad s \in [0, 1].
\]
Here we see that $s$ is the control, $[0, 1]$ is the control region, and $k$ is the state.
First, we identify the Hamiltonian:
\[
H(t, k, s, p) = (1 - s) k^{1/2} + p s k^{1/2}.
\]
Second, we maximize $H$ with respect to $s$, and determine $\hat{s}$ and $\hat{H}$. Notice that,
thinking of $t$, $k$, and $p$ as fixed coefficients rather than variables, we see that
\[
H = s(p - 1) k^{1/2} + k^{1/2}
\]
is a linear function in $s$. Linear functions have one of three shapes: positive slope,
no slope, or negative slope.
Case 1: If $p > 1$, we see that $H$, just as a function of $s$, has positive slope. So it
is maximized at $\hat{s} = 1$.
Case 2: If $p = 1$, we see that $H$, just as a function of $s$, is flat. So it is maximized
at any $\hat{s} \in [0, 1]$. For our convenience, we will choose $\hat{s} = 0$ in this case.
Case 3: If $p < 1$, we see that $H$, just as a function of $s$, has negative slope. So it
is maximized at $\hat{s} = 0$.
So
\[
\hat{H}(t, k, p) =
\begin{cases}
p k^{1/2} & \text{for } p > 1 \ (\Rightarrow \hat{s} = 1), \\
k^{1/2} & \text{for } p \le 1 \ (\Rightarrow \hat{s} = 0).
\end{cases}
\]
Since $k^{1/2}$ is concave and $p k^{1/2}$ is also concave (because $p > 0$ in that case), we
have that $\hat{H}$ is concave in $k$ for all $t \in [0, 10]$.
Third, we have to solve our ODEs, for $k = k(t)$ and $p = p(t)$,
\[
\dot{k} = \hat{s} k^{1/2} \quad \text{and} \quad \dot{p} = -\frac{1}{2 k^{1/2}} (1 - \hat{s} + p\hat{s}) = -\partial_k H(t, k, \hat{s}, p).
\]
The first ODE is separable, and we find that
\[
k = \left( \frac{\hat{s}}{2}\, t + C \right)^2.
\]
In turn,
\[
k^{1/2} = \frac{\hat{s}}{2}\, t + C.
\]
Since $k^{1/2} > 0$, we see that
\[
\frac{\hat{s}}{2}\, t + C > 0.
\]
Thus,
\[
\dot{p} = -\frac{1}{\hat{s} t + 2C} (1 - \hat{s} + p\hat{s}).
\]
So if $\hat{s} = 0$,
\[
p = A - \frac{t}{2C} \quad \text{with } C > 0;
\]
whereas if $\hat{s} \neq 0$,
\[
p = 1 - \frac{1}{\hat{s}} + \tilde{B}\, \frac{1}{\hat{s} t + 2\tilde{C}} \quad \text{with } \tilde{B} > 0 \text{ and } \hat{s} t + 2\tilde{C} > 0.
\]
Since $\hat{s} = 0$ or $\hat{s} = 1$, this second case can be rewritten as
\[
p = \frac{\tilde{B}}{t + 2\tilde{C}} \quad \text{with } \tilde{B} > 0 \text{ and } t + 2\tilde{C} > 0,
\]
which, again, corresponds to when $\hat{s} = 1$. Another form for $k$ is
\[
k = \left( \frac{1}{2}\, t + \tilde{C} \right)^2 \text{ if } p > 1 \ (\Rightarrow \hat{s} = 1)
\quad \text{and} \quad
k = C^2 \text{ if } p \le 1 \ (\Rightarrow \hat{s} = 0).
\]
Now we consider our boundary, and so transversality, conditions: $k(0) = 1$ and
$k(10)$ free. The terminal condition, $k(10)$ free, implies that $p(10) = 0$, by the
maximum principle. Recall that if $p < 1$, then $\hat{s} = 0$. So we have that
\[
0 = p(10) = A - \frac{10}{2C}.
\]
In turn,
\[
A = \frac{5}{C}.
\]
We also know that $p^*$ has to be continuous. So for all $t$ near $10$, $p(t)$ has to be close
to $0$, which means that $p(t) < 1$ for all $t$ near $10$. Another thing to observe is that
$p$ is a strictly decreasing function of $t$, regardless of the value of $\hat{s}$. So we know
that there is a unique time $t^*$ at which $p(t^*) = 1$. Before $t^*$, $p(t) > 1$, and after $t^*$,
$p(t) < 1$. In particular, again using that $p^*$ has to be continuous, we can solve for
$t^*$:
\[
1 = p(t^*) = \frac{5}{C} - \frac{t^*}{2C} \quad \Rightarrow \quad t^* = 10 - 2C.
\]
All in all, we see that, if $t < t^*$,
\[
k(t) = \left( \frac{1}{2}\, t + \tilde{C} \right)^2 \quad \text{and} \quad p(t) = \frac{\tilde{B}}{t + 2\tilde{C}} \quad \text{with } \tilde{B} > 0 \text{ and } t + 2\tilde{C} > 0.
\]
And if $t \ge t^*$,
\[
k(t) = C^2 \quad \text{and} \quad p(t) = \frac{5}{C} - \frac{t}{2C} \quad \text{with } C > 0.
\]
What is $t^*$? Can $t^* = 0$? No. If $t^* = 0$, then $10 - 2C = 0$, i.e., $C = 5$. But then
$k(0) = 25 \neq 1$, which is impossible; we need $k(0) = 1$. So, using the condition that
$k(0) = 1$ (with $t = 0 < t^*$), we see that $\tilde{C} = \pm 1$. But $t + 2\tilde{C} > 0$ at $t = 0$, which implies that $\tilde{C} = 1$.
Since $k^*$ and $p^*$ need to be continuous, we have to match $\tilde{B}$ and $C$ at $t^*$. Thus,
we have the equations
\[
\left( \frac{1}{2}(10 - 2C) + 1 \right)^2 = k(t^*) = C^2
\]
and
\[
1 = p(t^*) = \frac{\tilde{B}}{(10 - 2C) + 2}.
\]
Solving these, we find that
\[
\tilde{B} = 6 \quad \text{and} \quad C = 3 \quad \Rightarrow \quad t^* = 4.
\]
So
\[
k^*(t) =
\begin{cases}
\left( \tfrac{1}{2} t + 1 \right)^2 & \text{if } t \in [0, 4], \\
9 & \text{if } t \in [4, 10].
\end{cases}
\]
Also,
\[
p^*(t) =
\begin{cases}
\dfrac{6}{t + 2} & \text{if } t \in [0, 4], \\[1ex]
\dfrac{5}{3} - \dfrac{t}{6} & \text{if } t \in [4, 10].
\end{cases}
\]
And
\[
s^*(t) =
\begin{cases}
1 & \text{if } t \in [0, 4), \\
0 & \text{if } t \in [4, 10].
\end{cases}
\]
Fourth, we have to verify Arrow's condition: $\hat{H}$ is concave in $k$. But we did this
before, in Step 2. So the pair $(k^*, s^*)$ is an optimal pair.
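As a quick numerical sanity check of the candidate (a sketch only, not the notes' method; the helper names below are ours), we can verify the initial condition, the transversality condition, continuity at $t^* = 4$, and that $p^* > 1$ before the switch and $p^* < 1$ after it:

```python
# Check boundary, continuity, and switching conditions of (k*, p*, s*).
import numpy as np

t_star = 4.0

def k_star(t):
    return (0.5*t + 1.0)**2 if t < t_star else 9.0

def p_star(t):
    return 6.0/(t + 2.0) if t < t_star else 5.0/3.0 - t/6.0

assert abs(k_star(0.0) - 1.0) < 1e-12                # k(0) = 1
assert abs(p_star(10.0)) < 1e-12                     # transversality: p(10) = 0
assert abs(k_star(4.0 - 1e-9) - k_star(4.0)) < 1e-6  # k continuous at t*
assert abs(p_star(4.0 - 1e-9) - p_star(4.0)) < 1e-6  # p continuous at t*
# s* = 1 is optimal where p* > 1, and s* = 0 is optimal where p* < 1.
assert all(p_star(t) > 1 for t in np.linspace(0.0, 3.9, 40))
assert all(p_star(t) < 1 for t in np.linspace(4.1, 10.0, 40))
print("candidate (k*, p*, s*) passes the basic checks")
```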
7.1. Many Variables. In control theory, we can also consider the case that $x$ and
$u$ are vector valued, i.e., we have more than one state function and more than one
control function. Here, the problem will take the form
\[
\max_{x,u} \int_{t_0}^{t_1} f(t, x, u)\, dt
\]
subject to an ODE, an initial condition, and a terminal condition, just like before.
In this case, however, the state $x$ is vector valued:
\[
x = x(t) = (x_1(t), x_2(t), \dots, x_n(t)) \in \mathbb{R}^n
\quad \text{with} \quad
x_i : \mathbb{R} \to \mathbb{R} \ \text{for } i = 1, \dots, n.
\]
The control $u$ is again vector valued:
\[
u = u(t) = (u_1(t), u_2(t), \dots, u_r(t)) \in U \subset \mathbb{R}^r
\quad \text{with} \quad
u_j : \mathbb{R} \to \mathbb{R} \ \text{for } j = 1, \dots, r.
\]
Again, $U$ is called the control region, but now $U$ is a subset of $\mathbb{R}^r$ for some $r \in \mathbb{N}$.
We remark that $n$ need not equal $r$.
The ODE condition is now a family of ODEs. In compact form, it is
\[
\dot{x} = g(t, x, u);
\]
written out, we have a family of $n$ ODEs:
\[
\dot{x}_i(t) = g_i(t, x, u) \quad \text{for } i = 1, \dots, n,
\]
for some given functions
\[
g_i : \mathbb{R}^{1+n+r} \to \mathbb{R} \quad \text{for } i = 1, \dots, n.
\]
The initial condition is
\[
x(t_0) = x^0 \in \mathbb{R}^n \quad \Longleftrightarrow \quad x_i(t_0) = x^0_i \ \text{for } i = 1, \dots, n.
\]
The terminal condition is again a family of terminal conditions, one for each component
of $x$. There are three cases, happening simultaneously:
\[
\begin{aligned}
&\text{(a) } x_i(t_1) = x^1_i \quad \text{for } i = 1, \dots, \ell, \\
&\text{(b) } x_j(t_1) \ge x^1_j \quad \text{for } j = \ell + 1, \dots, m, \\
&\text{(c) } x_k(t_1) \text{ free} \quad \text{for } k = m + 1, \dots, n.
\end{aligned}
\]
Remark 7.7. Comparing these cases with the 1D setting, where we had four types
of terminal conditions, we note that the inequality $x_p(t_1) \le x^1_p$ is contained in Case
(b), after multiplying both sides by $-1$.
Definition 7.8. A pair $(x, u)$ such that the following 1. through 5. hold is called
an admissible pair.
1. $u_i$ is piecewise continuous for all $i = 1, \dots, r$;
2. $u(t) \in U$ for all $t \in [t_0, t_1]$;
3. $x_i$ is continuous and piecewise differentiable for all $i = 1, \dots, n$;
4. the ODE $\dot{x} = g(t, x, u)$ is solved; and
5. the initial and terminal conditions are satisfied.
An optimal pair is an admissible pair that maximizes the integral
\[
\int_{t_0}^{t_1} f(t, x, u)\, dt
\]
among all admissible pairs.
In this world, the Hamiltonian is
\[
H(t, x, u, p) = p_0 f(t, x, u) + p \cdot g(t, x, u),
\]
where the adjoint function is again vector valued:
\[
p = p(t) = (p_1(t), p_2(t), \dots, p_n(t)) \in \mathbb{R}^n
\quad \text{with} \quad
p_i : \mathbb{R} \to \mathbb{R} \ \text{for } i = 1, \dots, n.
\]
Notice that $p$ has $n$ components, the same number as $x$ and $g$.
Theorem 7.9 (The Maximum Principle). Let $(x^*, u^*)$ be an optimal pair. Then
there is a continuous and piecewise differentiable function $p^*$ and a number $p_0 \in \{0, 1\}$,
i.e., $p_0 = 0$ or $p_0 = 1$, such that
1. $(p_0, p^*(t)) \neq (0, 0)$ for all $t \in [t_0, t_1]$.
2. $u^*$ maximizes $H$ with respect to $u$, i.e.,
\[
H(t, x^*(t), u, p^*(t)) \le H(t, x^*(t), u^*(t), p^*(t)) \quad \text{for all } u \in U.
\]
3. When $u^*$ is continuous, $\dot{p}^*_i = -\partial_{x_i} H(t, x^*(t), u^*(t), p^*(t))$.
4. $p^*$ satisfies the following transversality condition at $t = t_1$, depending on
Cases (a), (b), and (c):
(a) $p^*_i(t_1)$ free for $i = 1, \dots, \ell$,
(b) $p^*_j(t_1) \ge 0$, and $p^*_j(t_1) = 0$ if $x^*_j(t_1) > x^1_j$, for $j = \ell + 1, \dots, m$,
(c) $p^*_k(t_1) = 0$ for $k = m + 1, \dots, n$.
Theorem 7.10 (Mangasarian and Arrow). Let $(x^*, u^*)$ be an admissible pair.
Suppose that 1. through 4. of the maximum principle are satisfied for some $p^*$ and
with $p_0 = 1$.
1. If $U$, the control region, is convex and $H(t, x, u, p^*(t))$ is concave in $(x, u)$
for all $t \in [t_0, t_1]$, then $(x^*, u^*)$ is an optimal pair.
2. If the maximized Hamiltonian $\hat{H}(t, x, p^*(t))$, defined by
\[
\hat{H}(t, x, p) = \max_{u \in U} H(t, x, u, p),
\]
assuming it exists, is concave in $x$ for every $t \in [t_0, t_1]$, then $(x^*, u^*)$ is an
optimal pair.
For example, let’s solve the control problem
\[
\max_{x,u} \int_0^4 (10 - x_1 + u)\, dt
\]
subject to
\[
(\dot{x}_1, \dot{x}_2) = (x_2, u), \quad x(0) = (2, 4), \quad x(4) = (\text{free}, \text{free}), \quad \text{and} \quad u \in [-1, 1].
\]
Note that $n = 2$ and $r = 1$ in this problem.
The first step is to identify the Hamiltonian with $p_0 = 1$:
\[
H(t, x, u, p) = 10 - x_1 + u + p_1 x_2 + p_2 u.
\]
Step 2 is to maximize $H$ with respect to $u$. Notice that $H$, as a function of $u$, is
linear with slope $1 + p_2$. If $p_2 \ge -1$, then $H$ is nondecreasing in $u$, so we take $\hat{u} = 1$. If $p_2 < -1$, then $H$
is decreasing in $u$, so we take $\hat{u} = -1$. Thus,
\[
\hat{H} =
\begin{cases}
10 - x_1 + 1 + p_1 x_2 + p_2 & \text{if } p_2 \ge -1, \\
10 - x_1 - 1 + p_1 x_2 - p_2 & \text{if } p_2 < -1.
\end{cases}
\]
Now we have to solve the ODEs of the problem:
\[
(\dot{x}_1, \dot{x}_2) = (x_2, \hat{u}) \quad \text{and} \quad (\dot{p}_1, \dot{p}_2) = (1, -p_1).
\]
We see that
\[
x_2 =
\begin{cases}
t + C_2 & \text{if } p_2 \ge -1, \\
-t + \tilde{C}_2 & \text{if } p_2 < -1.
\end{cases}
\]
So
\[
x_1 =
\begin{cases}
\tfrac{1}{2} t^2 + C_2 t + C_1 & \text{if } p_2 \ge -1, \\
-\tfrac{1}{2} t^2 + \tilde{C}_2 t + \tilde{C}_1 & \text{if } p_2 < -1.
\end{cases}
\]
Also,
\[
p_1 = t + A \quad \text{and} \quad p_2 = -\tfrac{1}{2} t^2 - At + B.
\]
The transversality conditions are
\[
x(4) = (\text{free}, \text{free}) \quad \Rightarrow \quad p(4) = (0, 0).
\]
Thus,
\[
A = -4 \quad \text{and} \quad B = -8.
\]
Since $p_2 = -1$ is the dividing line between the two cases, we need to find the time $t^*$ such
that $p_2(t^*) = -1$. We see that $t^*$ has two possibilities:
\[
t^* = 4 \pm \sqrt{2}.
\]
But since the time line of the problem is $[0, 4]$,
\[
t^* = 4 - \sqrt{2}.
\]
Looking at $p_2$, we see that
\[
p_2(t) \ge -1 \ \text{if } t \ge t^* \quad \text{and} \quad p_2(t) < -1 \ \text{if } t < t^*.
\]
Therefore, the initial condition goes with the case $p_2 < -1$. So, using the initial
condition, we find that
\[
\tilde{C}_1 = 2 \quad \text{and} \quad \tilde{C}_2 = 4.
\]
Hence,
\[
x_2 =
\begin{cases}
t + C_2 & \text{if } p_2 \ge -1, \\
-t + 4 & \text{if } p_2 < -1.
\end{cases}
\]
So
\[
x_1 =
\begin{cases}
\tfrac{1}{2} t^2 + C_2 t + C_1 & \text{if } p_2 \ge -1, \\
-\tfrac{1}{2} t^2 + 4t + 2 & \text{if } p_2 < -1.
\end{cases}
\]
Finally, we need $x_1$ and $x_2$ to be continuous, and so we have to find $C_1$ and $C_2$ by matching
the two branches at $t^* = 4 - \sqrt{2}$. We see that
\[
C_2 = 4 - 2t^* \quad \text{and} \quad C_1 = 2 + (t^*)^2.
\]
Plugging these in above, we find $x^*$. Now that we know $t^*$, we also have $u^*$:
\[
u^* = 1 \ \text{if } t \ge t^* \quad \text{and} \quad u^* = -1 \ \text{if } t < t^*.
\]
Finally, since $\hat{H}$ is linear in $x$, it is concave. So our pair $(x^*, u^*)$ is optimal.
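A quick numerical check of the matching (a sketch only, using numpy; the piecewise formulas follow the computations above, with $C_2 = 4 - 2t^*$ and $C_1 = 2 + (t^*)^2$):

```python
# Verify transversality at t = 4, the switching condition p2(t*) = -1,
# the initial condition x(0) = (2, 4), and continuity of x1, x2 at t*.
import numpy as np

t_star = 4 - np.sqrt(2)
C2 = 4 - 2*t_star
C1 = 2 + t_star**2

p1 = lambda t: t - 4                        # p1 = t + A with A = -4
p2 = lambda t: -0.5*t**2 + 4*t - 8          # p2 = -t^2/2 - A*t + B with B = -8

x1_left  = lambda t: -0.5*t**2 + 4*t + 2    # branch for t <  t*  (u = -1)
x2_left  = lambda t: -t + 4
x1_right = lambda t:  0.5*t**2 + C2*t + C1  # branch for t >= t*  (u = +1)
x2_right = lambda t:  t + C2

assert abs(p1(4.0)) < 1e-12 and abs(p2(4.0)) < 1e-12        # p(4) = (0, 0)
assert abs(p2(t_star) + 1) < 1e-12                          # p2(t*) = -1
assert abs(x1_left(0.0) - 2) < 1e-12 and abs(x2_left(0.0) - 4) < 1e-12
assert abs(x1_left(t_star) - x1_right(t_star)) < 1e-9       # x1 continuous
assert abs(x2_left(t_star) - x2_right(t_star)) < 1e-9       # x2 continuous
print("candidate pieces match at t* = 4 - sqrt(2)")
```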
Let's make some remarks about the constant $p_0$, which appears in the Hamiltonian.
In practice, we have always taken and will always take $p_0 = 1$. But the maximum
principle says that $p_0 = 0$ is a possibility. If $p_0 = 0$, then none of the conditions in
the maximum principle change if we replace our given integrand $f$ with an arbitrary
integrand $f^*$. This tells us that the object we are trying to maximize is irrelevant.
In particular, you can cook up an $f^*$ that would make an optimal pair out of any
candidate pair $(x^*, u^*)$. This is rather silly, and an artifact of the mathematics. In
general, when dealing with a control problem coming from any physical or economic
model, $p_0$ will equal $1$.
That said, there is a scenario in which you can always guarantee that $p_0 = 1$:
when $x(t_1)$ is free. If we have the terminal condition $x(t_1)$ free, then, by our
transversality condition, $p(t_1) = 0$. But since $(p_0, p(t_1)) \neq (0, 0)$, by the maximum
principle, we see that $p_0 \neq 0$, i.e., $p_0 = 1$.
Finally, let's do an example to demonstrate a typical argument you might use if
you wanted to show that $p_0 = 1$. Let's consider the problem
\[
\max_{x,u} \int_0^1 (2x - x^2)\, dt
\]
subject to
\[
\dot{x} = u, \quad x(0) = 0, \quad x(1) = 0, \quad \text{and} \quad u \in [-1, 1].
\]
For this problem, the Hamiltonian is
\[
H(t, x, u, p) = p_0 (2x - x^2) + pu,
\]
and the ODEs are
\[
\dot{x} = u \quad \text{and} \quad \dot{p} = -p_0 (2 - 2x).
\]
If $p_0 = 0$, then $H = pu$ and this second ODE becomes $\dot{p} = 0$. So
\[
p = p^* = \text{a constant}.
\]
From the maximum principle, $(p_0, p^*(t)) \neq (0, 0)$ for all $t \in [0, 1]$. Hence, $p^* \neq 0$.
Now if $p^* > 0$, then $u^*(t) = 1$ for all $t$. This is because the maximum principle tells
us that $H$, when considered only as a function of $u$, is maximized by $u^*$. Since $H = p^* u$
is linear in $u$ with slope $p^*$, it is maximized at the largest value in $U = [-1, 1]$. From
the first ODE and the initial condition, then, we find that
\[
x^*(t) = t.
\]
But this is impossible, as the terminal condition tells us that $x^*(1) = 0$.
Similarly, if $p^* < 0$, then $u^*(t) = -1$ for all $t$. This is because the maximum
principle tells us that $H$, when considered only as a function of $u$, is maximized by $u^*$.
Since $H = p^* u$ is linear in $u$ with slope $p^*$, it is maximized at the smallest value in
$U = [-1, 1]$. From the first ODE and the initial condition, then, we find that
\[
x^*(t) = -t.
\]
But, again, this is impossible, as the terminal condition tells us that $x^*(1) = 0$.
In conclusion, if $p_0 = 0$, then $p^*$ can be neither zero, positive, nor negative, which is
impossible. Therefore, our original assumption that $p_0 = 0$ was incorrect. So $p_0 = 1$.
7.2. Discrete Time Control Theory / Dynamic Programming. Let's consider
a discrete version of the optimal control problem:
\[
\max_{x,u} \sum_{t=0}^{T} f(t, x_t, u_t)
\]
subject to
\[
x_{t+1} = g(t, x_t, u_t), \quad x_0 \ \text{given}, \quad \text{and} \quad u_t \in U.
\]
In this world, our state $x$ and control $u$ are finite sequences, i.e., points in $\mathbb{R}^{T+1}$:
\[
x = (x_0, x_1, \dots, x_T) \quad \text{and} \quad u = (u_0, u_1, \dots, u_T).
\]
Like before, $U \subset \mathbb{R}$ is called the control region.
Notice that if we are given a control $u = (u_0, u_1, \dots, u_T)$, then our state is
completely determined, since $x_0$ is given. Indeed,
\[
x_1 = g(0, x_0, u_0)
\]
and the triplet $(0, x_0, u_0)$ is known. With $x_1$ in hand, the triplet $(1, x_1, u_1)$ is known,
and we can produce $x_2$:
\[
x_2 = g(1, x_1, u_1).
\]
Continuing iteratively in this way, we construct $x = (x_0, x_1, \dots, x_T)$.
The function
\[
\sum_{t=0}^{T} f(t, x_t, u_t)
\]
is called the objective function. An important family of functions are the
(optimal) value functions, defined by
\[
J_s(x) = \max_{u_s, \dots, u_T \in U} \sum_{t=s}^{T} f(t, x_t, u_t),
\]
where
\[
x_s = x, \quad x_{t+1} = g(t, x_t, u_t) \ \text{for } t = s, \dots, T - 1, \quad \text{and} \quad u_t \in U.
\]
These functions can be used to solve our problem, thanks to the following theorem.
Theorem 7.11. Let $\{J_s\}_{s=0}^{T}$ be the (optimal) value functions associated to a given
discrete time control problem. Then
\[
J_T(x) = \max_{u \in U} f(T, x, u)
\]
and
\[
J_s(x) = \max_{u \in U} \{ f(s, x, u) + J_{s+1}(g(s, x, u)) \}
\]
for each $s = 0, \dots, T - 1$.
How can we use this theorem to solve our problem? Well, it gives us the following
algorithm; a minimal code sketch of the backward recursion appears after Step 3.
Step 1: Find the function $u^*_T(x)$ which maximizes $f(T, x, u)$ over $u \in U$ with $x$
fixed. In other words, for each $x$, let $u^*_T(x)$ be the point such that
\[
J_T(x) = f(T, x, u^*_T(x)).
\]
With this in hand, for each $x$, find the point $u^*_{T-1}(x)$ such that
\[
J_{T-1}(x) = f(T - 1, x, u^*_{T-1}(x)) + J_T(g(T - 1, x, u^*_{T-1}(x))).
\]
Continuing in this way, iteratively construct the pairs
\[
\{J_{T-2}(x), u^*_{T-2}(x)\}, \ \dots, \ \{J_0(x), u^*_0(x)\}.
\]
Step 2: Compute, iteratively, the pairs $\{x^*_t, u^*_t\}$ as follows:
\[
x^*_0 = x_0, \ \text{which is given}.
\]
Then, let
\[
u^*_0 = u^*_0(x^*_0).
\]
Then,
\[
x^*_1 = g(0, x^*_0, u^*_0) \quad \text{and} \quad u^*_1 = u^*_1(x^*_1).
\]
And so on.
Step 3: Note that
\[
J_0(x_0) = \max_{x,u} \sum_{t=0}^{T} f(t, x_t, u_t).
\]
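As promised, here is a minimal sketch of the backward recursion in Theorem 7.11 (our own helper, not the notes' notation), where $U$ is replaced by a finite grid of candidate controls and each $J_s$ is built from $J_{s+1}$:

```python
# Backward induction: J[s](x) returns (J_s(x), a maximizing control u_s*(x)).
# Without memoization the recursion is exponential in T, so this sketch is
# for illustration on small problems only.
def value_functions(f, g, T, U_grid):
    J = [None] * (T + 1)
    J[T] = lambda x: max(((f(T, x, u), u) for u in U_grid), key=lambda pair: pair[0])
    for s in range(T - 1, -1, -1):
        def J_s(x, s=s):  # bind s now; J[s + 1] is already defined
            return max(((f(s, x, u) + J[s + 1](g(s, x, u))[0], u) for u in U_grid),
                       key=lambda pair: pair[0])
        J[s] = J_s
    return J
```

On small problems, `J[0](x0)[0]` approximates the maximum value $J_0(x_0)$ when the grid is fine enough, and recovers it exactly when the grid happens to contain the true maximizers.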
For example, consider the problem
\[
\max_{x,u} \sum_{t=0}^{3} (1 + x_t - u_t^2)
\]
subject to
\[
x_{t+1} = x_t + u_t, \quad x_0 = 0, \quad \text{and} \quad u_t \in \mathbb{R}.
\]
Here, we note that $f(t, x, u) = 1 + x - u^2$, $g(t, x, u) = x + u$, and $T = 3$.
Let's solve it.
Step 1: First, we see that
\[
J_3(x) = 1 + x \quad \text{and} \quad u^*_3(x) = 0.
\]
Why? Because $f(3, x, u) = 1 + x - u^2$ is maximized at $u = 0$, when we consider $x$
as fixed. Then consider
\[
\begin{aligned}
J_2(x) &= \max_u \{ f(2, x, u) + J_3(g(2, x, u)) \} \\
&= \max_u \{ 1 + x - u^2 + [1 + (x + u)] \} \\
&= \max_u \{ 2 + 2x - u^2 + u \}.
\end{aligned}
\]
Since $h_2(u) = 2 + 2x - u^2 + u$ is maximized when $u = 1/2$, we find that
\[
J_2(x) = \frac{9}{4} + 2x \quad \text{and} \quad u^*_2(x) = \frac{1}{2}.
\]
Now consider
\[
\begin{aligned}
J_1(x) &= \max_u \{ f(1, x, u) + J_2(g(1, x, u)) \} \\
&= \max_u \left\{ 1 + x - u^2 + \left[ \frac{9}{4} + 2(x + u) \right] \right\} \\
&= \max_u \left\{ \frac{13}{4} + 3x + 2u - u^2 \right\}.
\end{aligned}
\]
Since $h_1(u) = \frac{13}{4} + 3x + 2u - u^2$ is maximized when $u = 1$, we find that
\[
J_1(x) = \frac{17}{4} + 3x \quad \text{and} \quad u^*_1(x) = 1.
\]
Finally, consider
\[
\begin{aligned}
J_0(x) &= \max_u \{ f(0, x, u) + J_1(g(0, x, u)) \} \\
&= \max_u \left\{ 1 + x - u^2 + \left[ \frac{17}{4} + 3(x + u) \right] \right\} \\
&= \max_u \left\{ \frac{21}{4} + 4x + 3u - u^2 \right\}.
\end{aligned}
\]
Since $h_0(u) = \frac{21}{4} + 4x + 3u - u^2$ is maximized when $u = 3/2$, we find that
\[
J_0(x) = \frac{15}{2} + 4x \quad \text{and} \quad u^*_0(x) = \frac{3}{2}.
\]
Step 2: From Step 1, we see that the $u^*_t(x)$ are all constant functions. So
\[
u^*_0 = \frac{3}{2}, \quad u^*_1 = 1, \quad u^*_2 = \frac{1}{2}, \quad \text{and} \quad u^*_3 = 0.
\]
Since the initial condition is $x_0 = 0$, we then compute that
\[
x^*_0 = 0, \quad x^*_1 = g(0, x^*_0, u^*_0) = \frac{3}{2}, \quad x^*_2 = g(1, x^*_1, u^*_1) = \frac{5}{2},
\]
and
\[
x^*_3 = g(2, x^*_2, u^*_2) = 3.
\]
Step 3: Since $J_0(x_0) = J_0(x^*_0) = 15/2$, we find that the maximum value of our problem
is $15/2$.
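We can confirm this numerically with the `value_functions` sketch from earlier in this subsection (an assumed helper, not part of the notes). A control grid with step $1/2$ contains the exact maximizers $0$, $1/2$, $1$, $3/2$ found above, so the recursion reproduces $J_0(0) = 15/2$ and the same sequence of controls:

```python
# Numerical confirmation of the worked example using the value_functions sketch.
f = lambda t, x, u: 1 + x - u**2
g = lambda t, x, u: x + u
U_grid = [k / 2 for k in range(-4, 5)]    # {-2.0, -1.5, ..., 2.0}

J = value_functions(f, g, T=3, U_grid=U_grid)

x = 0.0
for s in range(4):
    value, u = J[s](x)
    print(s, x, u)      # controls 1.5, 1.0, 0.5, 0.0 along states 0, 1.5, 2.5, 3
    x = g(s, x, u)

print(J[0](0.0)[0])     # 7.5 = 15/2, the maximum value
```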