Linear6tate2ptimal&ontrol3roblem with a6tochastic6witching7ime
ALESSANDRA BURATTO, LUCA GROSSET
Dipartimento di Matematica “Tullio Levi-Civita”
Università degli Studi di Padova
Via Trieste, 63 - 35121 Padova
ITALY
Abstract: In this paper, we analyse an optimal control problem over a finite horizon with a stochastic switching
time, assuming that the two optimal control problems present in its two stages have a particularly simple form
called linear state. It is well known that linear state optimal control problems can be solved easily using the HJB
equation approach and assuming that the value function is linear in the state. Unfortunately, this simplicity of
solution does not extend to the problem with stochastic switching time. We prove that a necessary and sufficient
condition for the problem to maintain a linear state structure is to assume that the hazard rate of the switching
time depends only on the temporal variable. Finally, assuming that the hazard rate is constant, we completely
characterise the solution of the obtained linear state optimal control problem.
Key-Words: Optimal control, Regime shifts, Stochastic switching time, Linear state structure
Received: March 19, 2024. Revised: August 21, 2024. Accepted: September 13, 2024. Published: October 9, 2024.
1 Introduction
An optimal control problem with stochastic switching
time consists of a dynamic optimisation problem
divided into two stages by an event that occurs at
a random time. The switch may have one or more
simultaneous effects on the system, such as a change
in the running payoff, the salvage value function,
the state dynamics, and possibly the control set.
Problems of this type, which are essentially
a special case of piecewise deterministic models [1],
can be applied in many contexts, such as rational
risk [2], renewable resources [3], and open source
software [4].
In the literature on dynamic optimisation, this
class of problems has been extensively studied, and
two main methodologies are applied to solve
them: the backward approach and the heterogeneous
approach. For more details on these techniques, see
[5], [6], [7].
In this paper, we are interested in studying the
structure of the two subproblems that define the two
stages of an optimal control problem with a stochastic
switching time. More precisely, we seek to verify
whether the general optimal control problem, which
takes into account the two stages generated from the
switch, turns out to be linear state whenever the two
subproblems are linear state in turn [1].
The search for formulations of optimal control
problems that are particularly simple to solve is
very important in applications, as it allows bypassing
analytical complexity and directly obtaining explicit
solutions that can be tested and evaluated. The most
important example is related to the formulation of
LQ optimal control problems, both deterministic and
stochastic [1]. However, this research is still ongoing;
for example, with respect to Markov chains, a recent
work by Lefebvre can be found in [8].
As mentioned previously, in an optimal control
problem with a stochastic switching time, it is
assumed that there exists an event that occurs at
random time τ, which abruptly changes the nature of
the system and splits the time horizon into two stages:
a Stage 1 before the occurrence of τ, and a Stage 2
afterwards.
To pursue our goal, in Section 2 we introduce
the linear state structure, drawing from the literature
on differential game theory. We then formulate an
optimal control problem with a stochastic switching
time where both stages have a linear state structure.
Furthermore, in Section 3 we reformulate the problem
as a deterministic optimal control problem equivalent
to the original stochastic one and solve it by adopting
the backward approach. We observe that, even though
the structure of the original problem is quite simple
- specifically, linear in the state and quadratic in the
control - its deterministic reformulation is not, in turn,
linear state. Finally, in Section 4 we determine the
necessary and sufficient conditions to guarantee the
linear state structure for the transformed problem.
2 Linear6tate6tructure
The class of linear state optimal control problems has
been introduced in the context of differential game
theory. Originally defined as state-separable games
WSEAS TRANSACTIONS on MATHEMATICS
DOI: 10.37394/23206.2024.23.64
Alessandra Buratto, Luca Grosset
E-ISSN: 2224-2880
608
Volume 23, 2024
by [9], they were later termed linear
state games in [1].
This class consists of games for which the
state equations and the objective functions are
linear in the state variables and no (multiplicative)
interaction between control variables of one player
and state variables of the opponent is present. State-
separability is important in game theory because
these games have the property that their open-loop
Nash equilibria are Markov perfect and therefore are
subgame perfect. In addition, these problems are
“tractable” and easy to solve, even in the case of
hierarchical moves.
In our paper, we adapt the linear state definition
introduced in the game theory environment to the
optimal control problems of the two stages of an
optimal control problem with a stochastic switching
time. To comprehend where the non-linearity comes
from, for simplicity, we consider a specific and simple
linear state structure for the two subproblems. We
assume that all the functions involved are autonomous
with respect to time and switching time; furthermore,
we assume that they are linear in the states, while
the payoffs are quadratic in the controls. Finally,
we assume that there are no jumps in the state at
the switching time and that switching costs are not
considered. For a more general model the reader can
refer to [7].
In the7able, we report the functions
that characterise the problem and their notation. All
parameters take real values, and we further assume
κ1, κ2>0to guarantee the second-order optimal
sufficient conditions. With such a notation, and taking
Table 1Functions and notations
Stage 1 Stage 2
Dynamics α1x1+γ1u1α2x2+γ2u2
Payoff π1x1κ1u2
1/2 π2x2κ2u2
2/2
Salvage σ1x1σ2x2
Control sets U1=RU2=R
into account the stochasticity of τ, the switching time
optimal control problem is equivalent to maximising
the expectation of the following total payoff:
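\[
\begin{aligned}
\max_{\substack{u_1(t)\in U_1 \\ u_2(s,t)\in U_2}} \;
\mathbb{E}\Bigg[\, & \mathbb{1}_{\{\tau < T\}} \bigg\{ \int_0^{\tau} \Big( \pi_1 x_1(t) - \frac{\kappa_1 u_1^2(t)}{2} \Big)\,dt
+ \int_{\tau}^{T} \Big( \pi_2 x_2(\tau,t) - \frac{\kappa_2 u_2^2(\tau,t)}{2} \Big)\,dt
+ \sigma_2 x_2(\tau,T) \bigg\} \\
& + \mathbb{1}_{\{\tau \ge T\}} \bigg\{ \int_0^{T} \Big( \pi_1 x_1(t) - \frac{\kappa_1 u_1^2(t)}{2} \Big)\,dt
+ \sigma_1 x_1(T) \bigg\} \Bigg]
\end{aligned}
\tag{1}
\]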
subject to:
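\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t), & t \in [0,T] \\
x_1(0) = x_0 \\
\dot{x}_2(\tau,t) = \alpha_2 x_2(\tau,t) + \gamma_2 u_2(\tau,t), & t \in [\tau,T] \\
x_2(\tau,\tau) = x_1(\tau)
\end{cases}
\tag{2}
\]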
Observe that the last equation depends on the
continuity assumption for the state trajectory at τ.
The stochastic switching time τ can be modelled
as an absolutely continuous random variable taking
values in [0, +∞). In line with most of the related
literature [10], we introduce the so-called hazard rate
of τ as follows:
\[
\lim_{h \to 0^+} \frac{P(\tau \le t+h \mid \tau > t)}{h} = \eta(t, x_1(t))
\tag{3}
\]
and assume that it depends on time and on the state
of the system.
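As a quick numerical illustration of definition (3), the following minimal sketch assumes the simplest case of a constant hazard rate, so that τ is exponentially distributed and P(τ > t) = exp(−ηt). The function names and the value of eta are illustrative assumptions, not part of the model above. The difference quotient should approach η as h shrinks, independently of t.

```python
import math

def survival_constant_hazard(eta, t):
    """P(tau > t) when the hazard rate is the constant eta (assumption)."""
    return math.exp(-eta * t)

def hazard_quotient(eta, t, h):
    """Difference quotient P(tau <= t + h | tau > t) / h from definition (3)."""
    s_t = survival_constant_hazard(eta, t)
    s_th = survival_constant_hazard(eta, t + h)
    # P(t < tau <= t + h) / P(tau > t), divided by h
    return (s_t - s_th) / (s_t * h)

eta = 0.5  # illustrative value, not taken from the paper
for h in (1e-1, 1e-2, 1e-3):
    print(h, hazard_quotient(eta, t=2.0, h=h))
```

With a state-dependent hazard rate η(t, x1(t)) no such closed form for P(τ > t) is available in general, since the survival probability then depends on the whole state trajectory.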
3 Problem5eformulation
Analogously to what has been done in [11], and in
[7], we introduce an auxiliary Stage 1 state variable
z(t) := P(τ > t), that is, the probability of still being
in Stage 1 at time t, to reformulate the problem (1) in
the following deterministic form.
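\[
\begin{aligned}
\max_{\substack{u_1(t)\in U_1 \\ u_2(s,t)\in U_2}} \;
& \int_0^T z(t) \bigg\{ \pi_1 x_1(t) - \frac{\kappa_1 u_1^2(t)}{2}
+ \eta(t,x_1(t)) \bigg[ \sigma_2 x_2(t,T)
+ \int_t^T \Big( \pi_2 x_2(t,\theta) - \frac{\kappa_2 u_2^2(t,\theta)}{2} \Big)\,d\theta \bigg] \bigg\}\,dt \\
& + z(T)\,\sigma_1 x_1(T)
\end{aligned}
\tag{4}
\]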
subject to:
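\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t), & t \in [0,T] \\
x_1(0) = x_0 \\
\dot{x}_2(s,t) = \alpha_2 x_2(s,t) + \gamma_2 u_2(s,t), & t \in [s,T] \\
x_2(s,s) = x_1(s) \\
\dot{z}(t) = -\eta(t,x_1(t))\,z(t), & t \in [0,T] \\
z(0) = 1
\end{cases}
\tag{5}
\]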
We can observe that this resulting problem is, in
general, not trivial, even if we assumed a very simple
form in the two stages.
The aim of this paper is to identify the
assumptions under which this transformed deterministic
optimal control problem becomes more tractable.
For example, a linear state
structure would make the problem easier to solve
and more consistent if included in a differential game
context [1].
Observe that in the dynamics (5) the last ODE
(the one for z) contains a multiplicative term between
the hazard rate (depending, in general, on x1) and the
state variable z itself. Moreover, z(t) multiplies the
entire objective function within the integral.
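The coupling between x1 and z can be seen numerically. The sketch below integrates the Stage 1 equations of (5) with an explicit Euler scheme for a fixed control. The function name `simulate_stage1`, the affine hazard rate 0.3 + 0.1·x and all parameter values are hypothetical choices made only for illustration.

```python
import math

def simulate_stage1(eta, alpha1, gamma1, u1, x0, T, n=2000):
    """Explicit Euler scheme for
         x1'(t) = alpha1 * x1(t) + gamma1 * u1(t),  x1(0) = x0,
         z'(t)  = -eta(t, x1(t)) * z(t),            z(0)  = 1.
    Returns the pair (x1(T), z(T))."""
    dt = T / n
    x, z = x0, 1.0
    for k in range(n):
        t = k * dt
        dx = alpha1 * x + gamma1 * u1(t)
        dz = -eta(t, x) * z  # the product eta(t, x1) * z couples the two states
        x, z = x + dt * dx, z + dt * dz
    return x, z

control = lambda t: 0.5  # fixed control, illustrative only

# Hazard rate independent of the state: z decouples and z(T) is close to exp(-0.3 * T).
print(simulate_stage1(lambda t, x: 0.3, 0.1, 1.0, control, 1.0, 2.0))
print(math.exp(-0.3 * 2.0))

# Hypothetical state-dependent hazard rate: z(T) now depends on the path of x1.
print(simulate_stage1(lambda t, x: 0.3 + 0.1 * x, 0.1, 1.0, control, 1.0, 2.0))
```

In the first call the survival probability can be computed without knowing x1, while in the second it cannot, which is the source of the non-linearity discussed in the text.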
To verify that the loss of the linear state structure
of the problem does not depend on the form of the two
subproblems, let us apply the well-known backward
approach to solve the optimal control problem with a
switching time characterised in Table 1.
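First, let us solve the Stage 2 problem with
dynamic programming. Let us define the value
function V2 of Stage 2:
\[
V_2(t,x) := \sup_{u_2(\theta)\in U_2} \int_t^T \Big( \pi_2 x_2(\theta) - \frac{\kappa_2 u_2^2(\theta)}{2} \Big)\,d\theta + \sigma_2 x_2(T)
\tag{6}
\]
subject to:
\[
\begin{cases}
\dot{x}_2(\theta) = \alpha_2 x_2(\theta) + \gamma_2 u_2(\theta) & \text{for } \theta \in [t,T] \\
x_2(t) = x
\end{cases}
\tag{7}
\]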
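If V2(t, x) is differentiable, then it is the solution of
the corresponding HJB system:
\[
\begin{cases}
-\partial_t V_2(t,x) = \max_{w\in U_2} \Big\{ \pi_2 x - \dfrac{\kappa_2 w^2}{2} + \partial_x V_2(t,x)\,(\alpha_2 x + \gamma_2 w) \Big\} \\
V_2(T,x) = \sigma_2 x
\end{cases}
\tag{8}
\]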
The optimal feedback strategy Φ2(t, x) that
maximises the RHS of the HJB equation (8) is
degenerate, and equal to
\[
\Phi_2(t,x) = \frac{\gamma_2}{\kappa_2}\,\partial_x V_2(t,x).
\tag{9}
\]
Due to the linear state structure of the Stage 2 problem,
the value function V2(t, x) is linear in the state.
Therefore, we assume that it has the following linear
form V2(t, x) = A2(t)x + B2(t). The two unknown
functions A2(t) and B2(t) must satisfy the following
system of decoupled ODEs
\[
\begin{cases}
\dot{A}_2(t) = -\alpha_2 A_2(t) - \pi_2 \\
A_2(T) = \sigma_2 \\
\dot{B}_2(t) = -\dfrac{\gamma_2^2\,(A_2(t))^2}{2\kappa_2} \\
B_2(T) = 0
\end{cases}
\tag{10}
\]
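that admits a unique solution. In particular, for all
t ∈ (0, T]
\[
A_2(t) = \frac{\pi_2\,\big(e^{\alpha_2 (T-t)} - 1\big)}{\alpha_2} + \sigma_2\, e^{\alpha_2 (T-t)},
\tag{11}
\]
\[
B_2(t) = \int_t^T \frac{\gamma_2^2}{2\kappa_2}\,(A_2(s))^2\,ds.
\tag{12}
\]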
Hence the degenerate optimal feedback control for
Stage 2 is
\[
\Phi_2(t,x) = \frac{\gamma_2}{\kappa_2}\,A_2(t).
\tag{13}
\]
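The Stage 2 formulas (11), (12) and (13) are easy to evaluate and to check numerically. The following is a minimal sketch under the assumption α2 ≠ 0 (formula (11) divides by α2); B2 is approximated with a composite trapezoid rule, and the helper `a2_residual` (a hypothetical name) measures by central differences how well (11) satisfies the first ODE in (10). All parameter values are illustrative.

```python
import math

def A2(t, T, alpha2, pi2, sigma2):
    """Closed form (11); requires alpha2 != 0."""
    e = math.exp(alpha2 * (T - t))
    return pi2 * (e - 1.0) / alpha2 + sigma2 * e

def B2(t, T, alpha2, pi2, sigma2, gamma2, kappa2, n=2000):
    """Formula (12), approximated with the composite trapezoid rule."""
    h = (T - t) / n
    f = lambda s: gamma2 ** 2 * A2(s, T, alpha2, pi2, sigma2) ** 2 / (2.0 * kappa2)
    total = 0.5 * (f(t) + f(T))
    for k in range(1, n):
        total += f(t + k * h)
    return h * total

def phi2(t, T, alpha2, pi2, sigma2, gamma2, kappa2):
    """Degenerate Stage 2 feedback (13): it does not depend on the state."""
    return gamma2 * A2(t, T, alpha2, pi2, sigma2) / kappa2

def a2_residual(t, T, alpha2, pi2, sigma2, h=1e-5):
    """Residual of A2' = -alpha2 * A2 - pi2, using a central difference for A2'."""
    dA = (A2(t + h, T, alpha2, pi2, sigma2) - A2(t - h, T, alpha2, pi2, sigma2)) / (2.0 * h)
    return dA + alpha2 * A2(t, T, alpha2, pi2, sigma2) + pi2

# Illustrative parameters (not from the paper).
T, alpha2, pi2, sigma2, gamma2, kappa2 = 1.0, 0.2, 1.0, 0.5, 1.0, 2.0
print(A2(0.0, T, alpha2, pi2, sigma2), B2(0.0, T, alpha2, pi2, sigma2, gamma2, kappa2))
print(phi2(0.0, T, alpha2, pi2, sigma2, gamma2, kappa2))
print(a2_residual(0.5, T, alpha2, pi2, sigma2))
```

The residual printed on the last line should be close to zero up to discretisation and rounding error, and the terminal conditions A2(T) = σ2 and B2(T) = 0 hold by construction.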
Assuming optimal behaviour in Stage 2, and the
continuity of the state function, we obtain the
following objective for Stage 1:
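\[
\max_{u_1(t)\in U_1} \int_0^T z(t) \Big\{ \pi_1 x_1(t) - \frac{\kappa_1 u_1^2(t)}{2}
+ \eta\big(t,x_1(t)\big) \big[ A_2(t)\,x_1(t) + B_2(t) \big] \Big\}\,dt
+ z(T)\,\sigma_1 x_1(T)
\tag{14}
\]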
subject to the following differential equations in the
variables x1(t) and z(t), for all t ∈ [0, T]:
\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t) \\
x_1(0) = x_0 \\
\dot{z}(t) = -\eta\big(t,x_1(t)\big)\,z(t) \\
z(0) = 1
\end{cases}
\tag{15}
\]
We can observe that the problem is not linear
state due to the presence in the objective functional
of the auxiliary variable z(t) as a multiplicative
term. Moreover, the two state equations are coupled
because of the presence of the hazard rate η, which
depends on x1(t), in the ODE for z(t).
From this very simple example, we can guess
that the loss of the linear state structure comes
from the switching-time characteristic of the original
problem. In the following section, we will determine
a necessary and sufficient condition on the hazard
rate to guarantee the linear state structure for the
deterministic formulation of the optimal control
problem with a random switching time.
4 Analysis of the6tructure
We started with an optimal control problem with
a stochastic switching time in which both optimal
control problems comprising Stage 1 and Stage 2
have a particularly simple linear state structure. By
applying the backward approach and adopting the
Hamilton-Jacobi-Bellman equation to solve the
Stage 2 problem, we obtained the optimal control
problem characterised by the objective functional
(14) subject to the state equations (15), which is
clearly not linear state. We now ask
under which assumptions the obtained
problem retains the same structure as the two original
subproblems.
In the next theorem, we provide necessary and
sufficient conditions under which (14) and (15)
constitute a linear state optimal control problem.
Theorem 1. The optimal control problem
characterised by the objective functional (14)
and the dynamics (15) is linear state if and only if the
hazard rate of the stochastic switching time does not
depend on the state function of Stage 1, i.e.
\[
\partial_{x_1} \eta(t, x_1) = 0.
\]
Proof. Let us first observe that the system of state
equations in (15) is linear in the state variables if
and only if the function η does not depend on the
state variable x1. Moreover, if we assume that
η(t, x) = η(t), then the differential equation for
the state variable z is decoupled from the differential
equation for the state variable x1. The solution of the
ODE for the state variable z allows us to rewrite the
optimal control problem (14) and (15) as follows:
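\[
\begin{aligned}
\max_{u_1(t)\in U_1} \;
& \int_0^T e^{-\int_0^t \eta(r)\,dr} \Big\{ \pi_1 x_1(t) - \frac{\kappa_1 u_1^2(t)}{2}
+ \eta(t) \big[ A_2(t)\,x_1(t) + B_2(t) \big] \Big\}\,dt \\
& + e^{-\int_0^T \eta(r)\,dr}\,\sigma_1 x_1(T)
\end{aligned}
\tag{16}
\]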
subject to:
\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t) \\
x_1(0) = x_0
\end{cases}
\tag{17}
\]
This immediately shows that the problem has the
required linear state structure.
It is worth observing that in (16) the function
z(t) = exp(−∫_0^t η(r) dr) plays the role of a discount factor,
where the hazard rate represents a variable discount
rate. As recalled in [12], in the case of non-constant
discounting, the use of standard optimal control
techniques gives rise to time-inconsistent solutions.
This interpretation suggests applying the Hamilton-
Jacobi-Bellman approach to solve problem (16)
and (17).
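Under the further assumption of a constant hazard
rate, i.e. η(t) ≡ η > 0, the analytical solution of the
problem can be easily obtained by defining the value
function for the discounted problem
\[
\begin{aligned}
v_1(t,x) := \sup_{u_1(r)\in U_1} \;
& \int_t^T e^{-\eta (r-t)} \Big\{ \pi_1 x_1(r) - \frac{\kappa_1 u_1^2(r)}{2}
+ \eta \big[ A_2(r)\,x_1(r) + B_2(r) \big] \Big\}\,dr \\
& + e^{-\eta (T-t)}\,\sigma_1 x_1(T)
\end{aligned}
\tag{18}
\]
subject to:
\[
\begin{cases}
\dot{x}_1(r) = \alpha_1 x_1(r) + \gamma_1 u_1(r) \\
x_1(t) = x
\end{cases}
\tag{19}
\]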
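If v1(t, x) is differentiable, then, as shown in [1], it is
the solution of the HJB system:
\[
\begin{cases}
\eta\, v_1(t,x) - \partial_t v_1(t,x) = \max_{w\in U_1} \Big\{ \big(\pi_1 + \eta A_2(t)\big)\,x + \eta B_2(t) - \dfrac{\kappa_1 w^2}{2}
+ \partial_x v_1(t,x)\,(\alpha_1 x + \gamma_1 w) \Big\} \\
v_1(T,x) = \sigma_1 x
\end{cases}
\tag{20}
\]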
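If we assume that v1(t, x) = a1(t)x + b1(t), then the
two unknown functions a1(t) and b1(t) must satisfy
the following system of ODEs:
\[
\begin{cases}
\dot{a}_1(t) = (\eta - \alpha_1)\,a_1(t) - \pi_1 - \eta A_2(t) \\
a_1(T) = \sigma_1 \\
\dot{b}_1(t) = \eta\, b_1(t) - \eta B_2(t) - \dfrac{\gamma_1^2\, a_1^2(t)}{2\kappa_1} \\
b_1(T) = 0
\end{cases}
\tag{21}
\]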
This is a system of linear differential equations
that can be solved in cascade: the equation for a1(t)
does not involve b1(t), and both depend on the solution
A2(t), B2(t) of (10) already obtained for Stage 2 in the
previous section. The linearity of the system and the
regularity of the coefficients (which are all continuous
functions) ensure the existence and uniqueness of the
solution to this system. The optimal control for the
original problem is as follows:
\[
u_1^*(t) = \frac{\gamma_1\, a_1(t)}{\kappa_1},
\]
while
\[
u_2^*(s,t) = \frac{\gamma_2\, A_2(t)}{\kappa_2}.
\]
We notice that the hazard rate affects only the optimal
control of Stage 1. Moreover, the particularly simple
form of the original problem makes the optimal
control of Stage 2 independent from the instant at
which the switch occurs.
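The Stage 1 control only requires a1(t), so the first equation of (21) can be integrated backward from a1(T) = σ1 on its own. The sketch below does this with a classical fourth-order Runge-Kutta scheme, which is a choice made here for illustration and not a method prescribed above. The names `linear_costate`, `solve_a1` and `stage1_control` and all parameter values are hypothetical, and α1, α2 ≠ 0 is assumed so that the closed form (11) can be used.

```python
import math

def linear_costate(t, T, alpha, pi, sigma):
    """Closed form (11) written for generic parameters; requires alpha != 0."""
    e = math.exp(alpha * (T - t))
    return pi * (e - 1.0) / alpha + sigma * e

def solve_a1(T, eta, alpha1, pi1, sigma1, alpha2, pi2, sigma2, n=1000):
    """Integrate a1' = (eta - alpha1) * a1 - pi1 - eta * A2(t) backward from
    a1(T) = sigma1 with RK4. Returns a list of (t, a1(t)) ordered from 0 to T."""
    def rhs(t, a):
        return (eta - alpha1) * a - pi1 - eta * linear_costate(t, T, alpha2, pi2, sigma2)

    h = -T / n  # negative step: we march from T down to 0
    a = sigma1
    path = [(T, a)]
    for i in range(n):
        t = T + i * h
        k1 = rhs(t, a)
        k2 = rhs(t + h / 2, a + h * k1 / 2)
        k3 = rhs(t + h / 2, a + h * k2 / 2)
        k4 = rhs(t + h, a + h * k3)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        path.append((T + (i + 1) * h, a))
    path.reverse()
    return path

def stage1_control(a1_value, gamma1, kappa1):
    """Optimal Stage 1 control u1*(t) = gamma1 * a1(t) / kappa1."""
    return gamma1 * a1_value / kappa1

# Illustrative parameters (not from the paper).
path = solve_a1(T=1.0, eta=0.7, alpha1=0.3, pi1=2.0, sigma1=0.1,
                alpha2=0.2, pi2=1.0, sigma2=0.5)
t0, a1_at_0 = path[0]
print(a1_at_0, stage1_control(a1_at_0, gamma1=1.0, kappa1=2.0))
```

Two sanity checks follow from (21). If η = 0, a1 reduces to the closed form (11) with the Stage 1 parameters. If the two stages share the same α, π and σ, then a1 coincides with A2 for every η, so a switch that changes nothing in these parameters leaves the Stage 1 control unaffected by the hazard rate.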
5 Conclusion
In this paper, we analyse an optimal control problem
with a stochastic switching time. To simplify the
solution process and focus on identifying the real
reason for the loss of the linear state structure, we
assumed that both problems in Stage 1 and Stage 2
are linear state. We noticed that despite these stringent
assumptions, the initial problem fails to maintain the
linear state property. The structure is preserved only
if the hazard rate function does not depend on the state
of the system. This study highlights the difficulty
of deriving an analytical solution for optimal control
problems with stochastic switching time, even with
very basic problem data assumptions.
A future line of research suggested by this work
is the connection, which can be seen explicitly
in formula (16), between optimal control problems
with stochastic switching time and optimal control
problems with heterogeneous discounting factors,
[12]. Further studies are needed to better clarify the
connection between these two classes of problems.
References:
[1] E. Dockner, S. Jørgensen, N. Van Long, G.
Sorger, Differential Games in Economics and
Management Science, Cambridge University
Press, Cambridge, 2000.
[2] M. Kuhn, S. Wrzaczek, Rationally Risking:
A Two-Stage Approach. In (Eds.) J.L.
Haunschmied, R.M. Kovacevic, W. Semmler,
V.M. Veliov, Dynamic Modeling and
Econometrics in Economics and Finance,
Springer, Cham, 2021, pp. 85–110.
[3] S. Polasky, A. de Zeeuw, F. Wagener, Optimal
Management with Potential Regime Shifts,
Journal of Environmental Economics and
Management, Vol. 62, 2011, pp. 229–240.
[4] A. Seidl, S. Wrzaczek, Opening the Source Code:
The Threat of Forking, Journal of Dynamics and
Games, Vol. 10, 2023, pp. 121-150.
[5] D. Grass, J.P. Caulkins, G. Feichtinger, G.
Tragler, D.A. Behrens, Optimal Control of
Nonlinear Processes, with Applications in Drugs,
Corruption, and Terror, Springer, Berlin, 2008.
[6] S. Wrzaczek, M. Kuhn, I. Frankovic, Using
Age Structure for a Multi-stage Optimal Control
Model with Random Switching Time, Journal of
Optimization Theory and Applications, Vol. 184,
2022, pp. 1065–1082.
[7] A. Buratto, L. Grosset, M. Muttoni, Two
Different Solution Techniques for an Optimal
Control Problem with a Stochastic Switching
Time, WSEAS Transactions on Mathematics,
Vol.22, 2023, pp. 730-735.
[8] M. Lefebvre, An Explicit Solution to a Discrete-
time Stochastic Optimal Control Problem,
WSEAS Transactions on Systems, Vol. 22, 2023,
pp. 368-371.
[9] E. Dockner, G. Feichtinger, S. Jørgensen,
Tractable classes of nonzero-sum open-loop
Nash differential games: Theory and examples.
Journal of Optimization Theory and Applications,
Vol. 45, 1985, pp. 179–197.
[10] N. Van Long, Managing, Inducing, and
Preventing Regime Shifts: A Review of the
Literature. In (Eds.) J.L. Haunschmied, R.M.
Kovacevic, W. Semmler, V.M. Veliov, Dynamic
Modeling and Econometrics in Economics and
Finance, Springer, Cham, 2021, pp. 1–36.
[11] E.K. Boukas, A. Haurie, P. Michael, An optimal
control problem with a random stopping time,
Journal of Optimization Theory and Applications,
Vol. 64:3, 1990, pp. 471-480.
[12] J. Marín-Solano, C. Patxot, Heterogeneous
discounting in economic problems. Optimal
Control Application and Methods, Vol.33, 2012,
pp. 32-50.
Sources of Funding for Research Presented in
a Scientific Article or Scientific Article Itself
No funding was received for conducting this
study.
Conflict of Interest
The authors have no conflicts of interest to
declare.
Creative Commons Attribution License 4.0
(Attribution 4.0 International, CC BY 4.0)
This article is published under the terms of the
Creative Commons Attribution License 4.0
https://creativecommons.org/licenses/by/4.0/deed.en_US
Contribution of Individual Authors to the
Creation of a Scientific Article (Ghostwriting
Policy)
The theoretical framework was developed by
Alessandra Buratto and Luca Grosset. Alessandra
Buratto authored the Introduction, Section 2, and
Section 3, while Luca Grosset was responsible for
Section 4 and the Conclusion.