Linear6tate2ptimal&ontrol3roblem with a6tochastic6witching7ime

ALESSANDRA BURATTO, LUCA GROSSET

Dipartimento di Matematica “Tullio Levi-Civita”

Università degli Studi di Padova

Via Trieste, 63 - 35121 Padova

ITALY

Abstract: In this paper, we analyse an optimal control problem over a finite horizon with a stochastic switching

time, assuming that the two optimal control problems present in its two stages have a particularly simple form

called linear state. It is well known that linear state optimal control problems can be solved easily using the HJB

equation approach and assuming that the value function is linear in the state. Unfortunately, this simplicity of

solution does not extend to the problem with stochastic switching time. We prove that a necessary and sufficient

condition for the problem to maintain a linear state structure is to assume that the hazard rate of the switching

time depends only on the temporal variable. Finally, assuming that the hazard rate is constant, we completely

characterise the solution of the obtained linear state optimal control problem.

Key-Words: Optimal control, Regime shifts, Stochastic switching time, Linear state structure

Received: March 19, 2024. Revised: August 21, 2024. Accepted: September 13, 2024. Published: October 9, 2024.

1 Introduction

An optimal control problem with stochastic switching

time consists of a dynamic optimisation problem

divided into two stages by an event that occurs at

a random time. The switch may have one or more

simultaneous effects on the system, such as a change

in the running payoff, the salvage value function,

the state dynamics, and possibly the control set.

These types of problems, which essentially constitute

a specific case of piecewise deterministic models [1],

can be applied in many contexts, such as rational

risk-taking [2], renewable resources [3], and open-source

software [4].

In the literature on dynamic optimisation, this

class of problems has been extensively studied, and

two main methodologies are applied to solve

them: the backward approach and the heterogeneous

approach. For more details on these techniques, see [5],

[6], [7].

In this paper, we are interested in studying the

structure of the two subproblems that define the two

stages of an optimal control problem with a stochastic

switching time. More precisely, we seek to verify

whether the general optimal control problem, which

takes into account the two stages generated from the

switch, turns out to be linear state whenever the two

subproblems are linear state in turn [1].

The search for formulations of optimal control

problems that are particularly simple to solve is

very important in applications, as it allows bypassing

analytical complexity and directly obtaining explicit

solutions that can be tested and evaluated. The most

important example is related to the formulation of

LQ optimal control problems, both deterministic and

stochastic [1]. However, this research is still ongoing;

for example, with respect to Markov chains, a recent

work by Lefebvre can be found in [8].

As mentioned previously, in an optimal control

problem with a stochastic switching time, it is

assumed that there exists an event that occurs at

random time τ, which abruptly changes the nature of

the system and splits the time horizon into two stages:

a Stage 1 before the occurrence of τ, and a Stage 2

afterwards.

To pursue our goal, in Section 2 we introduce

the linear state structure, drawing from the literature

on differential game theory. We then formulate an

optimal control problem with a stochastic switching

time where both stages have a linear state structure.

Furthermore, in Section 3 we reformulate the problem

as a deterministic optimal control problem equivalent

to the original stochastic one and solve it by adopting

the backward approach. We observe that, even though

the structure of the original problem is quite simple

- specifically, linear in the state and quadratic in the

control - its deterministic reformulation is not, in turn,

linear state. Finally, in Section 4 we determine the

necessary and sufficient conditions to guarantee the

linear state structure for the transformed problem.

2 Linear6tate6tructure

The class of linear state optimal control problems has

been introduced in the context of differential game

theory. Originally defined as state-separable games

WSEAS TRANSACTIONS on MATHEMATICS

DOI: 10.37394/23206.2024.23.64

Alessandra Buratto, Luca Grosset

E-ISSN: 2224-2880

608

Volume 23, 2024

by [9], they were later referred to as linear

state games in [1].

This class consists of games for which the

state equations and the objective functions are

linear in the state variables and no (multiplicative)

interaction between control variables of one player

and state variables of the opponent is present. State-

separability is important in game theory because

these games have the property that their open-loop

Nash equilibria are Markov perfect and therefore are

subgame perfect. In addition, these problems are

“tractable” and easy to solve, even in the case of

hierarchical moves.

In our paper, we adapt the linear state definition

introduced in the game theory environment to the

optimal control problems of the two stages of an

optimal control problem with a stochastic switching

time. To comprehend where the non-linearity comes

from, for simplicity, we consider a specific and simple

linear state structure for the two subproblems. We

assume that all the functions involved are autonomous

with respect to time and switching time; furthermore,

we assume that they are linear in the states, while

the payoffs are quadratic in the controls. Finally,

we assume that there are no jumps in the state at

the switching time and that switching costs are not

considered. For a more general model the reader can

refer to [7].

In the7able, we report the functions

that characterise the problem and their notation. All

parameters take real values, and we further assume

κ1, κ2>0to guarantee the second-order optimal

sufficient conditions. With such a notation, and taking

Table 1Functions and notations

Stage 1 Stage 2

Dynamics α1x1+γ1u1α2x2+γ2u2

Payoff π1x1−κ1u2

1/2 π2x2−κ2u2

2/2

Salvage σ1x1σ2x2

Control sets U1=RU2=R

into account the stochasticity of τ, the switching time

optimal control problem is equivalent to maximising

the expectation of the following total payoff:

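\[
\begin{aligned}
\max_{\substack{u_1(t) \in U_1 \\ u_2(s,t) \in U_2}} \;
\mathbb{E}\Bigg[\, & \mathbb{1}_{\{\tau < T\}} \bigg\{ \int_0^{\tau} \Big( \pi_1 x_1(t) - \kappa_1 u_1^2(t)/2 \Big)\, dt \\
& \quad + \int_{\tau}^{T} \Big( \pi_2 x_2(\tau, t) - \kappa_2 u_2^2(\tau, t)/2 \Big)\, dt
+ \sigma_2 x_2(\tau, T) \bigg\} \\
& + \mathbb{1}_{\{\tau \ge T\}} \bigg\{ \int_0^{T} \Big( \pi_1 x_1(t) - \kappa_1 u_1^2(t)/2 \Big)\, dt
+ \sigma_1 x_1(T) \bigg\} \Bigg]
\end{aligned}
\tag{1}
\]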

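subject to:

\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t), & t \in [0, T] \\
x_1(0) = x_0 \\
\dot{x}_2(\tau, t) = \alpha_2 x_2(\tau, t) + \gamma_2 u_2(\tau, t), & t \in [\tau, T] \\
x_2(\tau, \tau) = x_1(\tau)
\end{cases}
\tag{2}
\]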

Observe that the last equation depends on the

continuity assumption for the state trajectory at τ.

The stochastic switching time $\tau$ can be modelled as an absolutely continuous random variable taking values in $[0, +\infty)$. In line with most of the related literature [10], we introduce the so-called hazard rate of $\tau$ as follows:

\[
\lim_{h \to 0^+} \frac{P(\tau \le t + h \mid \tau > t)}{h} = \eta(t, x_1(t))
\tag{3}
\]

and assume that it depends on time and on the state of the system.
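As a quick numerical illustration of the hazard rate definition (3), the sketch below assumes a constant hazard rate, which is a simplification made only for this example (in the text the hazard rate may depend on time and on the state). Under that assumption the switching time is exponentially distributed and the survival probability P(τ > t) equals exp(−ηt). The function names and the parameter values are hypothetical.

```python
import math
import random


def sample_switching_time(eta, rng):
    """Draw tau with constant hazard rate eta via inverse-transform sampling."""
    # For a constant hazard, P(tau > t) = exp(-eta * t), hence
    # tau = -ln(1 - U) / eta with U uniform on [0, 1).
    return -math.log(1.0 - rng.random()) / eta


def empirical_survival(eta, t, n_samples=100_000, seed=0):
    """Estimate P(tau > t) by Monte Carlo (illustrative settings only)."""
    rng = random.Random(seed)
    hits = sum(sample_switching_time(eta, rng) > t for _ in range(n_samples))
    return hits / n_samples


if __name__ == "__main__":
    eta, t = 0.5, 2.0  # hypothetical values
    print("Monte Carlo estimate:", empirical_survival(eta, t))
    print("exp(-eta * t):       ", math.exp(-eta * t))
```

The two printed numbers should agree up to Monte Carlo error; a state-dependent hazard would require simulating the Stage 1 trajectory as well.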

3 Problem5eformulation

Analogously to what has been done in [11], and in

[7], we introduce an auxiliary Stage 1 state variable

z(t) := P(τ > t), that is, the probability of still being

in Stage 1 at time t, to reformulate the problem (1) in

the following deterministic form.

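\[
\begin{aligned}
\max_{\substack{u_1(t) \in U_1 \\ u_2(s,t) \in U_2}} \;
& \int_0^{T} z(t) \bigg\{ \pi_1 x_1(t) - \kappa_1 u_1^2(t)/2 \\
& \quad + \eta(t, x_1(t)) \Big[ \sigma_2 x_2(t, T)
+ \int_t^{T} \Big( \pi_2 x_2(t, \theta) - \kappa_2 u_2^2(t, \theta)/2 \Big)\, d\theta \Big] \bigg\}\, dt \\
& + z(T)\, \sigma_1 x_1(T)
\end{aligned}
\tag{4}
\]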


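subject to:

\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t), & t \in [0, T] \\
x_1(0) = x_0 \\
\dot{x}_2(s, t) = \alpha_2 x_2(s, t) + \gamma_2 u_2(s, t), & t \in [s, T] \\
x_2(s, s) = x_1(s) \\
\dot{z}(t) = -\eta(t, x_1(t))\, z(t), & t \in [0, T] \\
z(0) = 1
\end{cases}
\tag{5}
\]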

We can observe that this resulting problem is, in

general, not trivial, even if we assumed a very simple

form in the two stages.
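The dynamics of the auxiliary state z in (5) follow from the hazard rate definition (3). A short derivation, sketched here under the assumption that the limit in (3) exists along the Stage 1 trajectory, reads:

```latex
\begin{aligned}
z(t+h) &= P(\tau > t+h)
        = P(\tau > t)\,\bigl(1 - P(\tau \le t+h \mid \tau > t)\bigr), \\
\frac{z(t+h) - z(t)}{h}
       &= -\,z(t)\,\frac{P(\tau \le t+h \mid \tau > t)}{h}
        \;\xrightarrow[h \to 0^+]{}\; -\,\eta(t, x_1(t))\,z(t), \\
z(0)   &= P(\tau > 0) = 1 .
\end{aligned}
```

In the same way, the density of the switching time along the Stage 1 trajectory is η(t, x1(t)) z(t), which is the weight multiplying the Stage 2 payoff in (4).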

The aim of this paper is to identify the assumptions

under which such a transformed deterministic optimal

control problem becomes more tractable to

solve. For example, a linear state

structure would make the problem easier to solve

and more consistent if included in a differential game

context [1].

Observe that in the dynamics (5) the last ODE

(the one for z) contains a multiplicative term between

the hazard rate (depending, in general, on x1) and the

state variable z itself. Moreover, z(t) multiplies the

entire integrand of the objective functional.

To verify that the loss of the linear state structure

of the problem does not depend on the form of the two

subproblems, let us apply the well-known backward

approach to solve the optimal control problem with a

switching time characterised in Table 1.

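First, let us solve the Stage 2 problem with dynamic programming. Let us define the value function $V_2$ of the second stage:

\[
V_2(t, x) := \sup_{u_2(\theta) \in U_2} \int_t^{T} \Big( \pi_2 x_2(\theta) - \kappa_2 u_2^2(\theta)/2 \Big)\, d\theta + \sigma_2 x_2(T)
\tag{6}
\]

subject to:

\[
\begin{cases}
\dot{x}_2(\theta) = \alpha_2 x_2(\theta) + \gamma_2 u_2(\theta), & \theta \in [t, T] \\
x_2(t) = x
\end{cases}
\tag{7}
\]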

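If $V_2(t, x)$ is differentiable, then it solves the corresponding HJB equation with its terminal condition:

\[
\begin{cases}
-\partial_t V_2(t, x) = \max_{w \in U_2} \Big\{ \pi_2 x - \kappa_2 w^2/2 + \partial_x V_2(t, x) \cdot (\alpha_2 x + \gamma_2 w) \Big\} \\
V_2(T, x) = \sigma_2 x
\end{cases}
\tag{8}
\]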

The optimal feedback strategy $\Phi_2(t, x)$ that maximises the RHS of the HJB equation (8) is degenerate, and equal to

\[
\Phi_2(t, x) = \frac{\gamma_2}{\kappa_2}\, \partial_x V_2(t, x).
\tag{9}
\]
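For completeness, the maximisation on the right-hand side of the Stage 2 HJB equation can be worked out explicitly. The sketch below uses only the assumptions κ2 > 0 and U2 = R stated in Section 2:

```latex
\begin{aligned}
\frac{\partial}{\partial w}\Bigl[\pi_2 x - \tfrac{\kappa_2}{2} w^2
  + \partial_x V_2(t,x)\,(\alpha_2 x + \gamma_2 w)\Bigr]
  &= -\kappa_2 w + \gamma_2\,\partial_x V_2(t,x) = 0
  \;\Longrightarrow\;
  w = \frac{\gamma_2}{\kappa_2}\,\partial_x V_2(t,x), \\
\frac{\partial^2}{\partial w^2}\Bigl[\,\cdot\,\Bigr] &= -\kappa_2 < 0, \\
\max_{w \in \mathbb{R}} \Bigl[-\tfrac{\kappa_2}{2} w^2
  + \gamma_2\,\partial_x V_2(t,x)\, w\Bigr]
  &= \frac{\gamma_2^2}{2\kappa_2}\bigl(\partial_x V_2(t,x)\bigr)^2 .
\end{aligned}
```

The strict concavity in w is what makes the stationary point the unique maximiser; the maximised value is the quadratic term that later drives the equation for the state-independent part of the value function.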

Due to the linear state structure of the Stage 2 problem, the value function $V_2(t, x)$ is linear in the state. Therefore, we assume that it has the linear form $V_2(t, x) = A_2(t) x + B_2(t)$. The two unknown functions $A_2(t)$ and $B_2(t)$ must satisfy the following system of decoupled ODEs:

\[
\begin{cases}
\dot{A}_2(t) = -\alpha_2 A_2(t) - \pi_2 \\
A_2(T) = \sigma_2 \\
\dot{B}_2(t) = -\gamma_2^2 \big(A_2(t)\big)^2 / (2\kappa_2) \\
B_2(T) = 0
\end{cases}
\tag{10}
\]

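This system admits a unique solution. In particular, for all $t \in [0, T]$,

\[
A_2(t) = \frac{\pi_2 \big( e^{\alpha_2 (T - t)} - 1 \big)}{\alpha_2} + \sigma_2 e^{\alpha_2 (T - t)},
\tag{11}
\]

\[
B_2(t) = \int_t^{T} \frac{\gamma_2^2}{2\kappa_2} \big(A_2(s)\big)^2\, ds.
\tag{12}
\]

Hence, the degenerate optimal feedback control for Stage 2 is

\[
\Phi_2(t, x) = \frac{\gamma_2}{\kappa_2}\, A_2(t).
\tag{13}
\]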
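The Stage 2 solution can be checked numerically. The following minimal sketch evaluates the closed form (11), approximates the integral (12) with Simpson's rule, and compares a finite-difference derivative of A2 with the right-hand side of (10). It assumes α2 ≠ 0 (needed for (11) as written); the function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def A2(t, T, alpha2, pi2, sigma2):
    """Closed-form coefficient (11); requires alpha2 != 0."""
    growth = math.exp(alpha2 * (T - t))
    return pi2 * (growth - 1.0) / alpha2 + sigma2 * growth


def B2(t, T, alpha2, pi2, sigma2, gamma2, kappa2, n=1000):
    """Integral (12), approximated with composite Simpson's rule (n even)."""
    if t >= T:
        return 0.0

    def integrand(s):
        return gamma2 ** 2 / (2.0 * kappa2) * A2(s, T, alpha2, pi2, sigma2) ** 2

    h = (T - t) / n
    total = integrand(t) + integrand(T)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(t + i * h)
    return total * h / 3.0


def stage2_control(t, T, alpha2, pi2, sigma2, gamma2, kappa2):
    """Degenerate feedback (13): it does not depend on the state x."""
    return gamma2 / kappa2 * A2(t, T, alpha2, pi2, sigma2)


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    p = dict(T=1.0, alpha2=-0.2, pi2=1.0, sigma2=0.5)
    t, h = 0.3, 1e-6
    # Central finite difference of A2 versus the right-hand side of (10).
    dA = (A2(t + h, **p) - A2(t - h, **p)) / (2 * h)
    print("dA2/dt (numerical):", dA)
    print("-alpha2*A2 - pi2  :", -p["alpha2"] * A2(t, **p) - p["pi2"])
    print("B2(t)             :", B2(t, gamma2=1.0, kappa2=2.0, **p))
    print("Phi2(t)           :", stage2_control(t, gamma2=1.0, kappa2=2.0, **p))
```

The first two printed values should coincide up to finite-difference error, which is a quick sanity check on the signs in (10) and (11).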

Assuming optimal behaviour in Stage 2, and the

continuity of the state function, we obtain the

following objective for Stage 1:

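\[
\begin{aligned}
\max_{u_1(t) \in U_1} \;
& \int_0^{T} z(t) \Big\{ \pi_1 x_1(t) - \kappa_1 u_1^2(t)/2
+ \eta(t, x_1(t)) \big[ A_2(t)\, x_1(t) + B_2(t) \big] \Big\}\, dt \\
& + z(T)\, \sigma_1 x_1(T)
\end{aligned}
\tag{14}
\]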

subject to the following differential equations in the variables $x_1(t)$ and $z(t)$, for all $t \in [0, T]$:

\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t) \\
x_1(0) = x_0 \\
\dot{z}(t) = -\eta(t, x_1(t))\, z(t) \\
z(0) = 1
\end{cases}
\tag{15}
\]

We can observe that the problem is not linear

state due to the presence in the objective functional

of the auxiliary variable z(t) as a multiplicative

term. Moreover, the two state equations are coupled

because of the presence of the hazard rate η, which

depends on x1(t), in the ODE for z(t).
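The coupling between the two state equations can be made visible with a small simulation. The sketch below integrates the Stage 1 system with an explicit Euler scheme for a fixed constant control, which is an arbitrary choice and not the optimal one, and compares a state-dependent hazard rate with a purely time-dependent one. Both hazard functions, the function names, and all parameter values are hypothetical.

```python
def simulate_stage1(x0, eta, u1=0.0, alpha1=-0.1, gamma1=1.0, T=1.0, n=10_000):
    """Explicit Euler integration of the (x1, z) system for a constant control.

    eta is a callable eta(t, x); the function returns (x1(T), z(T)).
    """
    dt = T / n
    x, z = x0, 1.0
    for k in range(n):
        t = k * dt
        dx = alpha1 * x + gamma1 * u1
        dz = -eta(t, x) * z
        x, z = x + dt * dx, z + dt * dz
    return x, z


def state_dependent(t, x):
    # Hypothetical hazard rate that grows with the state.
    return 0.2 + 0.1 * x * x


def time_only(t, x):
    # Hypothetical hazard rate that depends on time only.
    return 0.2 + 0.1 * t


if __name__ == "__main__":
    for name, eta in [("eta(t, x)", state_dependent), ("eta(t)", time_only)]:
        z_low = simulate_stage1(1.0, eta)[1]
        z_high = simulate_stage1(2.0, eta)[1]
        print(f"{name}: z(T) with x0=1 -> {z_low:.4f}, with x0=2 -> {z_high:.4f}")
```

With the time-only hazard the survival probability z(T) is the same for both initial states, whereas with the state-dependent hazard it changes with the Stage 1 trajectory, which is the coupling discussed above.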

From this very simple example, we can infer

that the loss of the linear state structure comes

from the switching-time characteristic of the original

problem. In the following section, we will determine


a necessary and sufficient condition on the hazard

rate to guarantee the linear state structure for the

deterministic formulation of the optimal control

problem with a random switching time.

4 Analysis of the6tructure

We started with an optimal control problem with

a stochastic switching time in which both optimal

control problems comprising Stage 1 and Stage 2

have a particularly simple linear state structure. By

applying the backward approach and adopting the

Hamilton-Jacobi-Bellman equation to solve the

Stage 2 problem, we obtained the optimal control

problem characterised by the objective functional

(14) subject to the state equations (15), which is

clearly not linear state. Now, we ask

ourselves under which assumptions the obtained

problem retains the same structure as the two original

sub-problems.

In the next theorem, we provide necessary and

sufficient conditions under which (14) and (15)

constitute a linear state optimal control problem.

Theorem 1. The optimal control problem

characterised by the objective functional (14)

and the dynamics (15) is linear state if and only if the

hazard rate of the stochastic switching time does not

depend on the state function of Stage 1, i.e.

\[
\partial_{x_1} \eta(t, x_1) = 0.
\]

Proof. Let us first observe that the system of state

equations in (15) is linear in the state variables if

and only if the function η does not depend on the

state variable x1. Moreover, if we assume that

η(t, x) = η(t), then the differential equation for

the state variable z is decoupled from the differential

equation for the state variable x1. The solution of the

ODE for the state variable z allows us to rewrite the

optimal control problem (14) and (15) as follows:

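\[
\begin{aligned}
\max_{u_1(t) \in U_1} \;
& \int_0^{T} e^{-\int_0^{t} \eta(r)\, dr} \Big\{ \pi_1 x_1(t) - \kappa_1 u_1^2(t)/2
+ \eta(t) \big[ A_2(t)\, x_1(t) + B_2(t) \big] \Big\}\, dt \\
& + e^{-\int_0^{T} \eta(r)\, dr}\, \sigma_1 x_1(T)
\end{aligned}
\tag{16}
\]

subject to:

\[
\begin{cases}
\dot{x}_1(t) = \alpha_1 x_1(t) + \gamma_1 u_1(t) \\
x_1(0) = x_0
\end{cases}
\tag{17}
\]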

This immediately shows that the problem has the

required linear state structure.
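To make the structure explicit, the integrand of (16) can be regrouped by powers of the state. The following is only a rearrangement of (16), with z(t) used as shorthand for the exponential factor:

```latex
\begin{aligned}
& z(t)\Bigl\{\pi_1 x_1(t) - \tfrac{\kappa_1}{2}\, u_1^2(t)
   + \eta(t)\bigl[A_2(t)\,x_1(t) + B_2(t)\bigr]\Bigr\} \\
&\quad = \underbrace{z(t)\bigl(\pi_1 + \eta(t) A_2(t)\bigr)}_{\text{known function of } t}\; x_1(t)
   \;-\; z(t)\,\tfrac{\kappa_1}{2}\, u_1^2(t)
   \;+\; \underbrace{z(t)\,\eta(t)\,B_2(t)}_{\text{known function of } t},
\qquad z(t) = e^{-\int_0^{t} \eta(r)\,dr} .
\end{aligned}
```

The state enters linearly with a time-dependent coefficient, the control enters separately, and no product between x1 and u1 appears; together with the linear dynamics (17), this is the linear state structure.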

It is worth observing that in (16) the function $z(t) = e^{-\int_0^{t} \eta(r)\, dr}$ plays the role of a discount factor, where the hazard rate represents a variable discount rate. As recalled in [12], in the case of non-constant discounting, the use of standard optimal control techniques gives rise to time-inconsistent solutions. This interpretation suggests applying the Hamilton-Jacobi-Bellman approach to solve problem (16) and (17).

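Under the further assumption of a constant hazard rate, i.e. $\eta(t) \equiv \eta > 0$, the analytical solution of the problem can be easily obtained by defining the value function for the discounted problem

\[
\begin{aligned}
v_1(t, x) := \sup_{u_1(r) \in U_1} \bigg\{
& \int_t^{T} e^{-\eta (r - t)} \Big\{ \pi_1 x_1(r) - \kappa_1 u_1^2(r)/2
+ \eta \big[ A_2(r)\, x_1(r) + B_2(r) \big] \Big\}\, dr \\
& + e^{-\eta (T - t)}\, \sigma_1 x_1(T) \bigg\}
\end{aligned}
\tag{18}
\]

subject to:

\[
\begin{cases}
\dot{x}_1(r) = \alpha_1 x_1(r) + \gamma_1 u_1(r), & r \in [t, T] \\
x_1(t) = x
\end{cases}
\tag{19}
\]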

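If $v_1(t, x)$ is differentiable, then, as shown in [1], it is the solution of the HJB equation with its terminal condition:

\[
\begin{cases}
\eta\, v_1(t, x) - \partial_t v_1(t, x) = \max_{w \in U_1} \Big\{ \big(\pi_1 + \eta A_2(t)\big) x + \eta B_2(t) - \kappa_1 w^2/2 + \partial_x v_1(t, x) \cdot (\alpha_1 x + \gamma_1 w) \Big\} \\
v_1(T, x) = \sigma_1 x
\end{cases}
\tag{20}
\]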

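If we assume that $v_1(t, x) = a_1(t) x + b_1(t)$, then the two unknown functions $a_1(t)$ and $b_1(t)$ must satisfy the following system of decoupled ODEs:

\[
\begin{cases}
\dot{a}_1(t) = (\eta - \alpha_1)\, a_1(t) - \pi_1 - \eta A_2(t) \\
a_1(T) = \sigma_1 \\
\dot{b}_1(t) = \eta\, b_1(t) - \eta B_2(t) - \gamma_1^2\, a_1^2(t) / (2\kappa_1) \\
b_1(T) = 0
\end{cases}
\tag{21}
\]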

This is a system of linear differential equations

decoupled from each other, depending on the solution

A2(t), B2(t) of (10) already solved for Stage 2 in the

previous section. The linearity of the system and the

regularity of the coefficients (which are all continuous

functions) ensure the existence and uniqueness of the

solution to this system. The optimal control for the

original problem is as follows:


\[
u_1^*(t) = \gamma_1 a_1(t) / \kappa_1,
\]

while

\[
u_2^*(s, t) = \gamma_2 A_2(t) / \kappa_2.
\]

We notice that the hazard rate affects only the optimal

control of Stage 1. Moreover, the particularly simple

form of the original problem makes the optimal

control of Stage 2 independent of the instant at

which the switch occurs.
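The dependence of the Stage 1 control on the hazard rate can be explored numerically. The sketch below integrates the equation for a1 in (21) backward in time with a classical Runge-Kutta scheme, using the closed form (11) for A2 (so it assumes α2 ≠ 0). The solver, the function names, and all parameter values are hypothetical choices made for illustration; the case η = 0 is included only as a limiting check, since the text assumes η > 0.

```python
import math


def A2(t, T, alpha2, pi2, sigma2):
    """Stage 2 coefficient, closed form (11); requires alpha2 != 0."""
    g = math.exp(alpha2 * (T - t))
    return pi2 * (g - 1.0) / alpha2 + sigma2 * g


def solve_a1(eta, T, alpha1, pi1, sigma1, alpha2, pi2, sigma2, n=2000):
    """Integrate the a1 equation of (21) backward from a1(T) = sigma1 (RK4).

    Returns the grid t_0 = 0 < ... < t_n = T and the values a1(t_k).
    """
    def rhs(t, a):
        return (eta - alpha1) * a - pi1 - eta * A2(t, T, alpha2, pi2, sigma2)

    h = T / n
    ts = [k * h for k in range(n + 1)]
    a = [0.0] * (n + 1)
    a[n] = sigma1
    for k in range(n, 0, -1):
        t = ts[k]
        # Classical RK4 step with step size -h (backward in time).
        k1 = rhs(t, a[k])
        k2 = rhs(t - h / 2, a[k] - h / 2 * k1)
        k3 = rhs(t - h / 2, a[k] - h / 2 * k2)
        k4 = rhs(t - h, a[k] - h * k3)
        a[k - 1] = a[k] - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return ts, a


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    params = dict(T=1.0, alpha1=-0.1, pi1=1.0, sigma1=0.2,
                  alpha2=-0.2, pi2=0.5, sigma2=0.1)
    gamma1, kappa1 = 1.0, 2.0
    for eta in (0.0, 0.5, 2.0):
        ts, a1 = solve_a1(eta, **params)
        u1_at_0 = gamma1 * a1[0] / kappa1  # Stage 1 optimal control at t = 0
        print(f"eta = {eta}: a1(0) = {a1[0]:.6f}, u1*(0) = {u1_at_0:.6f}")
```

For η = 0 the computed a1 reduces to the coefficient of the Stage 1 problem solved in isolation, which has the same closed form as (11) with Stage 1 parameters; for η > 0 the Stage 2 coefficient A2 enters through the term ηA2(t).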

5 Conclusion

In this paper, we analysed an optimal control problem

with a stochastic switching time. To simplify the

solution process and focus on identifying the real

reason for the loss of the linear state structure, we

assumed that both problems in Stage 1 and Stage 2

are linear state. We noticed that despite these stringent

assumptions, the initial problem fails to maintain the

linear state property. The structure is preserved only

if the hazard rate function does not depend on the state

of the system. This study highlights the difficulty

of deriving an analytical solution for optimal control

problems with stochastic switching time, even with

very basic problem data assumptions.

A future line of research suggested by this work

is the connection, which can be seen explicitly

in formula (16), between optimal control problems

with stochastic switching time and optimal control

problems with heterogeneous discounting factors,

[12]. Further studies are needed to better clarify the

connection between these two classes of problems.

References:

[1] E. Dockner, S. Jørgensen, N. Van Long, G.

Sorger, Differential Games in Economics and

Management Science, Cambridge University

Press, Cambridge, 2000.

[2] M. Kuhn, S. Wrzaczek, Rationally Risking:

A Two-Stage Approach. In (Eds.) J.L.

Haunschmied, R.M. Kovacevic, W. Semmler,

V.M. Veliov, Dynamic Modeling and

Econometrics in Economics and Finance,

Springer, Cham, 2021, pp. 85–110.

[3] S. Polasky, A. de Zeeuw, F. Wagener, Optimal

Management with Potential Regime Shifts,

Journal of Environmental Economics and

Management, Vol. 62, 2011, pp. 229–240.

[4] A. Seidl, S. Wrzaczek, Opening the Source Code:

The Threat of Forking, Journal of Dynamics and

Games, Vol. 10, 2023, pp. 121–150.

[5] D. Grass, J.P. Caulkins, G. Feichtinger, G.

Tragler, D.A. Behrens, Optimal Control of

Nonlinear Processes, with Applications in Drugs,

Corruption, and Terror, Springer, Berlin, 2008.

[6] S. Wrzaczek, M. Kuhn, I. Frankovic, Using

Age Structure for a Multi-stage Optimal Control

Model with Random Switching Time, Journal of

Optimization Theory and Applications, Vol. 184,

2022, pp. 1065–1082.

[7] A. Buratto, L. Grosset, M. Muttoni, Two

Different Solution Techniques for an Optimal

Control Problem with a Stochastic Switching

Time, WSEAS Transactions on Mathematics,

Vol. 22, 2023, pp. 730–735.

[8] M. Lefebvre, An Explicit Solution to a Discrete-

time Stochastic Optimal Control Problem,

WSEAS Transactions on Systems, Vol. 22, 2023,

pp. 368-371.

[9] E. Dockner, G. Feichtinger, S. Jørgensen,

Tractable classes of nonzero-sum open-loop

Nash differential games: Theory and examples.

Journal of Optimization Theory and Applications,

Vol. 45, 1985, pp. 179–197.

[10] N. Van Long, Managing, Inducing, and

Preventing Regime Shifts: A Review of the

Literature. In (Eds.) J.L. Haunschmied, R.M.

Kovacevic, W. Semmler, V.M. Veliov, Dynamic

Modeling and Econometrics in Economics and

Finance, Springer, Cham, 2021, pp. 1–36.

[11] E.K. Boukas, A. Haurie, P. Michel, An optimal

control problem with a random stopping time,

Journal of Optimization Theory and Applications,

Vol. 64:3, 1990, pp. 471–480.

[12] J. Marín-Solano, C. Patxot, Heterogeneous

discounting in economic problems. Optimal

Control Applications and Methods, Vol. 33, 2012,

pp. 32–50.


Sources of Funding for Research Presented in

a Scientific Article or Scientific Article Itself

No funding was received for conducting this

study.

Conflict of Interest

The authors have no conflicts of interest to

declare.

Creative Commons Attribution License 4.0

(Attribution 4.0 International, CC BY 4.0)

This article is published under the terms of the

Creative Commons Attribution License 4.0

https://creativecommons.org/licenses/by/4.0/deed.en_US

Contribution of Individual Authors to the

Creation of a Scientific Article (Ghostwriting

Policy)

The theoretical framework was developed by

Alessandra Buratto and Luca Grosset. Alessandra

Buratto authored the Introduction, Section 2, and

Section 3, while Luca Grosset was responsible for

Section 4 and the Conclusion.
