Optimal control: Systems analysis

Problem statement

Controlling time-dependent systems is a challenging endeavor, and success cannot be guaranteed under all circumstances. Some systems are intrinsically unstable and can spiral out of control regardless of the control input.

Other systems do not provide the decision-making algorithm with enough information to derive meaningful control instructions.

To estimate the success of control, it is necessary to examine key system characteristics such as stability, controllability, and observability. Only then can the potential problems in controlling the system be identified, and limits of validity as well as guarantees of success be established. In practice, this theoretical system analysis is further complicated by unknown dynamic relationships that must be derived from observational data and preliminary considerations before any further investigation.

Figure 1: Illustration of system features.

Exemplary system behavior

The following equation describes a damped harmonic oscillator:

$$ \ddot{x} + \frac{c}{m} \dot{x} + \frac{k}{m}x = 0 $$

This equation models the position \(x\), velocity \(\dot{x}\), and acceleration \(\ddot{x}\) of a mass \(m\) oscillating elastically in a viscous fluid with damping coefficient \(c\). The parameter \(k\) is the spring constant, which couples the displacement and the acceleration.

Figure 2: The damping coefficient determines the stability of the system.
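This qualitative difference can be reproduced numerically. The sketch below (an illustration with freely chosen function name and parameters, not one of the tutorial scripts) integrates the oscillator with a semi-implicit Euler scheme for positive and negative damping:

```python
import numpy as np

def simulate_oscillator(c, m=1.0, k=1.0, x0=1.0, v0=0.0, dt=1e-3, t_end=50.0):
    """Integrate x'' + (c/m) x' + (k/m) x = 0 with a semi-implicit Euler scheme."""
    x, v = x0, v0
    for _ in range(int(t_end / dt)):
        v += (-(c / m) * v - (k / m) * x) * dt
        x += v * dt
    return x, v

# Positive damping: the motion decays towards the rest state x = 0.
x_d, v_d = simulate_oscillator(c=0.2)
# Negative damping: the amplitude grows without bound.
x_u, v_u = simulate_oscillator(c=-0.2, t_end=20.0)
print(abs(x_d), x_u**2 + v_u**2)
```

For \(c > 0\) the state shrinks towards \([0,0]\); for \(c < 0\) the same integrator produces an exponentially growing oscillation.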

Exemplary systems analysis

The qualitative differences in system behavior can be explained and predicted directly. The differential equation for the damped oscillator can also be written as

$$ v=\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}= \begin{bmatrix} x \\ \dot{x}\end{bmatrix}, ~~~ \dot{v} = \begin{bmatrix} \dot{x}\\ \ddot{x}\end{bmatrix} = \begin{bmatrix} v_2 \\ -(c/m)v_2 -(k/m) v_1\end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -(k/m) & -(c/m)\end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = Av $$

where \(\dot{v}\) is the time derivative of \(v\). From the equation \(\dot{v}=Av\), problematic configurations can be deduced directly: for example, if there exists a \(v_0\) such that \(Av_0=\lambda v_0\) and \(\lambda > 0\), then for \(v_0=v(0)\) it holds that

\begin{align}  ~~~~~~\dot{v}(0) & = \lambda v(0) \\ \Rightarrow v(\Delta t) & \approx (1+\lambda \Delta t)v(0) \\ \Rightarrow v(2\Delta t) &\approx (1+\lambda \Delta t)v(\Delta t) \\ & \vdots \\ \Rightarrow v(n\Delta t) &\approx (1+\lambda \Delta t) v((n-1)\Delta t) \end{align}

such that the state vector \(v(t)=[x(t), \dot{x}(t)]\) grows slightly with each time step and thus diverges towards infinity. However, if the update \(v \mapsto v+\Delta t A v\) reduces the magnitude of every \(v\in \mathbb{R}^2\), then the system converges to \([0,0]\) for all initial states and stabilizes itself. The question of stability is therefore primarily a question of the signs of the real parts of the eigenvalues of the matrix \(A\); if all are negative, the system is stable [1, p. 24].
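This eigenvalue criterion is easy to check numerically. The snippet below (an illustrative sketch, not part of the referenced example code) tests whether all eigenvalues of \(A\) lie in the open left half-plane:

```python
import numpy as np

def is_stable(c, m=1.0, k=1.0):
    """True iff all eigenvalues of the oscillator matrix A have negative real part."""
    A = np.array([[0.0, 1.0], [-k / m, -c / m]])
    return bool(np.all(np.linalg.eigvals(A).real < 0))

print(is_stable(0.2), is_stable(-0.2))
```

With \(m=k=1\), the eigenvalues are \(\lambda = (-c \pm \sqrt{c^2-4})/2\), so their real parts share the sign of \(-c\): positive damping is stable, negative damping is not.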

Equivalent to this criterion is the linear (Lyapunov) matrix inequality [2]

$$ P \succ 0, ~~~~A^TP+PA \prec 0, $$

which is solvable only in the case of a stable system. Besides controllability and observability, there are further properties such as stabilizability and detectability, whose presence guarantees the existence of an optimal constant feedback law \(u=Kx\).
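For a concrete check, one can solve the Lyapunov equation \(A^TP+PA=-Q\) for a chosen \(Q \succ 0\) and test whether the solution \(P\) is positive definite; SciPy provides a direct solver for this. A minimal sketch for the damped oscillator (matrix values chosen to match the later example):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-1.0, -0.2]])  # damped oscillator with k/m = 1, c/m = 0.2
Q = np.eye(2)

# Solve the Lyapunov equation A^T P + P A = -Q for P.
# For a stable A, the solution is positive definite.
P = solve_continuous_lyapunov(A.T, -Q)
print(np.linalg.eigvalsh(P))
```

All eigenvalues of \(P\) come out positive, confirming stability of this \(A\).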

Systems with control signal

When a system can be actively influenced with a control signal \(u\), the situation becomes more complicated. It must be investigated whether a \(p\)-dimensional control signal \(u\) that stabilizes the system exists, and whether it can be derived from possibly deficient observations.

The sketch below illustrates the relationships: a state \(v\) is observed incompletely. This leads to the measurement value \(y\), which must be processed into a control signal. The state, control signal, and state change are interconnected through the matrices \(A, B\).

Sketch: Relationships between state \(v\), measurement \(y\), and control signal \(u\).

The above relationships are formulated more compactly as the linear system \((A, B, C)\):

\begin{align} \dot{v}(t) & = Av(t) + B u(t), ~~~~~&&v\in \mathbb{R}^n, ~~ A\in \mathbb{R}^{n\times n}, ~~B\in \mathbb{R}^{n\times p} \\ y(t)&=Cv(t), && y\in \mathbb{R}^m, ~~ C \in \mathbb{R}^{m\times n} \end{align}

with matrices \(A, B, C\) mediating between state vectors, control signals, and observations.

Controllability and observability

The system can be analyzed for controllability, and for the sufficiency of the observations \(y\) for control, using linear matrix inequalities. It is stated in [2] that:

  • Either the system \((A,B,I)\) is controllable, or there exists a symmetric matrix \(P \neq 0\) such that $$ AP+PA^T \preceq 0, ~~PB=0.$$
  • Either the system \((A, B, C)\) is observable, or there exists a symmetric matrix \(P \neq 0\) such that $$ A^TP+PA \preceq 0, ~~PC^T=0.$$

From the solvability of semidefinite programs, global system properties can thus be derived.
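As a cross-check of these duality conditions, the classical Kalman rank criterion offers an equivalent finite test: the system is controllable iff the controllability matrix \([B, AB, \ldots, A^{n-1}B]\) has full rank, and observable iff the pair \((A^T, C^T)\) is controllable. A minimal sketch (illustrative, not from the tutorial scripts) for the oscillator used later:

```python
import numpy as np

def ctrb(A, B):
    """Kalman controllability matrix [B, AB, ..., A^(n-1) B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

A = np.array([[0.0, 1.0], [-1.0, -0.2]])  # damped oscillator
B = np.array([[0.0], [1.0]])
C = np.eye(2)  # fully observed state

controllable = np.linalg.matrix_rank(ctrb(A, B)) == A.shape[0]
observable = np.linalg.matrix_rank(ctrb(A.T, C.T)) == A.shape[0]
print(controllable, observable)
```

Both tests succeed here, in agreement with the semidefinite analysis in the application section below.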

Optimal control signals

For a system of the form \(\dot{x} = Ax + Bu\) with directly observable states \(x\), control signals can be derived directly from the state observations. The quadratic cost function \(\int_0^{\infty}u(t)^TRu(t)+x(t)^TQx(t) \, dt\) is to be minimized. This cost function consists of costs for the exertion of the control signal \(u(t)\) and costs for the deviation of the state \(x(t)\) from the desired stable state \(x=0\).

It follows that \(u = Kx\), with \(K\) a matrix that maps states to control signals and satisfies the following matrix equations [3, pp. 35–40]:

\begin{align} K &= -R^{-1}B^TP \\ 0 &= A^TP + PA - PBR^{-1}B^TP + Q \\ P & \succeq 0 \end{align}

The equation for the matrix \(P\) is known as the algebraic Riccati equation and is quadratic in \(P\). The search for a \(K = -R^{-1}B^TP\) that optimally controls the system can be formulated as a semidefinite optimization problem in the matrix variable \(P\) [4, p. 9]:

$$ \begin{align} \min_P ~~~& -\operatorname{tr} P \\ \text{s.t.} ~~~&\begin{bmatrix} A^TP + PA + Q & PB \\ B^TP & R \end{bmatrix} \succeq 0 \\ ~~~& P \text{ symmetric} \end{align}$$
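In practice, the Riccati equation is often solved directly rather than through the SDP; SciPy provides a dedicated solver. The sketch below (function name and example matrices chosen for illustration) computes the gain \(K\) and verifies the Riccati residual:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gain(A, B, Q, R):
    """Solve the algebraic Riccati equation and return (K, P) with u = Kx."""
    P = solve_continuous_are(A, B, Q, R)
    K = -np.linalg.solve(R, B.T @ P)  # K = -R^{-1} B^T P
    return K, P

# Damped oscillator with unit state and control costs:
A = np.array([[0.0, 1.0], [-1.0, -0.2]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
K, P = lqr_gain(A, B, Q, R)

# Verify A^T P + P A - P B R^{-1} B^T P + Q = 0:
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print(np.abs(residual).max())
```

The residual is zero up to numerical precision, and \(P\) is positive semidefinite as required.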

Applications

Consider the damped harmonic oscillator, now partially controllable with a control signal \(u\), according to the equation

\begin{align}
\dot{x} &= Ax + Bu \\
A &= \begin{bmatrix} 0 & 1 \\ -1 & -0.2 \end{bmatrix}, ~~~~~ B = \begin{bmatrix} 0 \\ 1 \end{bmatrix},
\end{align}

where \(x=[\text{Position}, \text{Velocity}]\). Then, under the constraints \(PB=0\) and \(AP+PA^T \preceq 0\) with \(P\) symmetric, it follows directly that \(\max (\operatorname{tr} P )=\min( \operatorname{tr} P)=0\), so the only solution is \(P^*=0\), and the system is controllable. Likewise, the constraints \(PC^T=0\) and \(A^TP+PA \preceq 0\) with \(P\) symmetric admit only \(P^*=0\), and the system is observable. The solution of the semidefinite program

$$ \begin{align} \min_P ~~~& -\operatorname{tr} P \\ \text{s.t.} ~~~&\begin{bmatrix} A^TP+PA+I & PB \\ B^TP & 1\end{bmatrix} \succeq 0 \\ ~~~& P \text{ symmetric} \end{align}$$

is \(P^*=[1.73, 0.414; 0.414, 1.167]\) with \(K=-B^TP=[-0.414, -1.167]\). With the control signal \(u^*=Kx\), the cost functional \(\int_{0}^{\infty} x^2(t)+\dot{x}(t)^2 + u^2(t) \, dt\) is minimized.
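These values can be reproduced numerically. The sketch below (for verification only, not one of the tutorial scripts) solves the Riccati equation with SciPy and simulates the closed loop:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -0.2]])
B = np.array([[0.0], [1.0]])
P = solve_continuous_are(A, B, np.eye(2), np.eye(1))
K = -B.T @ P  # R = 1, hence K = -R^{-1} B^T P = -B^T P

print(np.round(P, 3))  # approx [[1.733, 0.414], [0.414, 1.167]]
print(np.round(K, 3))  # approx [[-0.414, -1.167]]

# Closed-loop simulation x' = (A + B K) x with forward Euler steps:
x = np.array([1.0, 0.0])
dt = 1e-3
for _ in range(int(20.0 / dt)):
    x = x + dt * (A + B @ K) @ x
print(np.linalg.norm(x))  # close to 0: the controlled system has converged
```

The computed \(P\) and \(K\) match the values above, and the controlled state converges to the origin far faster than the weakly damped open-loop system.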

Figure 3: The control signal \(u\), optimally selected through SDP, quickly stabilizes the inherently slowly converging harmonic oscillator; compare (a), (b) with (c), (d). The temporal progressions of \(x\) are shown.

Practical aspects

If the matrices \(A\) and \(B\) are time-dependent, then the feedback matrix \(K\) is also time-dependent, and \(P\) satisfies a differential equation. If the relationships between \(x\), \(u\), and \(\dot{x}\) are nonlinear, optimal control might still be achievable; it should then be investigated whether the effects of the nonlinearity can be bounded by linear inequalities. The same considerations apply to random effects.

In practice, modeling a real-world phenomenon as a linear system \(\dot{x}=Ax+Bu\) is itself challenging. This often requires higher-dimensional embeddings and linearizations, and despite all tricks, modeling might fail.

Due to the complexity of the real world, nonlinearity is a common complicating factor that softens guarantees of controllability and necessitates the introduction of experimental methods like reinforcement learning. Nonetheless, linear models and their analysis remain useful tools for investigating processes that are technical or designed by humans.

Code & Sources

Example code: OC_harmonic_oscillator_1.py, OC_harmonic_oscillator_2.py in our tutorial folder.

[1] Dym, C. L. (2002). Stability Theory and Its Applications to Structural Mechanics. New York: Dover Publications.

[2] Balakrishnan, V., & Vandenberghe, L. (2003). Semidefinite programming duality and linear time-invariant systems. IEEE Transactions on Automatic Control, 48(1), 30–41.

[3] Anderson, B. D. O., & Moore, J. B. (2007). Optimal Control: Linear Quadratic Methods. New York: Courier Corporation.

[4] Yao, D. D., Zhang, S., & Zhou, X. Y. (2001). Stochastic Linear-Quadratic Control via Semidefinite Programming. SIAM Journal on Control and Optimization, 40(3), 801–823.