Optimal control: Planning problems

Problem setting

A system receives control inputs that can be chosen freely within certain limits and is supposed to be brought into a predefined state or made to follow a specific trajectory. In this complete generality, however, the problem is difficult to manage.

Complicating factors such as nonlinearities, unknown system dynamics, and randomness in the interaction between control input and system response can make the problem completely unsolvable.

In the simplest case of complete predictability of the system, however, it is a static planning problem, and a solution reduces to the search for a sequence of control inputs. This search can be conducted with dedicated search algorithms such as Monte Carlo Tree Search, or it can be formulated as an optimization problem in which the system dynamics take on the role of constraints.
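For a fully deterministic system with a small, discrete set of control inputs, such a search can even be carried out exhaustively. A minimal sketch with toy scalar dynamics and a control-effort cost, both chosen purely for illustration:

```python
from itertools import product

def simulate(x0, controls):
    # deterministic toy dynamics: x_{k+1} = x_k + u_k
    x = x0
    for u in controls:
        x = x + u
    return x

def plan(x0, target, horizon, options=(-1, 0, 1)):
    # exhaustive search over all control sequences; only viable for tiny
    # horizons, which is why tree search or optimization is used instead
    best, best_cost = None, float("inf")
    for seq in product(options, repeat=horizon):
        if simulate(x0, seq) == target:
            cost = sum(abs(u) for u in seq)  # prefer small control effort
            if cost < best_cost:
                best, best_cost = seq, cost
    return best

seq = plan(0, 2, horizon=3)
print(seq, simulate(0, seq))
```

The exponential growth of the sequence space (here only 3³ = 27 candidates) is exactly what motivates the optimization formulation below.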

Mathematical model

If the state \(x_{k+1}\) at time \(k+1\) depends linearly on the previous state \(x_k\) and the control input \(u_k\), the system dynamics can be written as

$$x_{k+1}=Ax_k+Bu_k $$

where \(x_{k+1}, x_k \in \mathbb{R}^d, u_k\in \mathbb{R}^m\) are typically multidimensional vectors. \(A\) and \(B\) are system matrices that encode the influence of the state and the control signal on the future state. If the system is to be controlled such that it follows a trajectory \(f_1, …, f_n \in \mathbb{R}^d\) with as small a control input as possible, this can be formulated as the optimization problem

$$ \begin{align} \min_{x_1, …, x_n, u_1, …, u_n}  ~~~& \sum_{k=1}^n \|x_k-f_k\|_1+\|u_k\|_1 \\ \text{s.t.} ~~~&x_{k+1}=Ax_k+Bu_k~~~ k=1, …, n-1 \\ ~~~& x_1=x_{start} \end{align}$$

where \(x_1\) is fixed to the starting state \(x_{start}\). The optimization variables are the \(n\) control inputs \(u_1, …, u_n\) and the states \(x_1, …, x_n\), coupled by the dynamics constraints. In total, it is an optimization problem with \(nd+nm\) variables.
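Once \(A\) and \(B\) are known, the dynamics \(x_{k+1}=Ax_k+Bu_k\) can be simulated directly. A minimal NumPy sketch; the concrete position-velocity matrices are assumptions for the example, not taken from the text:

```python
import numpy as np

dt = 0.1
A = np.array([[1.0, dt],    # position integrates velocity
              [0.0, 1.0]])
B = np.array([[0.0],
              [dt]])        # control acts as acceleration

x = np.zeros(2)             # start at rest in the origin
for k in range(10):
    u = np.array([1.0])     # constant unit control input
    x = A @ x + B @ u       # x_{k+1} = A x_k + B u_k

print(x)                    # velocity has grown to 1.0
```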

It can be reformulated as a linear program with the objective function \(\|x-f\|_1+\|u\|_1=\|v^{\Delta}\|_1+\|v^u\|_1\), where \(x=[x_1^T, …, x_n^T]^T\) and \(u=[u_1^T, …, u_n^T]^T\), and \(v^{\Delta} \in \mathbb{R}^{nd}\) and \(v^u\in \mathbb{R}^{nm}\) are the vectors of the absolute values of the elements of \(x-f\) and \(u\).

$$ \begin{align} \min_{x_1, …, x_n, u_1, …, u_n, v^{\Delta}, v^u}  ~~~& \sum_{k=1}^n\sum_{l=1}^d v^{\Delta}_{kl}+ \sum_{k=1}^n \sum_{l=1}^m v^u_{kl} \\ \text{s.t.} ~~~&x_{k+1}=Ax_k+Bu_k~~~ k=1, …, n-1 \\ ~~~& x_1=x_{start} \\ ~~~& v^{\Delta} \ge x-f ~~~ v^{\Delta}\ge -(x-f) \\ ~~~& v^u\ge u ~~~~~~~~~~~ v^u \ge -u \end{align}$$

The first constraint represents the dynamics of the system, while the last two pairs of inequalities ensure that \(v^{\Delta}\) and \(v^u\) bound the absolute residuals of \(x-f\) and \(u\) from above [1, p. 294].
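This absolute-value trick can be tried on a tiny instance: minimizing \(|t-3|+|t+1|\) becomes a linear program in \((t, v_1, v_2)\) with \(v_1 \ge \pm(t-3)\) and \(v_2 \ge \pm(t+1)\). A sketch using `scipy.optimize.linprog` (SciPy assumed available; problem data chosen for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# variables: z = [t, v1, v2]; minimize v1 + v2
c = np.array([0.0, 1.0, 1.0])

# v1 >= t - 3    ->   t - v1 <= 3
# v1 >= -(t-3)   ->  -t - v1 <= -3
# v2 >= t + 1    ->   t - v2 <= -1
# v2 >= -(t+1)   ->  -t - v2 <= 1
A_ub = np.array([[ 1.0, -1.0,  0.0],
                 [-1.0, -1.0,  0.0],
                 [ 1.0,  0.0, -1.0],
                 [-1.0,  0.0, -1.0]])
b_ub = np.array([3.0, -3.0, -1.0, 1.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None), (0, None), (0, None)])
print(res.fun)  # optimal value 4, attained for any t in [-1, 3]
```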

Example: Motion control

A drone is to be controlled so that it moves from its current 3-dimensional position \(pos_1=[x_{start}, y_{start}, z_{start}]\) to the target position \(pos_n=[x_{goal}, y_{goal}, z_{goal}]\) within \(n\) time steps, arriving with a velocity of \(v_n=[v^x_{goal}, v^y_{goal}, v^z_{goal}]\). The control signal \(u\), which should be kept as small as possible, is translated directly into acceleration, so that the relationship

$$x_{k+1}= \begin{bmatrix} pos_{k+1} \\ v_{k+1} \\ a_{k+1}\end{bmatrix} = \underbrace{\begin{bmatrix} I & \Delta t I & 0 \\ 0 & I & \Delta tI \\ 0 & 0 & 0\end{bmatrix}}_{A} \underbrace{\begin{bmatrix} pos_k \\ v_k \\ a_k\end{bmatrix}}_{x_k}+\underbrace{\begin{bmatrix} 0 \\ 0 \\ I \end{bmatrix}}_{B} \underbrace{\begin{bmatrix}u_x \\ u_y\\ u_z\end{bmatrix}}_{u_k}$$

is valid. Here, \(I\) is the \(3 \times 3\) identity matrix, \(\Delta t\) is the duration of one time step, \(pos\) stands for position, \(v\) for velocities, and \(a\) for accelerations. Each state vector thus has 9 components.
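The block matrices can be assembled directly in NumPy. The sketch below assumes that the commanded acceleration takes effect one step later, i.e. \(a_{k+1}=u_k\), which is one plausible reading of "translated directly into acceleration"; the step size and control values are illustrative:

```python
import numpy as np

dt = 0.1
I3 = np.eye(3)
Z3 = np.zeros((3, 3))

# state x = [pos, v, a] with 9 components
A = np.block([[I3, dt * I3, Z3],
              [Z3, I3,      dt * I3],
              [Z3, Z3,      Z3]])      # a_{k+1} = u_k (assumption)
B = np.vstack([Z3, Z3, I3])

x = np.zeros(9)
u = np.array([0.0, 0.0, 1.0])          # accelerate upwards
for k in range(5):
    x = A @ x + B @ u

print(x[2], x[5], x[8])                # z-position, z-velocity, z-acceleration
```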

Figure 1: A drone (a) and possible trajectories leading to the target state (b).

Solution and practical aspects

The corresponding optimization problem is

$$ \begin{align} \min_{x_1, …, x_n, u_1, …, u_n} ~~~& \sum_{k=1}^n\|u_k\|_1 \\ \text{s.t.} ~~~&x_{k+1}=Ax_k+Bu_k~~~ k=1, …, n-1 \\ ~~~& pos_1=[x_{start}, y_{start}, z_{start}]^T ~~~~pos_n=[x_{goal}, y_{goal}, z_{goal}]^T \\ ~~~& v_1=[v^x_{start}, v^y_{start}, v^z_{start}]^T ~~~~~~~~~~~~v_n=[v^x_{goal}, v^y_{goal}, v^z_{goal}]^T \\ ~~~& u_{min}\le u_k \le u_{max} ~~~k=1, …, n .\end{align}$$

An illustration of the problem is shown in the following figure. Note that the resulting control inputs and trajectory depend significantly on the starting conditions and on any restrictions imposed on the control inputs.

Figure 2: Optimal trajectories under various constraints and the corresponding control inputs.
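The same structure can be solved with a generic LP solver. The sketch below is a scaled-down 1-dimensional version (position and velocity only, \(\Delta t = 1\)) using `scipy.optimize.linprog`; all problem data are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

n = 6                        # number of states x_1, ..., x_n
A = np.array([[1.0, 1.0],    # pos_{k+1} = pos_k + v_k
              [0.0, 1.0]])   # v_{k+1}   = v_k + u_k
B = np.array([[0.0], [1.0]])

d = 2                        # state dimension
nu = n - 1                   # controls u_1, ..., u_{n-1}
# variable vector z = [x_1, ..., x_n, u_1, ..., u_{n-1}, v_1, ..., v_{n-1}]
nz = n * d + 2 * nu
xi = lambda k: slice(k * d, (k + 1) * d)   # state k (0-based)
ui = lambda k: n * d + k                   # control k
vi = lambda k: n * d + nu + k              # auxiliary bound on |u_k|

A_eq, b_eq = [], []
for k in range(n - 1):       # dynamics x_{k+1} - A x_k - B u_k = 0
    row = np.zeros((d, nz))
    row[:, xi(k + 1)] = np.eye(d)
    row[:, xi(k)] = -A
    row[:, ui(k)] = -B[:, 0]
    A_eq.append(row)
    b_eq.append(np.zeros(d))
# boundary conditions: start at rest in 0, arrive at rest in position 5
start_row = np.zeros((d, nz)); start_row[:, xi(0)] = np.eye(d)
goal_row = np.zeros((d, nz));  goal_row[:, xi(n - 1)] = np.eye(d)
A_eq += [start_row, goal_row]
b_eq += [np.array([0.0, 0.0]), np.array([5.0, 0.0])]
A_eq, b_eq = np.vstack(A_eq), np.concatenate(b_eq)

A_ub, b_ub = [], []          # v_k >= u_k and v_k >= -u_k
for k in range(nu):
    r1 = np.zeros(nz); r1[ui(k)] = 1.0;  r1[vi(k)] = -1.0
    r2 = np.zeros(nz); r2[ui(k)] = -1.0; r2[vi(k)] = -1.0
    A_ub += [r1, r2]
    b_ub += [0.0, 0.0]
A_ub, b_ub = np.vstack(A_ub), np.array(b_ub)

c = np.zeros(nz)
for k in range(nu):
    c[vi(k)] = 1.0           # minimize sum_k |u_k|

bounds = ([(None, None)] * (n * d)   # states are free
          + [(-1.0, 1.0)] * nu      # u_min <= u_k <= u_max
          + [(0.0, None)] * nu)     # auxiliary variables nonnegative
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.status, res.fun)
```

The solution contains both the planned states and the control sequence; the full 9-dimensional drone version only changes the sizes of the blocks.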

The problem is solved using the freely available software packages CVXOPT [2] and CVXPY [3] on a standard laptop within a few milliseconds. There are various extensions. The objective function typically contains quadratic terms, and it is possible to model uncertainties in the system dynamics and the environment through random variables; this leads to Linear-Quadratic-Gaussian (LQG) controllers [4, pp. 164–207]. To avoid complex optimizations, it is often required that \(u\) be directly representable as a function of \(x\) via \(u=Kx\) with a constant matrix \(K\); see, for example, [5, pp. 421–441].
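Such a feedback law replaces the open-loop plan with a policy evaluated at every step. A minimal sketch with a hand-picked gain matrix \(K\) (chosen here only so that the closed loop is stable, not computed by any of the cited methods):

```python
import numpy as np

dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])
B = np.array([[0.0],
              [dt]])
K = np.array([[-1.0, -2.0]])    # hand-picked stabilizing gain (assumption)

# closed-loop dynamics x_{k+1} = (A + B K) x_k
A_cl = A + B @ K
print(np.abs(np.linalg.eigvals(A_cl)))   # both eigenvalues at 0.9 -> stable

x = np.array([1.0, 0.0])        # initial disturbance
for k in range(200):
    u = K @ x                   # feedback law u = K x
    x = A @ x + B @ u
print(np.linalg.norm(x))        # state has decayed towards 0
```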

Code & Sources

Example code: OC_vehicle_control_3.py in our tutorial folder.

[1] Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge: Cambridge University Press.

[2] Andersen, M. S., Dahl, J., & Vandenberghe, L. (2022). CVXOPT: A Python package for convex optimization, version 1.3.

[3] Diamond, S., & Boyd, S. (2016). CVXPY: A Python-embedded modeling language for convex optimization. Journal of Machine Learning Research, 17(83), 1–5.

[4] Anderson, B. D. O., & Moore, J. B. (2007). Optimal Control: Linear Quadratic Methods. New York: Courier Corporation.

[5] Wolkowicz, H., Saigal, R., & Vandenberghe, L. (2012). Handbook of Semidefinite Programming: Theory, Algorithms, and Applications. Berlin Heidelberg: Springer Science & Business Media.