Theory: Duality

Definition

For any given minimization problem, there exists a corresponding mirrored maximization problem known as the dual problem. This dual problem aims to find a lower bound for the minimum value of the original (primal) problem.

The solution of the dual problem provides insights into the impact of constraints on the location of the optimum. This allows for sensitivity analyses, which help answer questions related to system designs that contribute to the robustness of the optimum. Dual problems often resemble the underlying (primal) problem. For example, if

$$ \begin{align}
P~~~~~~\min_x ~~~&f(x) \\
\text{s.t.} ~~~&A(x) =b \\
& G(x)\le h
\end{align}$$

is a general minimization problem, then the corresponding dual problem is

$$ \begin{align}
D~~~~~~ \max_{\nu,\lambda} ~~~&\underbrace{\inf_x \left(f(x)+\nu^T(A(x)-b)+\lambda^T(G(x)-h)\right)}_{g(\nu,\lambda)} \\
\text{s.t.} ~~~&\lambda \ge 0
\end{align}$$

where \(\nu, \lambda\) are the vector-valued dual optimization variables, and \(\inf_x\) denotes the infimum over \(x\), which, unlike the minimum, remains well defined even when the lowest value is only approached in the limit. More detailed information can be found in [1, pp. 215–230].
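
For concreteness, here is a small worked instance (our own illustration, not taken from the source): minimize \(f(x)=x^2\) subject to the single inequality \(G(x)=1-x\le 0\), i.e. \(x\ge 1\), so that \(p^*=1\). The dual function can be computed in closed form:

$$ \begin{align}
g(\lambda) &= \inf_x \left(x^2+\lambda(1-x)\right) = \lambda-\frac{\lambda^2}{4}, ~~~\text{attained at } x=\frac{\lambda}{2}, \\
d^* &= \max_{\lambda\ge 0} g(\lambda) = g(2) = 1 = p^*.
\end{align}$$

Here the dual is an unconstrained concave maximization over a single scalar, even though the primal carries an inequality constraint.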

Relevance

For every feasible \(x\) of the primal problem, the term \(f(x)+\nu^T(A(x)-b)+\lambda^T(G(x)-h)\) is at most \(f(x)\): feasibility gives \(A(x)-b=0\) and hence \(\nu^T(A(x)-b)=0\), while \(\lambda \ge 0\) and \(G(x)\le h\) give \(\lambda^T(G(x)-h)\le 0\). Consequently,

$$ g(\nu,\lambda) \le \min_{x\in D} f(x), ~~~D=\{x \in \mathbb{R}^n: A(x)=b \text{ and } G(x)\le h\} $$

for all \(\lambda \ge 0\) and \(\nu \in \mathbb{R}^{m_1}\). If \(g(\nu,\lambda)\) is known, the minimum value of \(P\) can be easily estimated. Maximizing this lower bound over \(\nu\) and \(\lambda\) leads to the dual problem \(D\), whose solution \(d^*\) is the best derivable lower bound for the solution \(p^*\) of the primal problem; a numerical check of this bound follows the list below. The advantages of this formalism are threefold:

  • Sometimes \(P\) is difficult, but \(D\) is simple
  • Solutions \(\lambda^*, \nu^*\) of \(D\) provide insights into \(P\)
  • The duality gap \(p^*-d^*\ge 0\) indicates whether \(P\) and \(D\) are correctly solved
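
A minimal sketch of the weak-duality bound, using the toy problem \(\min x^2\) s.t. \(x\ge 1\) introduced above (the choice of scipy is ours; the source does not prescribe a tool): every value of \(g(\lambda)\) with \(\lambda\ge 0\) must stay below \(p^*=1\).

```python
import numpy as np
from scipy.optimize import minimize_scalar

def g(lam):
    # Dual function of  min x^2  s.t.  1 - x <= 0:
    # g(lambda) = inf_x ( x^2 + lambda * (1 - x) )
    return minimize_scalar(lambda x: x**2 + lam * (1.0 - x)).fun

p_star = 1.0  # primal optimum, attained at x* = 1
lams = np.linspace(0.0, 5.0, 501)
values = [g(lam) for lam in lams]
assert all(v <= p_star + 1e-9 for v in values)  # weak duality: g(lambda) <= p*
print(max(values))  # approx 1.0, the best lower bound, reached near lambda = 2
```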

Example

This can be demonstrated using the archetypal production example, which was also used to illustrate linear programming. Product quantities \(x=[x_1,x_2]\ge 0\) should be produced to meet the demand \(x_1+x_2=2\) and minimize the costs \(c^Tx = c_1x_1+c_2x_2\) with \(c=[1,5]\). The corresponding linear program

$$ \begin{align} P~~~~~~\min_x ~~~&c^Tx ~~~~~~~&c=[1,5] \\ \text{s.t.} ~~~&Ax =b &b=2 ~~~A=[1,1] \\ & x\ge 0 & \end{align}$$

is solved by algorithms to \(x^*=[2,0]\) with the minimal costs \(c^Tx^*=2\). The dual program

$$ \begin{align} D~~~~~~\max_{\nu,\lambda} ~~~&-b^T\nu  \\ \text{s.t.} ~~~& A^T\nu +c -\lambda =0 \\ & \lambda \ge 0 \end{align}$$

yields \(g(\nu,\lambda)=\inf_x \left(c^Tx+\nu^T(Ax-b)-\lambda^Tx\right) = \inf_x -\nu^Tb +(c^T+\nu^TA-\lambda^T)x\), where the latter formulation either has the value \(-\nu^Tb\) (if \(c+A^T\nu - \lambda =0\)) or \(-\infty\) (in all other cases).
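
As a sketch (using scipy, which the source does not prescribe), both programs can be solved numerically; the dual is rewritten as a minimization by eliminating \(\lambda = c + A^T\nu \ge 0\):

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 5.0])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

# Primal P:  min c^T x  s.t.  A x = b,  x >= 0
primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 2)
print(primal.x, primal.fun)        # x* = [2, 0],  p* = 2

# Dual D:  max -b^T nu  s.t.  A^T nu + c - lambda = 0,  lambda >= 0.
# Eliminating lambda = c + A^T nu >= 0 gives:  min b^T nu  s.t.  -A^T nu <= c.
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(None, None)])
nu = dual.x
print(nu, c + A.T @ nu, -b @ nu)   # nu* = -1,  lambda* = [0, 4],  d* = 2
```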

Figure 1: The primal problem consists of two optimization variables and three constraints, while the dual problem consists of three optimization variables and four constraints. The terms appearing in the constraints and in the objective function exchange their roles.

Interpretation

The dual problem is also an LP and can be solved to \(\nu^*=-1, \lambda^*=[0,4]\) with the maximum value \(d^*=-b^T\nu^*=2\). Two conclusions can be drawn from this:

  1. The duality gap \(p^*-d^*\) is \(0\). \(P\) cannot have a solution \(p^*\) with a value less than 2, since \(d^*=2\) is a lower bound. Since \(p^*=d^*\) and there is only one optimum, it has been uniquely identified, and both optimization problems have been certifiably solved correctly.
  2. The optimal parameters \(\nu^*, \lambda^*\) of the dual problem are \([\nu^*]=-1\) and \([\lambda_1^*,\lambda_2^*]=[0,4]\). They are directly related to the gains that would arise from disregarding the constraints [1, p. 252]; the sketch after this list re-checks the three perturbations numerically.
    • The solution of \(P\) with the changed constraint \(x_1\ge 1\) instead of \(x_1 \ge 0\) is \(x^*=[2,0]\) with \(c^Tx^*=2\): a change of size \(\lambda_1^*=0\) in \(p^*\).
    • The solution of \(P\) with the changed constraint \(x_2\ge 1\) instead of \(x_2 \ge 0\) is \(x^*=[1,1]\) with \(c^Tx^*=6\): a change of size \(\lambda_2^*=4\) in \(p^*\).
    • The solution of \(P\) with the changed constraint \(x_1+x_2=1\) instead of \(x_1+x_2=2\) is \(x^*=[1,0]\) with \(c^Tx^*=1\): a change of size \(\nu^*=-1\) in \(p^*\).
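
A minimal numerical re-check of these three perturbations (again a scipy sketch of our own, reusing the data of the example):

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 5.0])
A = np.array([[1.0, 1.0]])

def p_opt(b, lower):
    # Optimal cost of  min c^T x  s.t.  A x = b,  x >= lower (elementwise)
    res = linprog(c, A_eq=A, b_eq=[b], bounds=[(lo, None) for lo in lower])
    return res.fun

p_star = p_opt(2.0, [0.0, 0.0])          # unperturbed optimum: 2.0
print(p_opt(2.0, [1.0, 0.0]) - p_star)   #  0.0  -> matches lambda_1* = 0
print(p_opt(2.0, [0.0, 1.0]) - p_star)   #  4.0  -> matches lambda_2* = 4
print(p_opt(1.0, [0.0, 0.0]) - p_star)   # -1.0  -> matches nu* = -1
```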

Thus, \(\nu^*, \lambda^*\) quantify the economic loss caused by the constraints, and hence the price at which avoiding them would be justified.

Application

If the primal program is an optimal production management problem, the dual program is a problem of optimal pricing by raw material suppliers [2, pp. 77–80]. Dual programs enable sensitivity analysis of the original problem and a better assessment of the constraints. They shift the focus away from optimal decisions within the system and towards optimal system design. They represent interesting problems regardless of their relationship to the primal program.

When \(p^*=d^*\) holds for the pair \((P,D)\) of problems, we speak of strong duality. Strong duality is linked to the fulfillment of Slater's condition on the feasible set of \(P\) and is the standard case for LP, QP, QCQP, SOCP, and SDP. The relationships between \(P\) and \(D\) are exploited by primal-dual solution algorithms, which perform optimization steps for \(P\) and \(D\) in a synchronized, coupled fashion. These methods are extremely popular due to the certifiability of the solution and the rapid convergence achieved by exploiting primal and dual gradients [3, p. 115].
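
As a sketch of how such certified solutions are typically consumed in practice (cvxpy is our choice here, not prescribed by the source; sign conventions for dual variables vary between tools), a modeling layer returns the dual variables alongside the primal solution:

```python
import cvxpy as cp
import numpy as np

c = np.array([1.0, 5.0])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

x = cp.Variable(2, nonneg=True)          # encodes x >= 0
demand = A @ x == b                      # the equality constraint A x = b
prob = cp.Problem(cp.Minimize(c @ x), [demand])
prob.solve()

print(x.value, prob.value)   # x* ~ [2, 0],  p* = 2 (= d*: strong duality for LPs)
print(demand.dual_value)     # multiplier of the demand constraint (nu*, up to sign)
```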

Sources

[1] Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge: Cambridge University Press.

[2] Vanderbei, R. J. (2013). Linear Programming: Foundations and Extensions. New York: Springer US.

[3] De Klerk, E. (2006). Aspects of Semidefinite Programming: Interior Point Algorithms and Selected Applications. Berlin Heidelberg: Springer Science & Business Media.