Fenchel-Rockafellar and Linear Programming

The Fenchel-Rockafellar Theorem is a well-known result from convex analysis that establishes a minimax principle between convex functions and their convex conjugates under some regularity conditions. One fundamental application of this theorem is the characterization of the dual problem of a finite dimensional linear program.

The Fenchel-Rockafellar Theorem

Let ${\displaystyle \phi :X\to \mathbb {R} \cup \left\{+\infty \right\}}$ and ${\displaystyle \psi :X\to \mathbb {R} \cup \left\{+\infty \right\}}$ be convex, lower semicontinuous and proper functions. Suppose that there exists ${\displaystyle x_{0}\in X}$ such that ${\displaystyle \phi (x_{0}),\varphi (x_{0})<\infty }$, where ${\displaystyle \phi }$ is continuous at ${\displaystyle x_{0}}$. Then, it holds that

${\displaystyle {\underset {x\in X}{\inf }}\left\{\phi (x)+\psi (x)\right\}={\underset {u\in X^{*}}{\max }}\left\{-\phi ^{*}(-u)-\psi ^{*}(u)\right\}.}$

Proof

We provide a sketch of the proof, the reader is referred to Brezis[1] for further reading. Note that, by Young's Inequality, for any ${\displaystyle x\in X}$ and ${\displaystyle u\in X^{*}}$ we have

${\displaystyle \phi (x)+\phi ^{*}(-u)+\psi (x)+\psi ^{*}(u)\geq \langle x,-u\rangle +\langle x,u\rangle =0.}$
Hence, the infimum of the left hand side is greater than or equal to the supremum of the right hand side. If the infimum is ${\displaystyle -\infty }$, equality must be obtained, and for every ${\displaystyle u\in X^{*}}$ the supremum is realized; otherwise assume ${\displaystyle \phi (x)+\psi (x)>-\infty }$ for all ${\displaystyle x\in X}$, and let ${\displaystyle m}$ be the value of the infimum.

Let ${\displaystyle A:=\left\{(x,t):\phi (x)\leq t\right\},}$ and observe that since ${\displaystyle \phi }$ is continuous at ${\displaystyle x_{0}}$, the interior of ${\displaystyle C}$ is nonempty. Likewise, let ${\displaystyle B:=\left\{(x,t):t\leq m-\psi (x)\right\}}$. Both sets are convex, and by construction we have ${\displaystyle B\cap {\text{int}}A=\emptyset }$. So a geometric Hahn-Banach Theorem implies that there is some nonzero ${\displaystyle (f,k)\in X^{*}\times \mathbb {R} }$ and ${\displaystyle \alpha \in \mathbb {R} }$ such that

${\displaystyle {\begin{cases}\langle x,f\rangle +kt\geq \alpha ,&(x,t)\in A,\\\langle x,f\rangle +kt\leq \alpha ,&(x,t)\in B.\end{cases}}}$
Since ${\displaystyle \phi }$ is finite at ${\displaystyle x_{0}}$, letting ${\displaystyle t\to +\infty }$ shows that ${\displaystyle k}$ is nonnegative, and if ${\displaystyle k=0}$, continuity and joint finiteness imply ${\displaystyle \lVert f\rVert =0}$, a contradiction, so ${\displaystyle k>0}$. Thus, we can use the inequalities above with the definitions of ${\displaystyle \phi ^{*}}$ and ${\displaystyle \psi ^{*}}$ to conclude that
${\displaystyle \phi ^{*}\left(-{\frac {f}{k}}\right)\leq -{\frac {\alpha }{k}},\quad {\text{and}}\quad \psi ^{*}\left({\frac {f}{k}}\right)\leq {\frac {\alpha }{k}}-m.}$
Which implies that
${\displaystyle -\phi ^{*}\left(-{\frac {f}{k}}\right)-\psi ^{*}\left({\frac {f}{k}}\right)\geq m.}$
Finally, since the supremum includes this term, it must also be greater than or equal to the infimum, which yields their equality.

Application to Linear Programs

Let ${\displaystyle A_{1}\in \mathbb {R} ^{p\times m}}$,${\displaystyle A_{2}\in \mathbb {R} ^{q\times n}}$,${\displaystyle b_{1}\in \mathbb {R} ^{p}}$, ${\displaystyle b_{2}\in \mathbb {R} ^{q}}$,${\displaystyle c_{1}\in \mathbb {R} ^{m}}$ and ${\displaystyle c_{2}\in \mathbb {R} ^{m}}$ and consider the following finite dimensional linear program[2]

{\displaystyle {\mathcal {P}}=\quad {\begin{aligned}&{{\text{ }}\inf }&&\langle c_{1},x\rangle +\langle c_{2},y\rangle \\&{\text{subject to}}&&A_{1}x\geq b_{1}\\&&&A_{2}y=b_{2}\\&&&x\geq 0\\\end{aligned}}}

where ${\displaystyle x\in \mathbb {R} ^{m}}$ and ${\displaystyle y\in \mathbb {R} ^{n}}$. Moreover, let us suppose that there exist at least one feasible solution ${\displaystyle {\tilde {x}}\in \mathbb {R} ^{m},{\tilde {y}}\in \mathbb {R} ^{n}}$ such that ${\displaystyle A_{1}{\tilde {x}}>b_{1}}$ ${\displaystyle A_{2}{\tilde {y}}=b_{2}}$ and ${\displaystyle {\tilde {x}}>0}$. Then, dual problem of ${\displaystyle {\mathcal {P}}}$ is given by

{\displaystyle {\mathcal {D}}=\quad {\begin{aligned}&{{\text{ }}\max }&&\langle b_{1},\alpha \rangle +\langle b_{2},\beta \rangle \\&{\text{subject to}}&&A_{1}^{T}\alpha \leq c_{1}\\&&&A_{2}^{T}\beta =c_{2}\\&&&\alpha \geq 0\\\end{aligned}}}

where ${\displaystyle \alpha \in \mathbb {R} ^{p}}$ and ${\displaystyle \beta \in \mathbb {R} ^{q}}$. Furthermore, ${\displaystyle {\mathcal {D}}}$ attains its optimal value and we have ${\displaystyle {\mathcal {P}}={\mathcal {D}}}$.

Proof

Note that, the linear program may be equivalently written as

${\displaystyle {\underset {x\in \mathbb {R} ^{m},y\in \mathbb {R} ^{n}}{\inf }}\left\{\langle c_{1},x\rangle +\langle c_{2},y\rangle +\mathrm {X} _{\mathbb {R} _{+}}(x)+\mathrm {X} _{\mathbb {R} _{+}}(b_{1}-A_{1}x)+\mathrm {X} _{\{0\}}(b_{2}-A_{2}y)\right\}}$

where ${\displaystyle \mathrm {X} _{A}}$ denotes the characteristic function of ${\displaystyle A\subseteq \mathbb {R} ^{n}}$. Further, let us define ${\displaystyle \phi (x,y)=\langle c_{1},x\rangle +\langle c_{2},y\rangle +\mathrm {X} _{\mathbb {R} _{+}}(x)}$, and ${\displaystyle \varphi (x,y)=\mathrm {X} _{\mathbb {R} _{+}}(b_{1}-A_{1}x)+\mathrm {X} _{\{0\}}(b_{2}-A_{2}y)}$. Note that, both ${\displaystyle \phi }$ and ${\displaystyle \varphi }$ are convex, lower semicontinuous and proper functions, where we have ${\displaystyle \phi ({\tilde {x}},{\tilde {y}}),\varphi ({\tilde {x}},{\tilde {y}})<\infty }$. Moreover, since we have ${\displaystyle {\tilde {x}}>0}$, we observe that ${\displaystyle \phi }$ is continuous at ${\displaystyle ({\tilde {x}},{\tilde {y}})}$. Therefore, by the Fenchel-Rockafellar Theorem we obtain

${\displaystyle {\underset {w\in \mathbb {R} ^{m},t\in \mathbb {R} ^{n}}{\max }}\left\{-\phi ^{\star }(-w,-t)-\varphi ^{\star }(w,t)\right\}={\underset {x\in \mathbb {R} ^{m},y\in \mathbb {R} ^{n}}{\inf }}\left\{\langle c_{1},x\rangle +\langle c_{2},y\rangle +\mathrm {X} _{\mathbb {R} _{+}}(x)+\mathrm {X} _{\mathbb {R} _{+}}(b_{1}-A_{1}x)+\mathrm {X} _{\{0\}}(b_{2}-A_{2}y)\right\}.}$

In terms of convex conjugate of ${\displaystyle \phi }$ we have

{\displaystyle {\begin{aligned}\phi ^{\star }(-w,-t)&={\underset {x\in \mathbb {R} ^{m},y\in \mathbb {R} ^{n}}{\sup }}\left\{-\langle x,w+c_{1}\rangle -\langle y,t+c_{2}\rangle -\mathrm {X} _{\mathbb {R} _{+}}(x)\right\}\\&={\underset {x\in \mathbb {R} _{-}^{m},y\in \mathbb {R} ^{n}}{\sup }}\left\{-\langle x,w+c_{1}\rangle -\langle y,t+c_{2}\rangle \right\}\\&=\mathrm {X} _{\mathbb {R} _{+}}(w+c_{1})+\mathrm {X} _{\{0\}}(t+c_{2}).\end{aligned}}}

Furthermore, for the convex conjugate of ${\displaystyle \varphi }$ we observe

{\displaystyle {\begin{aligned}\varphi ^{\star }(w,t)&={\underset {x\in \mathbb {R} ^{m},y\in \mathbb {R} ^{n}}{\sup }}\left\{\langle x,w\rangle +\langle y,t\rangle -\mathrm {X} _{\mathbb {R} _{+}}(b_{1}-A_{1}x)-\mathrm {X} _{\{0\}}(b_{2}-A_{2}y)\right\}\\&={\underset {x\in \mathbb {R} ^{m},y\in \mathbb {R} ^{n}}{\sup }}\left\{\langle x,w\rangle +\langle y,t\rangle -{\underset {\alpha \in \mathbb {R} _{+}^{p},\beta \in \mathbb {R} ^{q}}{\sup }}\left\{\langle b_{1}-A_{1}x,\alpha \rangle +\langle b_{2}-A_{2}y,\beta \rangle \right\}\right\}\\&={\underset {x\in \mathbb {R} ^{m},y\in \mathbb {R} ^{n}}{\sup }}\left\{{\underset {\alpha \in \mathbb {R} _{+}^{p},\beta \in \mathbb {R} ^{q}}{\inf }}\left\{\langle x,w\rangle +\langle y,t\rangle +\langle A_{1}x-b_{1},\alpha \rangle +\langle A_{2}y-b2,\beta \rangle \right\}\right\}\\&={\underset {x\in \mathbb {R} ^{m},y\in \mathbb {R} ^{n}}{\sup }}\left\{{\underset {\alpha \in \mathbb {R} _{+}^{p},\beta \in \mathbb {R} ^{q}}{\inf }}\left\{\langle x,w+A_{1}^{T}\alpha \rangle +\langle y,t+A_{2}^{T}\beta \rangle -\langle b_{1},\alpha \rangle -\langle b_{2},\beta \rangle \right\}\right\}\end{aligned}}}

Notice that, ${\displaystyle \varphi ^{\star }}$ is finite if and only if we have ${\displaystyle w=A_{1}^{T}\alpha }$ and ${\displaystyle t=A_{2}^{T}\beta }$ for some ${\displaystyle \alpha \in \mathbb {R} _{+}^{p},\beta \in \mathbb {R} ^{q}}$. Therefore, as ${\displaystyle \phi ^{\star }=\mathrm {X} _{\mathbb {R} _{+}}(w+c_{1})+\mathrm {X} _{\{0\}}(t+c_{2})}$, by combining these two results with the Fenchel-Rockafellar Theorem we obtain the dual linear program as follows.

${\displaystyle {\underset {\alpha \in \mathbb {R} _{+}^{p},\beta \in \mathbb {R} ^{q}}{\max }}\left\{\langle b_{1},\alpha \rangle +\langle b_{2},\beta \rangle -\mathrm {X} _{\mathbb {R} _{+}}(c_{1}-A_{1}^{T}\alpha )-\mathrm {X} _{\{0\}}(A_{2}^{T}\beta -c_{2})\right\}}$