# Gradient flows in Hilbert spaces

Gradient flows in Hilbert spaces are curves whose time derivative is constrained by a gradient. Specifically, a gradient flow is a Hilbert space valued function whose time derivative lies in some generalized collection of gradient vectors. Gradient flows are a key topic in the study of nonlinear time evolution partial differential equations. In this exposition, we draw from Ambrosio et al.'s Lectures on Optimal Transport and Evans' Partial Differential Equations.

## Introduction

The heat equation is a classic example of a time evolution partial differential equation. In particular, it is a linear parabolic partial differential equation. Such PDEs are well understood and can be solved by several different approaches. One particularly interesting technique is to view the PDE as a Banach-space valued ODE in the time variable. In this case, we can try to write the solution of the PDE as a flow in time, which generalizes the exponential function. The techniques used to find such a solution ultimately result in the Hille-Yosida theorem, which gives necessary and sufficient conditions for an operator ${\displaystyle T}$ to be the infinitesimal generator of a contraction semigroup solving the given PDE[1]. In some sense, these ideas can be extended to nonlinear time evolution PDEs, leading to the general notion of flows on Hilbert spaces. In this article we discuss how the theory of flows can be used to obtain existence of a solution of a "nonlinear heat equation."

## Definitions

Let ${\displaystyle H}$ be a Hilbert space with inner product ${\displaystyle \langle \cdot ,\cdot \rangle }$ and induced norm ${\displaystyle ||\cdot ||}$. Throughout this exposition, we assume that ${\displaystyle f:H\rightarrow (-\infty ,\infty ]}$ is proper, so that the domain on which it takes finite values, ${\displaystyle {\text{dom}}(f)}$, is not empty.

First, we recall the notion of the subdifferential, rewritten from Ambrosio et al.'s definition[2].

The subdifferential of ${\displaystyle f}$ at ${\displaystyle x\in {\text{dom}}(f)}$ is the collection,

${\displaystyle \partial (f(x)):=\left\lbrace v\in H:f(u)\geq f(x)+\langle v,u-x\rangle +o(||u-x||){\text{ as }}u\rightarrow x\right\rbrace }$

Remark: observe that we are not assuming ${\displaystyle f}$ is convex, only that it is proper. In fact, Ambrosio et al. discuss the case when ${\displaystyle f}$ is ${\displaystyle \lambda }$-convex, which generalizes the notion of convexity. We have omitted most of that discussion for the sake of clarity and brevity. If ${\displaystyle f}$ is indeed convex, then the subdifferential becomes

${\displaystyle \partial (f(x))=\left\lbrace v\in H:f(u)\geq f(x)+\langle v,u-x\rangle \quad {\text{for each }}u\in H\right\rbrace }$
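
As a minimal illustration (not taken from the cited references), take ${\displaystyle H=\mathbb {R} }$ and ${\displaystyle f(x)=|x|}$, which is convex. At ${\displaystyle x=0}$ the inequality above reads ${\displaystyle |u|\geq vu}$ for each ${\displaystyle u}$, which holds exactly when ${\displaystyle |v|\leq 1}$, so the subdifferential at the origin is the interval ${\displaystyle [-1,1]}$. The sketch below tests the inequality on a finite grid; the helper name `is_subgradient` and the grid are assumptions made for illustration, and a grid test can only refute membership, not prove it.

```python
def f(x):
    """Convex, non-differentiable example: f(x) = |x| on the real line."""
    return abs(x)


def is_subgradient(v, x, grid, tol=1e-12):
    """Check f(u) >= f(x) + v*(u - x) for every u in a finite grid."""
    return all(f(u) >= f(x) + v * (u - x) - tol for u in grid)


# Test points u in [-2, 2]; a finite grid is only a necessary check.
grid = [k / 100.0 for k in range(-200, 201)]

inside = [is_subgradient(v, 0.0, grid) for v in (-1.0, -0.3, 0.0, 0.7, 1.0)]
outside = [is_subgradient(v, 0.0, grid) for v in (-1.5, 1.01)]
print(inside)   # slopes in [-1, 1] pass the test at x = 0
print(outside)  # slopes outside [-1, 1] fail the test at x = 0
```

Away from the origin ${\displaystyle f}$ is differentiable, and the same test singles out the ordinary derivative as the only admissible slope.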

A gradient flow ${\displaystyle x(t):(0,\infty )\rightarrow {\text{dom}}(f)}$ is a locally absolutely continuous function with the property that ${\displaystyle x'(t)\in -\partial (f(x(t)))}$ for almost every ${\displaystyle t}$ (with respect to Lebesgue measure)[2]. The minus sign encodes descent: the curve moves against the (sub)gradient, matching the sign in the Euler schemes below. Note that local absolute continuity of ${\displaystyle x(t)}$ guarantees that ${\displaystyle x'(t)}$ exists almost everywhere[2]. It will be particularly useful to identify the starting point of a gradient flow ${\displaystyle x(t)}$, which is given by ${\displaystyle x_{0}:=\lim _{t\rightarrow 0^{+}}x(t)}$.

## Main Existence Theorem

### From Linear to Nonlinear Operators

Recall that the Hille-Yosida theorem states the following:

Theorem[1] Let ${\displaystyle T}$ be a closed, densely defined linear operator on a Banach space (note that ${\displaystyle T}$ need not be bounded). Denote by ${\displaystyle \rho (T):=\left\lbrace z\in \mathbb {C} :(z-T){\text{ has a bounded inverse}}\right\rbrace }$ the resolvent set of ${\displaystyle T}$. Then ${\displaystyle T}$ is the infinitesimal generator of a contraction semigroup ${\displaystyle S_{t}}$ if and only if, for each ${\displaystyle \lambda >0}$, we have ${\displaystyle \lambda \in \rho (T)}$ and ${\displaystyle ||(\lambda -T)^{-1}||\leq {\frac {1}{\lambda }}}$.

Using this result, we may view a linear time evolution PDE as a Banach space valued problem of the following form:

${\displaystyle \phi '(t)=T\phi (t)}$
${\displaystyle \phi (0)=\phi _{0}}$

which has solution ${\displaystyle \phi (t):=S_{t}\phi _{0}}$[1]. In the ODE above, ${\displaystyle T}$ denotes the linear differential operator in the original linear time evolution PDE.
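
A finite-dimensional sanity check of these statements can be sketched as follows. It assumes a hypothetical diagonal operator with entries ${\displaystyle -1}$ and ${\displaystyle -3}$ on ${\displaystyle \mathbb {R} ^{2}}$, chosen only for illustration, for which the semigroup is the entrywise exponential and the resolvent is diagonal as well.

```python
import math

# Hypothetical diagonal operator T = diag(-1, -3) on R^2 (illustration only).
EIGENVALUES = (-1.0, -3.0)


def semigroup(t, phi0):
    """S_t phi0 = exp(t T) phi0, computed entrywise for a diagonal T."""
    return tuple(math.exp(t * mu) * x for mu, x in zip(EIGENVALUES, phi0))


def resolvent_norm(lam):
    """Operator norm of (lam - T)^{-1}: the largest |1 / (lam - mu)|."""
    return max(abs(1.0 / (lam - mu)) for mu in EIGENVALUES)


def norm(x):
    """Euclidean norm on R^2."""
    return math.sqrt(sum(c * c for c in x))


phi0 = (2.0, -1.0)
# Resolvent bound ||(lam - T)^{-1}|| <= 1/lam for a few lam > 0.
bounds_hold = all(resolvent_norm(lam) <= 1.0 / lam for lam in (0.1, 1.0, 10.0))
# Contraction: ||S_t phi0|| never exceeds ||phi0||.
contractive = all(norm(semigroup(t, phi0)) <= norm(phi0) for t in (0.0, 0.5, 2.0))
print(bounds_hold, contractive)
```

In this bounded setting the theorem is not needed to construct the semigroup; the example only makes the resolvent bound and the contraction property concrete.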

This approach works well when ${\displaystyle T}$ is linear, but requires significant modification when ${\displaystyle T}$ is nonlinear. In particular, observe that the Hille-Yosida theorem makes use of the resolvent of the relevant linear operator. In some sense, the resolvent bypasses the problems that arise from the unboundedness of the linear operator; in particular, it makes sense to discuss power series expansions involving the resolvent. In the nonlinear case, one must introduce a generalization of the resolvent.

In the notation of Ambrosio et al., the analogue of the resolvent used in the proof of the Brézis-Komura Theorem is ${\displaystyle (I+\epsilon T)^{-1}}$, where ${\displaystyle T}$ is a nonlinear operator on a Hilbert space ${\displaystyle H}$[2]. Moreover, the existence argument in the classical proof of the Brézis-Komura Theorem requires a modification of the generalized resolvent of ${\displaystyle T}$. This modification is the Yosida regularization[2],

${\displaystyle Y(T,\epsilon ):={\frac {I-(I+\epsilon T)^{-1}}{\epsilon }}}$

where ${\displaystyle \epsilon }$ is a positive parameter. The Yosida regularization, being Lipschitz as an operator on ${\displaystyle H}$ with Lipschitz constant ${\displaystyle {\frac {2}{\epsilon }}}$[3], allows one to construct the starting point for a solution in the Brézis-Komura Theorem[1]. This is analogous to the Moreau-Yosida regularization, where a lower semicontinuous function is approximated by a Lipschitz function.
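
To make the regularization concrete, here is a minimal sketch assuming ${\displaystyle H=\mathbb {R} }$ and ${\displaystyle T=\partial f}$ with ${\displaystyle f(x)=|x|}$ (an example chosen for illustration, not taken from the references). In this case the generalized resolvent is soft-thresholding by ${\displaystyle \epsilon }$, and the regularized operator equals ${\displaystyle x/\epsilon }$ clipped to ${\displaystyle [-1,1]}$. The function names are hypothetical.

```python
def resolvent(x, eps):
    """(I + eps * T)^{-1}(x) for T the subdifferential of |x|: soft-thresholding."""
    if x > eps:
        return x - eps
    if x < -eps:
        return x + eps
    return 0.0


def yosida(x, eps):
    """Yosida regularization (x - resolvent(x)) / eps of the subdifferential of |x|."""
    return (x - resolvent(x, eps)) / eps


eps = 0.25
xs = [k / 50.0 for k in range(-100, 101)]

# Largest difference quotient of the regularized operator on the grid.
lipschitz_estimate = max(
    abs(yosida(a, eps) - yosida(b, eps)) / abs(a - b)
    for a in xs for b in xs if a != b
)
print(lipschitz_estimate, 2.0 / eps)
```

On this example the largest difference quotient is ${\displaystyle 1/\epsilon }$, which lies within the bound ${\displaystyle 2/\epsilon }$ quoted above, whereas the original set-valued operator jumps at the origin and is not Lipschitz at all.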

Another way to see the Yosida regularization is via a time discretization, the approach first used by Jordan, Kinderlehrer and Otto[4]. There are two natural ways of discretizing a gradient flow with time step ${\displaystyle \epsilon }$, namely the Euler schemes. The explicit one is

${\displaystyle {\frac {X_{\epsilon }^{n+1}-X_{\epsilon }^{n}}{\epsilon }}=-\nabla f(X_{\epsilon }^{n})}$,

where ${\displaystyle X_{\epsilon }^{0}=X^{0}}$, and the implicit one is

${\displaystyle {\frac {X_{\epsilon }^{n+1}-X_{\epsilon }^{n}}{\epsilon }}=-\nabla f(X_{\epsilon }^{n+1})}$,

where ${\displaystyle X_{\epsilon }^{0}=X^{0}.}$

Note that for the implicit scheme, ${\displaystyle X_{\epsilon }^{n}=(I+\epsilon \nabla f)^{-n}X_{\epsilon }^{0}.}$
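
The difference between the two schemes can be sketched on a one-dimensional example, assuming ${\displaystyle f(x)=kx^{2}/2}$ with a hypothetical stiffness ${\displaystyle k=10}$ and a deliberately large step ${\displaystyle \epsilon =0.25}$. For this ${\displaystyle f}$ the implicit equation ${\displaystyle X_{\epsilon }^{n+1}+\epsilon kX_{\epsilon }^{n+1}=X_{\epsilon }^{n}}$ can be solved by hand, so no nonlinear solver is needed.

```python
def explicit_euler(x0, k, eps, steps):
    """Explicit scheme for f(x) = k*x^2/2: X_{n+1} = X_n - eps * k * X_n."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - eps * k * xs[-1])
    return xs


def implicit_euler(x0, k, eps, steps):
    """Implicit scheme: solve X_{n+1} + eps * k * X_{n+1} = X_n at each step."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] / (1.0 + eps * k))
    return xs


def energy(x, k):
    """The functional f(x) = k*x^2/2 whose gradient flow is discretized."""
    return 0.5 * k * x * x


k, eps, steps = 10.0, 0.25, 8   # eps * k = 2.5 is deliberately large
explicit = explicit_euler(1.0, k, eps, steps)
implicit = implicit_euler(1.0, k, eps, steps)

print([round(energy(x, k), 4) for x in explicit])  # grows, since |1 - eps*k| > 1
print([round(energy(x, k), 4) for x in implicit])  # decreases at every step
```

With a small enough step both schemes decrease the energy here; the point of the sketch is that the implicit scheme does so for every positive step.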

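The explicit scheme is easier to implement, but the implicit one is more natural here, since ${\displaystyle f(X_{\epsilon }^{n})}$ does not increase with ${\displaystyle n}$, just as ${\displaystyle f(x(t))}$ does not increase with ${\displaystyle t}$. Using the subdifferential ${\displaystyle \partial f}$ that appears in our definition of a gradient flow in place of ${\displaystyle \nabla f}$, we can define the Yosida regularization of ${\displaystyle \partial f}$ with step ${\displaystyle \epsilon }$,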

${\displaystyle (\partial f)_{\epsilon }:={\frac {I-(I+\epsilon (\partial f))^{-1}}{\epsilon }}.}$

Now the implicit scheme can be interpreted as the explicit scheme applied to the Yosida regularization above. Namely,

${\displaystyle X_{\epsilon }^{n+1}-X_{\epsilon }^{n}=(I+\epsilon \partial f)^{-1}X_{\epsilon }^{n}-X_{\epsilon }^{n}=-\epsilon (\partial f)_{\epsilon }X_{\epsilon }^{n},}$

as claimed (more details can be found in [5], [6], [2]).
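
This identity can be checked numerically in a minimal sketch, again assuming the illustrative choice ${\displaystyle f(x)=|x|}$ on ${\displaystyle \mathbb {R} }$, for which the implicit step is soft-thresholding by ${\displaystyle \epsilon }$. The names `resolvent`, `yosida` and `gaps` are hypothetical.

```python
def resolvent(x, eps):
    """Implicit Euler step for f(x) = |x|: solve y + eps * s = x with s in d|y|."""
    if x > eps:
        return x - eps
    if x < -eps:
        return x + eps
    return 0.0


def yosida(x, eps):
    """Yosida regularization of the subdifferential of |x| with step eps."""
    return (x - resolvent(x, eps)) / eps


eps = 0.3
x_implicit = 1.0   # iterates of the implicit scheme
x_explicit = 1.0   # iterates of the explicit scheme on the regularized operator
gaps = []
for n in range(6):
    x_implicit = resolvent(x_implicit, eps)
    x_explicit = x_explicit - eps * yosida(x_explicit, eps)
    gaps.append(abs(x_implicit - x_explicit))
    print(n + 1, x_implicit, x_explicit)
```

The two sequences agree up to floating-point rounding, and both reach the minimizer ${\displaystyle 0}$ after finitely many steps and then stay there.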

In order to formulate the Brézis-Komura Theorem, we introduce ${\displaystyle \lambda }$-convexity:

Definition[2] Given ${\displaystyle \lambda \in \mathbb {R} ,}$ we say that ${\displaystyle f:H\rightarrow (-\infty ,\infty ]}$ is ${\displaystyle \lambda }$-convex if ${\displaystyle f-{\frac {\lambda }{2}}||\cdot ||^{2}}$ is convex.
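
For instance, on ${\displaystyle H=\mathbb {R} }$ the function ${\displaystyle f(x)=\cos x}$ is not convex, but it is ${\displaystyle \lambda }$-convex with ${\displaystyle \lambda =-1}$, since ${\displaystyle \cos x+x^{2}/2}$ has second derivative ${\displaystyle 1-\cos x\geq 0}$. The sketch below tests midpoint convexity on a finite grid; this example and the helper names are assumptions made for illustration, and passing a grid test is only a necessary condition for convexity.

```python
import math


def is_midpoint_convex(g, points, tol=1e-12):
    """Check g((a+b)/2) <= (g(a) + g(b))/2 for all pairs of sample points."""
    return all(
        g(0.5 * (a + b)) <= 0.5 * (g(a) + g(b)) + tol
        for a in points for b in points
    )


def shifted(f, lam):
    """Return x -> f(x) - (lam/2) * x^2, the function tested in the definition."""
    return lambda x: f(x) - 0.5 * lam * x * x


points = [k / 10.0 for k in range(-60, 61)]

print(is_midpoint_convex(math.cos, points))                 # cos itself fails
print(is_midpoint_convex(shifted(math.cos, -1.0), points))  # cos + x^2/2 passes
```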

### The Brézis-Komura Theorem

We restate the Brézis-Komura Theorem as it is stated in Ambrosio et al.

Theorem[2] Assume that ${\displaystyle f}$ is lower semicontinuous and ${\displaystyle \lambda }$-convex for some ${\displaystyle \lambda \in \mathbb {R} }$. For every ${\displaystyle x_{0}\in {\overline {{\text{dom}}(f)}}}$, there exists a unique gradient flow ${\displaystyle x(t):=S_{t}x_{0}}$ starting at ${\displaystyle x_{0}.}$ The family of operators ${\displaystyle \left\lbrace S_{t}\right\rbrace _{t>0}}$ satisfies the semigroup property ${\displaystyle S_{t+s}=S_{t}\circ S_{s}}$ and the contractivity property
${\displaystyle ||S_{t}x_{0}-S_{t}y_{0}||\leq e^{-\lambda t}||x_{0}-y_{0}||\quad \forall x_{0},y_{0}\in {\overline {{\text{dom}}(f)}}.}$
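
Both properties can be observed on a one-dimensional example where the flow is known in closed form. The sketch assumes ${\displaystyle f(x)=|x|+x^{2}/2}$ on ${\displaystyle \mathbb {R} }$, which is ${\displaystyle \lambda }$-convex with ${\displaystyle \lambda =1}$, and uses the descent convention of the Euler schemes above, ${\displaystyle x'(t)\in -\partial f(x(t))}$. Solving ${\displaystyle x'=-1-x}$ for ${\displaystyle x>0}$ (and symmetrically for ${\displaystyle x<0}$) until the curve reaches the minimizer ${\displaystyle 0}$ gives the formula coded in `flow`; the example and the function name are not taken from the references.

```python
import math


def flow(t, x0):
    """Closed-form S_t x0 for f(x) = |x| + x^2/2 (an assumed example, lambda = 1)."""
    magnitude = max((abs(x0) + 1.0) * math.exp(-t) - 1.0, 0.0)
    return math.copysign(magnitude, x0)


starts = [-2.0, -0.5, 0.0, 0.3, 1.0, 4.0]
times = [0.0, 0.1, 0.5, 1.0, 3.0]
tol = 1e-12  # slack for floating-point rounding

# Semigroup property: S_{t+s} x0 agrees with S_t(S_s x0).
semigroup_ok = all(
    abs(flow(t + s, x) - flow(t, flow(s, x))) <= tol
    for t in times for s in times for x in starts
)

# Contractivity with lambda = 1: |S_t x - S_t y| <= exp(-t) |x - y|.
contraction_ok = all(
    abs(flow(t, x) - flow(t, y)) <= math.exp(-t) * abs(x - y) + tol
    for t in times for x in starts for y in starts
)
print(semigroup_ok, contraction_ok)
```

Note that every curve reaches ${\displaystyle 0}$ in finite time and then stays there, which is possible because ${\displaystyle 0\in \partial f(0)=[-1,1]}$.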

## Example and Applications

As suggested by our previous discussion, the Brézis-Komura Theorem may be used to assert the existence of flows solving certain nonlinear time-evolution PDEs. Several nonlinear time-evolution PDEs and their solutions are discussed by both Ambrosio et al. and Evans. A simple example is a particular case (namely ${\displaystyle p=4}$) of the so-called ${\displaystyle p}$-Laplace equation[2] on ${\displaystyle L^{2}(\mathbb {R} ^{n})}$, which asks for a solution of the heat-like equation ${\displaystyle u_{t}-\nabla \cdot (|\nabla u|^{2}\nabla u)=0}$. Motivated by the variational formulation of this problem, one may consider the function ${\displaystyle T(u):=\int _{\mathbb {R} ^{n}}{\frac {|\nabla u|^{4}}{4}}}$ whenever ${\displaystyle u\in L^{2}(\mathbb {R} ^{n})\cap W^{1,4}(\mathbb {R} ^{n})}$, with ${\displaystyle T(u):=\infty }$ otherwise. Applying the Brézis-Komura Theorem yields a flow ${\displaystyle x(t)}$ such that ${\displaystyle x'(t)=\nabla \cdot (|\nabla x|^{2}\nabla x)}$. Some care must be taken to show that the subdifferential of ${\displaystyle T}$ coincides, up to sign, with the right-hand side of the expression for ${\displaystyle x'(t)}$.
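
The formal computation behind this identification can be sketched as follows, assuming that ${\displaystyle u}$ is smooth enough to integrate by parts and that ${\displaystyle \varphi }$ is a smooth, compactly supported test function (both are assumptions made here for illustration; the rigorous identification of the subdifferential needs more care):

${\displaystyle {\frac {d}{ds}}T(u+s\varphi ){\Big |}_{s=0}=\int _{\mathbb {R} ^{n}}|\nabla u|^{2}\nabla u\cdot \nabla \varphi =-\int _{\mathbb {R} ^{n}}\nabla \cdot (|\nabla u|^{2}\nabla u)\,\varphi .}$

Formally, then, the ${\displaystyle L^{2}}$ gradient of ${\displaystyle T}$ at ${\displaystyle u}$ is ${\displaystyle -\nabla \cdot (|\nabla u|^{2}\nabla u)}$, and moving against this gradient gives the evolution equation above.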

In general, the ${\displaystyle p}$-Laplace equation is given by ${\displaystyle u_{t}-\nabla \cdot (|\nabla u|^{p-2}\nabla u)=0}$ and is solved on ${\displaystyle L^{2}(\mathbb {R} ^{n})}$. Note that the ${\displaystyle p}$-Laplace equation is a generalization of the heat equation, which we recover on ${\displaystyle L^{2}(\mathbb {R} ^{n})}$ when ${\displaystyle p=2}$. In that case, the Brézis-Komura Theorem may be applied to the function ${\displaystyle T(u):=\int _{\mathbb {R} ^{n}}{\frac {|\nabla u|^{2}}{2}}}$ whenever ${\displaystyle u\in H^{1}(\mathbb {R} ^{n})}$, with ${\displaystyle T(u):=\infty }$ otherwise. Thus, we obtain the existence of a gradient flow which satisfies the heat equation. The heat equation can also be seen as a gradient flow of the energy functional ${\displaystyle T(u)={\frac {1}{2}}\int |u|^{2},}$ where the underlying metric is the ${\displaystyle H^{-1}}$ metric. Hence, a given PDE may admit more than one interpretation as a gradient flow.
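
The case ${\displaystyle p=2}$ can be illustrated with a discrete analogue. The sketch below assumes a one-dimensional periodic grid instead of ${\displaystyle \mathbb {R} ^{n}}$ and uses explicit Euler steps with a small time step, chosen for simplicity; the grid size, initial datum and function names are hypothetical. Each step moves against the discrete ${\displaystyle L^{2}}$ gradient of the Dirichlet energy, which is the explicit finite-difference scheme for the heat equation.

```python
import math


def dirichlet_energy(u, h):
    """Discrete version of (1/2) * integral |u'|^2 on a periodic grid."""
    n = len(u)
    return 0.5 * h * sum(((u[(i + 1) % n] - u[i]) / h) ** 2 for i in range(n))


def laplacian(u, h):
    """Second-difference Laplacian: minus the discrete L^2 gradient of the energy."""
    n = len(u)
    return [(u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / h ** 2 for i in range(n)]


n = 64
h = 1.0 / n
tau = 0.2 * h * h   # small step so that the explicit scheme is stable
u = [math.sin(2 * math.pi * i * h) + 0.5 * math.cos(6 * math.pi * i * h)
     for i in range(n)]

energies = [dirichlet_energy(u, h)]
for _ in range(200):
    lap = laplacian(u, h)
    u = [ui + tau * li for ui, li in zip(u, lap)]   # explicit Euler step of u_t = u_xx
    energies.append(dirichlet_energy(u, h))

print(energies[0], energies[-1])
```

The recorded energies decrease at every step, mirroring the fact that the functional does not increase along its gradient flow.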

If we look at the similar energy functional ${\displaystyle T(\rho )=|\rho |_{H^{-1}(\Omega )}^{2}}$ for a probability density function ${\displaystyle \rho ,}$ we get the gradient flow ${\displaystyle {\frac {\partial \rho }{\partial t}}=-\nabla \cdot (\rho \nabla \Delta ^{-1}\rho ),}$ which is studied in Ginzburg-Landau dynamics (see [6]).

The Brézis-Komura Theorem is also used to assert the existence of the Riemannian heat semigroup[2]. This forms the starting point for connecting optimal transport and Ricci curvature.

Consider Boltzmann's ${\displaystyle H}$-functional ${\displaystyle T(\rho )=\int \rho \log(\rho )}$ for a probability density function ${\displaystyle \rho }$. If we use the Wasserstein distance ${\displaystyle W_{2}}$, then the gradient flow satisfies the continuity equation ${\displaystyle {\frac {\partial \rho }{\partial t}}+\nabla \cdot (v\rho )=0,}$ where ${\displaystyle v={\frac {-\nabla \rho }{\rho }}.}$ Hence, this is again the heat equation, ${\displaystyle {\frac {\partial \rho }{\partial t}}=\Delta \rho .}$
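
The velocity field can be recovered by a formal computation, assuming that ${\displaystyle \rho }$ is smooth and strictly positive and using the standard heuristic (stated here as an assumption, not derived) that the ${\displaystyle W_{2}}$ gradient flow of ${\displaystyle T}$ transports ${\displaystyle \rho }$ with velocity ${\displaystyle v=-\nabla {\frac {\delta T}{\delta \rho }}}$:

${\displaystyle {\frac {\delta T}{\delta \rho }}=\log \rho +1,\qquad v=-\nabla (\log \rho +1)=-{\frac {\nabla \rho }{\rho }},\qquad \nabla \cdot (v\rho )=-\nabla \cdot \nabla \rho =-\Delta \rho .}$

Substituting the last identity into the continuity equation gives ${\displaystyle {\frac {\partial \rho }{\partial t}}=\Delta \rho }$.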

## References

1. L. Evans, Partial Differential Equations, pp. 435-443, 562-579
2. L. Ambrosio, E. Brué, D. Semola, Lectures on Optimal Transport, pp. 109-124, 138, 230
3. H. Brezis, Functional Analysis, Sobolev Spaces and Partial Differential Equations, pp. 181-182
4. R. Jordan, D. Kinderlehrer, F. Otto, The variational formulation of the Fokker-Planck equation, SIAM J. Math. Anal. 29(1), 1998
5. F. Santambrogio, Optimal Transport for Applied Mathematicians, p. 287
6. C. Villani, Topics in Optimal Transportation, pp. 260-261