# Sliced Wasserstein Distance

The sliced Wasserstein distance $SW_{2}$ is an alternative distance between probability measures that shares many of the properties of the Wasserstein distance while being cheaper to compute. For further reading see Santambrogio (pg. 214-215) and Peyré & Cuturi (pg. 166-169).

## Motivation

One situation in which the Wasserstein distance is easier to compute is the 1D case. In particular, if the measures are of the form $\alpha ={\tfrac {1}{n}}\textstyle \sum _{i=1}^{n}\delta _{x_{i}}$ and $\beta ={\tfrac {1}{n}}\textstyle \sum _{i=1}^{n}\delta _{y_{i}}$, where $x_{1}\leq \ldots \leq x_{n}$ and $y_{1}\leq \ldots \leq y_{n}$, then the Wasserstein distance is given by $W_{p}(\alpha ,\beta )^{p}={\tfrac {1}{n}}\textstyle \sum _{i=1}^{n}|x_{i}-y_{i}|^{p}$ (Peyré & Cuturi pg. 30). In other words, the optimal plan simply matches the $i$-th smallest atom of $\alpha$ to the $i$-th smallest atom of $\beta$. The simplicity of the 1D case raises the question of whether a Wasserstein-like distance over $\mathbb {R} ^{d}$ can be built from the Wasserstein distances between 1D projections of the measures. The sliced Wasserstein distance provides an affirmative answer.
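The 1D formula above translates almost directly into code. The following is a minimal sketch in Python with NumPy, assuming two samples of equal size with uniform weights; the function name `wasserstein_1d` is chosen here for illustration and does not come from the cited references.

```python
import numpy as np

def wasserstein_1d(x, y, p=2):
    """W_p between two empirical measures on R, each with n atoms of mass 1/n.

    Hypothetical helper: sorts both samples and matches them in order,
    which is the closed form quoted above for the 1D case.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1D arrays of the same length")
    # W_p^p = (1/n) * sum_i |x_(i) - y_(i)|^p, then take the p-th root.
    return np.mean(np.abs(x - y) ** p) ** (1.0 / p)
```

For instance, `wasserstein_1d([0.0, 1.0], [1.0, 2.0])` evaluates to `1.0`, since the second sample is the first shifted by one. The cost is dominated by the two sorts, that is, $O(n\log n)$.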

## Definition

Let $P_{\theta }:\mathbb {R} ^{d}\to \mathbb {R}$ be the projection onto the direction of a unit vector $\theta \in \mathbb {S} ^{d-1}$, i.e. $P_{\theta }(x)=x\cdot \theta$. The sliced Wasserstein distance $SW_{2}$ on ${\mathcal {P}}_{2}(\mathbb {R} ^{d})$ is given by

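$SW_{2}(\mu ,\nu )=\left(\int _{\mathbb {S} ^{d-1}}W_{2}(P_{\theta \#}\mu ,P_{\theta \#}\nu )^{2}\,d\theta \right)^{1/2}$

Here $P_{\theta \#}\mu$ denotes the pushforward of $\mu$ under $P_{\theta }$, and the integral over $\theta$ is taken with respect to the surface measure on $\mathbb {S} ^{d-1}$, normalized to have total mass one.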

## Properties

The sliced Wasserstein distance satisfies all the axioms of a metric on ${\mathcal {P}}_{2}(\mathbb {R} ^{d})$. The triangle inequality is inherited from the triangle inequalities for $W_{2}$ and for the $L^{2}$ norm on the sphere, and the non-negativity and symmetry of $W_{2}$ yield the non-negativity and symmetry of $SW_{2}$. The tricky part lies in showing that $SW_{2}(\mu ,\nu )=0$ implies $\mu =\nu$. Note that if $SW_{2}(\mu ,\nu )=0$ then $W_{2}(P_{\theta \#}\mu ,P_{\theta \#}\nu )=0$, and hence $P_{\theta \#}\mu =P_{\theta \#}\nu$, for almost every $\theta$; since $\theta \mapsto W_{2}(P_{\theta \#}\mu ,P_{\theta \#}\nu )$ is continuous on the sphere, this in fact holds for every $\theta$. One can go from that observation to the conclusion that $\mu =\nu$ by appealing to the theory of Radon transforms.
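One way to make the last step concrete is through characteristic functions, which is closely related to (but not the same as) the Radon transform argument. The notation $\hat {\mu }$ for the Fourier transform of a measure is introduced here only for this sketch. For every $t\in \mathbb {R}$ and $\theta \in \mathbb {S} ^{d-1}$,

$\hat {\mu }(t\theta )=\int _{\mathbb {R} ^{d}}e^{-it\,\theta \cdot x}\,d\mu (x)=\int _{\mathbb {R} }e^{-its}\,d(P_{\theta \#}\mu )(s)=\widehat {P_{\theta \#}\mu }(t)$

Hence, if $P_{\theta \#}\mu =P_{\theta \#}\nu$ for every $\theta$, then $\hat {\mu }(t\theta )=\hat {\nu }(t\theta )$ for all $t$ and $\theta$, that is, $\hat {\mu }=\hat {\nu }$ on all of $\mathbb {R} ^{d}$. Since a finite measure is determined by its Fourier transform, $\mu =\nu$.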

It turns out that $W_{2}(P_{\theta \#}\mu ,P_{\theta \#}\nu )\leq W_{2}(\mu ,\nu )$ for every $\theta$ (i.e. the map $\mu \mapsto P_{\theta \#}\mu$ is 1-Lipschitz with respect to $W_{2}$). Averaging over $\theta$, with the measure on the sphere normalized to unit mass, gives $SW_{2}(\mu ,\nu )\leq W_{2}(\mu ,\nu )$, which means that the identity map on ${\mathcal {P}}_{2}(\mathbb {R} ^{d})$ is $W_{2}$-to-$SW_{2}$-continuous. Moreover, if we restrict our domain to a compact $\Omega \subseteq \mathbb {R} ^{d}$, then $({\mathcal {P}}_{2}(\Omega ),W_{2})$ is itself compact. The identity map from $({\mathcal {P}}_{2}(\Omega ),W_{2})$ to $({\mathcal {P}}_{2}(\Omega ),SW_{2})$ is then a continuous bijection from a compact space to a Hausdorff space, so it must be a homeomorphism. This shows that on compact domains $SW_{2}$ and $W_{2}$ induce the same topology.
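The 1-Lipschitz bound can be verified in a few lines. In this sketch, $\gamma$ denotes an optimal transport plan between $\mu$ and $\nu$ for the quadratic cost (notation introduced here). The pushforward $(P_{\theta }\times P_{\theta })_{\#}\gamma$ is a transport plan between $P_{\theta \#}\mu$ and $P_{\theta \#}\nu$, though not necessarily an optimal one, so

$W_{2}(P_{\theta \#}\mu ,P_{\theta \#}\nu )^{2}\leq \int |\theta \cdot (x-y)|^{2}\,d\gamma (x,y)\leq \int |x-y|^{2}\,d\gamma (x,y)=W_{2}(\mu ,\nu )^{2}$

where the second inequality is Cauchy-Schwarz with $|\theta |=1$.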

## Computation

To approximate $SW_{2}$ numerically, one can discretize the sphere, that is, replace the integral over $\mathbb {S} ^{d-1}$ by a finite sum over fixed or randomly sampled directions, and carry out the requisite 1D Wasserstein distance computations. As mentioned in the motivation section, 1D Wasserstein distances are significantly simpler to compute. This is especially so in the case of empirical measures with equally sized supports, where each 1D distance reduces to sorting the projected points. For further details on how to compute $SW_{2}$, see Peyré & Cuturi (pg. 166-169).
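The procedure described above can be sketched as a Monte Carlo estimator. This is a minimal illustration in Python with NumPy, not the algorithm of any particular reference. It assumes two point clouds with the same number of equally weighted points, draws directions uniformly at random (by normalizing Gaussian vectors) instead of using a fixed grid, and averages over directions, which corresponds to normalizing the measure on the sphere to unit mass. The function name `sliced_wasserstein_2` and its parameters are hypothetical.

```python
import numpy as np

def sliced_wasserstein_2(X, Y, n_projections=200, seed=None):
    """Monte Carlo estimate of SW_2 between two empirical measures.

    X, Y: arrays of shape (n, d), each row an atom of mass 1/n.
    Hypothetical helper; assumes both clouds have the same n and d.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim != 2 or X.shape != Y.shape:
        raise ValueError("X and Y must both have shape (n, d)")
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Normalized Gaussian vectors are uniformly distributed on the sphere.
    theta = rng.standard_normal((n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Column k holds the sorted projections of the cloud onto theta[k].
    X_proj = np.sort(X @ theta.T, axis=0)
    Y_proj = np.sort(Y @ theta.T, axis=0)
    # Squared 1D W_2 for each direction via the sorted matching.
    w2_squared = np.mean((X_proj - Y_proj) ** 2, axis=0)
    # Average over directions, then take the square root.
    return np.sqrt(np.mean(w2_squared))
```

Each direction costs $O(n\log n)$ for the sort, so the whole estimate costs $O(Kn\log n+Knd)$ for $K$ directions. The estimate is random unless `seed` is fixed, and its accuracy improves as `n_projections` grows.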