Jonas Nordhaug Myhre*, Michael Kampffmeyer, Robert Jenssen
[email protected]
Machine Learning @ UiT, Institute of Physics and Technology, UiT - The Arctic University of Norway

1. Introduction

In this work we present a novel workflow for manifold learning using density ridge estimation (Ozertem & Erdogmus, 2011).

Our workflow exploits established dimensionality reduction techniques that preserve variance (PCA) or entropy (KECA) (Jenssen, 2010) to capture the lower-dimensional subspace (linear in the case of PCA). We then estimate the manifold directly in the low-dimensional space using density ridge estimation. Finally, to complete the workflow, we train a neural network to learn the inverse mapping from the low-dimensional space back to the original ambient space once the manifold has been estimated.
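As an illustration of the first stage, the following is a minimal NumPy sketch of the PCA projection together with its linear back-projection. It is only a toy stand-in: the names are hypothetical, the KECA alternative is omitted, and in the actual workflow the back-projection is learned by a neural network rather than inverted linearly.

```python
import numpy as np

def pca_project(X, d):
    """Project data X (n x D) onto its top-d principal subspace.

    Returns the low-dimensional coordinates (n x d) together with the
    mean and basis needed to map points back into the ambient space.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    # SVD of the centered data; rows of Vt are principal directions.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:d].T                      # D x d orthonormal basis
    Z = Xc @ W                        # n x d low-dimensional coordinates
    return Z, mu, W

# Toy data: points near a 2-D plane embedded in 5-D ambient space.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5)) \
    + 0.01 * rng.normal(size=(200, 5))
Z, mu, W = pca_project(X, 2)
X_back = Z @ W.T + mu                 # linear back-projection (PCA only)
print(np.allclose(X, X_back, atol=0.1))
```

Because the toy data is essentially rank 2, the linear back-projection recovers it almost exactly; for a curved manifold this is precisely where a learned nonlinear inverse map is needed.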

Density ridge estimation is a recent, theoretically established method for estimating principal manifolds (Genovese et al., 2014). The estimator relies on the gradient and Hessian of a kernel density estimate, both of which are prone to the curse of dimensionality. To introduce density ridges as a practical manifold learning tool, we propose to split the manifold estimation problem into two separate stages: dimensionality reduction and manifold estimation. This represents a shift in strategy: historically, manifold estimation has been posed as learning an 'embedding' or 'unfolding' of the intrinsic manifold, and in most cases such embeddings combine estimating the manifold with reducing the ambient dimension.
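As a rough sketch of the core update behind density ridge estimation, the following simplified NumPy code performs one subspace-constrained mean-shift (SCMS) step (Ozertem & Erdogmus, 2011) toward a d-dimensional ridge of a Gaussian kernel density estimate. The function names and the toy data are hypothetical; a practical implementation needs bandwidth selection and convergence checks.

```python
import numpy as np

def scms_step(x, data, h, d=1):
    """One subspace-constrained mean-shift update toward a d-dimensional
    density ridge of a Gaussian KDE with bandwidth h."""
    diff = data - x                                    # n x D offsets
    w = np.exp(-0.5 * np.sum(diff**2, axis=1) / h**2)  # kernel weights
    # Mean-shift vector of the Gaussian KDE.
    m = (w @ data) / w.sum() - x
    # KDE Hessian at x, up to a positive factor.
    H = (diff * w[:, None]).T @ diff / w.sum() - h**2 * np.eye(len(x))
    # Eigenvectors of the D - d smallest eigenvalues span the directions
    # orthogonal to the ridge; constrain the shift to that subspace.
    evals, evecs = np.linalg.eigh(H)
    V = evecs[:, :len(x) - d]
    return x + V @ (V.T @ m)

# Noisy circle in 2-D: its 1-D density ridge approximates the circle.
rng = np.random.default_rng(1)
t = rng.uniform(0, 2 * np.pi, 400)
data = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.normal(size=(400, 2))
x = np.array([1.2, 0.0])
for _ in range(50):
    x = scms_step(x, data, h=0.3, d=1)
print(round(float(np.linalg.norm(x)), 2))   # radius near 1
```

Iterating this step from every sample point yields a point-set approximation of the ridge; running it after dimensionality reduction is what keeps the gradient and Hessian computations tractable.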


Figure 1. Summary of workflow: φ is either PCA or KECA, φ⁻¹ is the backprojecting neural network, and M̂ is the density ridge estimate of the manifold.

We compute the density ridge estimate, interpolate between two arbitrary images along the geodesic, and finally map the results back to the original pixel space using the neural network.
Figure 2. Results of the workflow using KECA dimensionality reduction on the Frey Face dataset. Left: the density ridge estimate. Right: interpolation along the geodesic (red line).

This yields efficient manifold estimation in low dimension, and makes theoretical concepts from differential geometry, such as geodesics, parallel transport, and exponential maps, available in practice.
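For instance, once the ridge is represented by a set of sample points, a geodesic between two of them can be approximated by a shortest path through a k-nearest-neighbour graph with Euclidean edge weights. The sketch below is a generic graph approximation for illustration, not necessarily the exact procedure used in our experiments.

```python
import heapq
import numpy as np

def knn_geodesic(points, src, dst, k=8):
    """Approximate the geodesic between two sampled manifold points as
    the shortest path through a k-nearest-neighbour graph (Dijkstra)."""
    n = len(points)
    D = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]   # skip self at column 0
    dist = np.full(n, np.inf)
    prev = np.full(n, -1)
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        du, u = heapq.heappop(pq)
        if du > dist[u]:
            continue                            # stale queue entry
        for v in nbrs[u]:
            alt = du + D[u, v]
            if alt < dist[v]:
                dist[v] = alt
                prev[v] = u
                heapq.heappush(pq, (alt, v))
    path = [dst]                                # walk predecessors back
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

# Points along a quarter circle: the geodesic should follow the arc.
t = np.linspace(0, np.pi / 2, 30)
pts = np.c_[np.cos(t), np.sin(t)]
path = knn_geodesic(pts, 0, 29, k=4)
print(path[0], path[-1])   # → 0 29
```

Interpolating between two images then amounts to reading off the ridge points along this path and mapping each back to pixel space.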

Figure 1 shows a conceptual sketch of the suggested workflow.
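The back-projection φ⁻¹ in Figure 1 is a regressor trained on pairs of low-dimensional and ambient coordinates. As a hedged toy illustration, the following fits a one-hidden-layer network with plain gradient descent to a synthetic 1-D map; the architecture, sizes, and learning rate are placeholders, not the network used in our experiments.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for phi^{-1}: regress "ambient" targets (here y = sin(z))
# from low-dimensional inputs z with a one-hidden-layer tanh network.
Z = np.linspace(-3, 3, 200)[:, None]           # low-dimensional inputs
Y = np.sin(Z)                                  # ambient-space targets

H = 32                                         # hidden width
W1 = rng.normal(scale=0.5, size=(1, H)); b1 = np.zeros(H)
W2 = rng.normal(scale=0.5, size=(H, 1)); b2 = np.zeros(1)
lr = 0.05
for _ in range(8000):
    A = np.tanh(Z @ W1 + b1)                   # hidden activations
    P = A @ W2 + b2                            # network predictions
    G = 2 * (P - Y) / len(Z)                   # d(MSE)/dP
    gW2 = A.T @ G; gb2 = G.sum(0)
    GA = (G @ W2.T) * (1 - A**2)               # backprop through tanh
    gW1 = Z.T @ GA; gb1 = GA.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1             # gradient descent step
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((np.tanh(Z @ W1 + b1) @ W2 + b2 - Y) ** 2))
print(round(mse, 3))
```

In the full workflow the inputs would be points on the estimated ridge M̂ and the targets the corresponding original ambient-space samples.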


2. Experiments: Faces and Digits

We include two experiments in which we perform image interpolation using our workflow.

Figure 3. Results of the workflow using PCA on the '1' digits of the MNIST dataset. Left: the density ridge estimate. Right: interpolation along the geodesic (red line).

Ambient space manifold learning

References

Dollár, Piotr, Rabaud, Vincent, and Belongie, Serge. Non-isometric manifold learning: Analysis and an algorithm. In Proceedings of the 24th International Conference on Machine Learning, pp. 241–248. ACM, 2007.

Genovese, Christopher R., Perone-Pacifico, Marco, Verdinelli, Isabella, and Wasserman, Larry. Nonparametric ridge estimation. The Annals of Statistics, 42(4):1511–1545, 2014.

Hauberg, Søren, Freifeld, Oren, and Black, Michael J. A geometric take on metric learning. In Advances in Neural Information Processing Systems, pp. 2024–2032, 2012.

Jenssen, Robert. Kernel entropy component analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(5):847–860, 2010.

Myhre, Jonas Nordhaug, Shaker, Matineh, Kaba, Devrim, Jenssen, Robert, and Erdogmus, Deniz. Manifold unwrapping using density ridges. arXiv preprint arXiv:1604.01602, 2016.

Ozertem, Umut and Erdogmus, Deniz. Locally defined principal curves and surfaces. Journal of Machine Learning Research, 12:1249–1286, 2011.

Shaker, Matineh, Myhre, Jonas N., Kaba, M. Devrim, and Erdogmus, Deniz. Invertible nonlinear cluster unwrapping. In Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on, pp. 1–6. IEEE, 2014.