The Isomap Algorithm and Graphical Structures

Isomap, presented by Tenenbaum et al. (1), is a well-known algorithm for nonlinear dimensionality reduction and has been applied extensively in various areas over the past five years. Recently, however, we have found that the Isomap algorithm is topologically unstable for peripherally symmetric graphs (2), or for such graphs embedded on surfaces (3), and yields random embeddings. Fig. 1 illustrates two examples: two distinct peripherally symmetric graphs drawn on the S-shaped manifold (4) and on the double helicoid, respectively. Both manifolds are isometric to a square sheet in the 2D plane; in other words, each can be formed by bending a square sheet in its own way. It follows directly that the faithful embeddings (5) of these peripherally symmetric graphs have a square peripheral shape. The embeddings derived by the Isomap algorithm are evidently not faithful and exhibit randomness.
Fig. 1. S-shaped Manifold. (A) The S-shaped manifold and the embedded peripherally symmetric graph. Here, the four edges of the S-shaped manifold are of equal length. (B) The peripherally symmetric graph. (C) The faithful embedding of the peripherally symmetric graph on the 2D plane. (D) The embedding yielded by Isomap. Double Helicoid. The panels for the double helicoid correspond to those for the S-shaped manifold.
In fact, the theoretical framework of Isomap is based on that of the canonical MDS algorithm (6), which is widely used in various scientific fields (7-14). Both algorithms obtain low-dimensional embeddings in the same manner, so MDS cannot give faithful drawings of peripherally symmetric graphs either. We give two illustrations in Fig. S1. The failure stems from the special structure of the distance matrix of a peripherally symmetric graph. It is not hard to see that, for the cycle in question, the distance matrix is $D = a(J - I)$, where $J$ is the all-one matrix, $I$ is the identity matrix, and $a$ is a positive constant. Applying the centering matrix $H = I - \frac{1}{c} J$ to $D$ gives $\tau(D) = -\frac{1}{2} HDH = \frac{a}{2} H$, where $c$ is the number of vertices on the cycle; this follows from $HJ = 0$ and $H^2 = H$. Because $H$ has the eigenvalue 1 with multiplicity $c - 1$, its eigenvectors cannot be stably determined, which leads to the randomness of the embeddings yielded by the Isomap and MDS algorithms. Such peripherally symmetric structures occur widely in nature, for example in the double helix of DNA (15), in various molecular structures (16), and in the underlying structures of data in everyday research. Caution must therefore be exercised when using the Isomap and MDS algorithms in scientific research. When such cases occur, one may resort to more robust algorithms, for example Laplacian Eigenmaps (17).
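The degeneracy described above can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: it applies the centering formula exactly as written in the text to a distance matrix of the form a(J − I), and the values of `c` and `a`, the helper name `centered_matrix`, and the random seed are illustrative assumptions.

```python
import numpy as np


def centered_matrix(D):
    """Return tau(D) = -1/2 * H D H with H = I - J/c, as written in the text."""
    c = D.shape[0]
    H = np.eye(c) - np.ones((c, c)) / c
    return -0.5 * H @ D @ H


# Illustrative values (assumptions): c cycle vertices, common distance a.
c, a = 8, 1.0
I = np.eye(c)
J = np.ones((c, c))
D = a * (J - I)      # distance matrix of the form a(J - I)
H = I - J / c        # centering matrix

T = centered_matrix(D)

# Eigenvalues in ascending order: one zero (the all-ones direction)
# and c - 1 copies of a/2.
eigvals = np.linalg.eigvalsh(T)

# Any orthonormal pair of vectors orthogonal to the all-ones vector is a
# valid pair of "top" eigenvectors. Build an arbitrary such pair:
rng = np.random.default_rng(0)
X = H @ rng.normal(size=(c, 2))   # remove the all-ones component
Q, _ = np.linalg.qr(X)            # orthonormalize the two columns

# Residual of the eigenvalue equation T q = (a/2) q for both columns.
residual = np.linalg.norm(T @ Q - 0.5 * a * Q)

print("tau(D) equals (a/2) H:", np.allclose(T, 0.5 * a * H))
print("eigenvalues:", np.round(eigvals, 12))
print("eigen-equation residual for an arbitrary basis:", residual)
```

Since an arbitrary orthonormal pair satisfies the eigenvalue equation, the two leading eigenvectors returned by a numerical solver depend on rounding and implementation details rather than on the data, which is consistent with the random embeddings reported here.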
References and Notes
1. J. B. Tenenbaum, V. de Silva, J. C. Langford, Science 290, 2319 (2000).
2. In this comment, graphs are taken to be undirected and weighted. Given a set of points $v_0, v_1, \ldots, v_c$ with certain constraints in Euclidean space, the peripherally symmetric graph is constructed in the following way:
\[ \begin{cases} \text{a cycle: } v_1 \sim v_2 \sim \cdots \sim v_c, \\ \text{a star: } v_0 \sim v_i, \quad i = 1, \ldots, c, \end{cases} \]
where '$\sim$' denotes that the points on its two sides are adjacent and, at the same time, that the distances between them are equal. Clearly, the peripherally symmetric graphs are isomorphic to a regular wheel on the 2D plane.
3. S. Lando, A. Zvonkin, Graphs on Surfaces and Their Applications (Springer, Berlin, 2004).
4. L. K. Saul, S. T. Roweis, Machine Learning Research 4, 119 (2004).
5. Here 'embedding' means a representation of the original graph in a lower-dimensional Euclidean space. One can consult (18) for more detail.
6. T. Cox, M. Cox, Multidimensional Scaling (Chapman & Hall, London, 1994).
7. J. A. Cameron, Physical Review 47, 2517 (1993).
8. G. E. Sims, I. G. Choi, S. H. Kim, PNAS 102, 618 (2005).
9. B. Weigelt et al., PNAS 100, 15901 (2003).
10. J. Kuriakose et al., Chemical Physics 120, 5433 (2004).
11. P. Piraino, E. Parente, P. L. H. Mcsweeney, Agricultural and Food Chemistry 52, 6904 (2004).
12. D. K. Agrafiotis, H. F. Xu, Chemical Information and Computer Science 43, 475 (2003).
13. E. Neophytou, C. M. Molinero, Business Finance and Accounting 31, 677 (2004).
14. D. Bimler, J. Kirkland, Psychology 38, 349 (1997).
15. J. D. Watson, F. H. C. Crick, Nature 171, 737 (1953).
16. J. A. Suchocki, Conceptual Chemistry: Understanding Our World of Atoms and Molecules (Benjamin Cummings, San Francisco, 2003).
17. M. Belkin, P. Niyogi, Neural Computation 15, 1373 (2003).
18. C. Godsil, G. Royle, Algebraic Graph Theory (Springer, New York, 2001).