Linear methods
Non-linear mappingsPerhaps the principal method amongst those that provide a mapping from the high dimensional space to the embedded space is kernel PCA1. This method provides a non-linear principal components analysis (PCA) by applying the kernel trick. Kernel PCA first (implicitly) construct a higher dimensional space, in which there are a large number of linear relations between the dimensions. Subsequently, the low-dimensional data representation is obtained by applying traditional PCA. Principal curves and manifolds give the natural geometric framework for nonlinear dimensionality reduction and extend the geometric interpretation of PCA by explicitly constructing an embedded manifold, and by encoding using standard geometric projection onto the manifold. How to define the "simplicity" of the manifold is problem-dependent, however, it is commonly measured by the intrinsic dimensionality and/or the smoothness of the manifold.2 Gaussian process latent variable models (GPLVM)3 are a probabilistic non-linear PCA. Like kernel PCA they use a kernel function to form the mapping (in the form of a Gaussian process). However in the GPLVM the mapping is from the embedded space to the data space (like density networks and GTM) whereas in kernel PCA it is in the opposite direction. Other nonlinear techniques include techniques for locally linear embedding (such as Locally Linear Embedding (LLE)4, Hessian LLE, Laplacian Eigenmaps, and LTSA5). These techniques construct a low-dimensional data representation using a cost function that retains local properties of the data (actually these techniques can be viewed upon as defining a graph-based kernel for Kernel PCA). In this way, the techniques are capable of unfolding datasets such as the Swiss roll. Techniques that employ neighborhood graphs in order to retain global properties of the data include Isomap6 and Maximum Variance Unfolding. A completely different approach to nonlinear dimensionality reduction is through the use of autoencoders, a special kind of feed-forward neural networks. Although the idea of autoencoders is quite old, training of the encoders has only recently become possible through the use of Restricted Boltzmann machines. Related to autoencoders is the NeuroScale algorithm, which uses stress functions inspired by multidimensional scaling and Sammon mappings (see below) to learn a non-linear mapping from the high dimensional to the embedded space. The mappings in NeuroScale are based on radial basis function networks. Kohonen maps (also called Self-Organizing Maps or SOM) and its probabilistic variant generative topographic mapping (GTM) use a point representation in the embedded space to form a latent variable model which is based around a non-linear mapping from the embedded space to the high dimensional space. These techniques are related to work on density networks, which also are based around the same probabilistic model. Methods based on proximity matricesA method based on proximity matrices is one where the data is presented to the algorithm in the form of a similarity matrix or a distance matrix. These methods all fall under the broader class of metric multidimensional scaling. The variations tend to be differences in how the proximity data is computed; for example, Isomap, locally linear embeddings, maximum variance unfolding, and Sammon mapping (which is not in fact a mapping) are examples of metric multidimensional scaling methods. See also
References
External links
| |