Locally linear embedding is a method for computing low-dimensional embeddings of data distributed along nonlinear manifolds in a higher-dimensional space.

What is locally-linear embedding (LLE)?

Locally-linear embedding is a method for computing low-dimensional embeddings from high-dimensional data. It is especially useful when data are distributed along nonlinear manifolds in the original high-dimensional space. Other common methods for dimensionality reduction find projections of the data onto linear components. For instance, each component of a principal component analysis corresponds to a linear axis (direction) in the original space, with the different components all orthogonal to one another. If the distribution of data is “curved” in the original space, such linear projections can provide poor low-dimensional descriptions of the structure of the data.
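
As a rough illustration of this limitation (not taken from any of the resources below), here is a minimal sketch using scikit-learn's PCA on a synthetic "swiss roll", a standard toy example of a curved manifold; the dataset choice and sample size are assumptions made for illustration:

```python
# Minimal sketch (illustrative): a linear PCA projection of a curved manifold
# can overlap points that are actually far apart along the manifold.
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA

# X has shape (1000, 3); t records each point's position along the roll
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Project onto the two orthogonal directions of greatest variance
X_pca = PCA(n_components=2).fit_transform(X)
print(X_pca.shape)  # (1000, 2); points close in this projection may be far apart along the roll
```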

Locally linear embeddings overcome this limitation by finding low-dimensional descriptions that preserve local similarity relations among data points. That is, the relation of a given data point to its nearest neighbors, captured by the weights that linearly reconstruct it from those neighbors, will be similar in the embedding and in the original data, while distances between more distant data points are left unconstrained. By preserving only this local similarity structure (i.e., each point's relation to its nearest neighbors), the embedding can reveal structure in data that lie along nonlinear manifolds in the higher-dimensional space. Other embedding methods that likewise preserve local structure include t-SNE and other approaches built on k-nearest-neighbor graphs.
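
A minimal sketch of this idea with scikit-learn's `LocallyLinearEmbedding`, applied to the same synthetic swiss-roll data (the parameter values are illustrative assumptions, not prescriptions from the resources below):

```python
# Minimal LLE sketch (illustrative parameters): embed the swiss roll into 2-D
# while preserving each point's relation to its nearest neighbors.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

lle = LocallyLinearEmbedding(
    n_neighbors=12,    # size of each local neighborhood
    n_components=2,    # dimensionality of the embedding
    method="standard",
    random_state=0,
)
X_lle = lle.fit_transform(X)
print(X_lle.shape)                # (1000, 2)
print(lle.reconstruction_error_)  # residual error of the local reconstructions
```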

In the behavioral sciences, nonlinear data-reduction methods are useful because, empirically, many naturally occurring high-dimensional datasets appear to exhibit nonlinear structure. The perceptual structure of handwritten digits provides a canonical example; others might include understanding how information is represented in neural-network models, finding multivariate structure in functional brain images, characterizing patterns of movement in human kinematics or robotics, or decoding semantic structure in large corpora of natural text.

Useful videos

  • Video 1{target="_blank"} explains what the locally linear embedding method is, elaborating on how LLE preserves the local geometry of the data while reducing its dimensionality.

  • Video 2{target="_blank"} provides a short description of locally-linear embedding and walks through Python code, explaining each parameter of the scikit-learn implementation of LLE (a brief parameter sketch follows this list).
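
For orientation before watching, the sketch below lists the main arguments of `sklearn.manifold.LocallyLinearEmbedding`; the values shown are illustrative assumptions, and the video may cover a different subset of parameters:

```python
# Sketch of the main arguments of sklearn.manifold.LocallyLinearEmbedding
# (values are illustrative; library defaults noted in comments).
from sklearn.manifold import LocallyLinearEmbedding

lle = LocallyLinearEmbedding(
    n_neighbors=10,       # neighbors used to reconstruct each point (default 5)
    n_components=2,       # output dimensionality (default 2)
    reg=1e-3,             # regularization of the local weight problem
    method="standard",    # or "modified", "hessian", "ltsa"
    eigen_solver="auto",  # eigen-decomposition backend
    random_state=0,       # reproducibility when the arpack solver is used
)
# fit_transform expects an array of shape (n_samples, n_features)
```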

Applied papers

Theory papers

Online tutorials

  • Online tutorial 1 provides Python code for computing LLEs on the MNIST dataset of handwritten digits.

  • Blog post with linked GitHub repo. This post compares LLE with another nonlinear embedding method (Isomap), applied to the MNIST dataset of handwritten digits. In addition to providing a concise overview of the central ideas, it links to a GitHub repo for experimenting with these techniques (a minimal comparison sketch follows below).
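
As a rough stand-in for the blog post's MNIST comparison (consult the post and its repo for the full version), a minimal sketch comparing LLE and Isomap on scikit-learn's smaller built-in digits dataset might look like this; the dataset choice and parameters are assumptions made for illustration:

```python
# Minimal LLE vs. Isomap comparison on scikit-learn's built-in 8x8 digits
# (a small stand-in for MNIST; dataset and parameters are illustrative).
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding, Isomap

digits = load_digits()
X, y = digits.data, digits.target  # X: (1797, 64) pixel intensities

emb_lle = LocallyLinearEmbedding(
    n_neighbors=30, n_components=2, method="modified", random_state=0
).fit_transform(X)
emb_iso = Isomap(n_neighbors=30, n_components=2).fit_transform(X)

print(emb_lle.shape, emb_iso.shape)  # (1797, 2) (1797, 2)
# Plotting each embedding colored by y shows how well the digit classes separate.
```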