PCA is a data reduction method that re-expresses data along the dimensions that explain the greatest variance in the data.
Framing
Principal component analysis (PCA) is a technique for reducing the dimensionality of datasets with multiple variables. The main goal of PCA is to transform a large set of variables into a smaller set of “principal components” while preserving as much information as possible. Each principal component is a linear combination of the original variables, and PCA produces as many principal components as there are variables. Typically, only the components that explain the most variance in the dataset are retained, while those with low explanatory value are discarded.
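To make the idea of "variance explained" concrete, here is a minimal sketch in plain Python. It is not taken from any of the resources listed below. The function name `first_principal_component`, the toy dataset, and the use of power iteration to find the leading direction are all illustrative assumptions. In practice, a library routine such as R's prcomp() would be used instead.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_ratio) for the first principal component.

    Hypothetical helper for illustration only. Assumes at least two rows,
    at least one variable with nonzero variance, and a starting vector that
    is not orthogonal to the leading direction.
    """
    n = len(rows)
    d = len(rows[0])

    # Center each variable on its mean.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize; the vector converges to the direction of greatest variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Variance along that direction, as a share of the total variance.
    variance_along_v = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    return v, variance_along_v / total_variance


# Toy data (made up for this sketch): the second variable is roughly twice the first.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, ratio = first_principal_component(data)
print("first component direction:", direction)
print("share of variance explained:", ratio)
```

Because the two variables in this toy dataset move together almost perfectly, a single component captures nearly all of the variance, which is the situation in which discarding the remaining component loses little information.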
Recommended Path for Learning
Videos
- This video introduction explains the essential purpose of principal component analysis and the basic methods for conducting it, with easy-to-follow examples.
Web Article
- This article explains, in plain language, what PCA is and how it works. The author also provides simple but useful guidelines for when PCA is an appropriate choice for the data analyst. A comprehensive set of PCA resources, curated by the author, is included at the end of the article.
Code
- R tutorial on how to perform a Principal Component Analysis (PCA) using the built-in R functions prcomp() and princomp().
Journal Article
- This article introduces the basic ideas of PCA, including its uses and limitations. It also describes some variants of PCA and their applications. It will be most accessible to readers who are already somewhat familiar with PCA and have experience with linear algebra.