Principal Component Analysis

Use principal component analysis (PCA) to decompose a multivariate data set into orthogonal components, ordered so that each one explains as much of the remaining variance as possible.

The operation creates new “channels” named {name}_1 ... {name}_n, where name is the Name attribute and n is Num components.
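As a rough illustration of what the decomposition and the channel naming look like, here is a minimal NumPy sketch. It is not this tool's implementation: the function `pca_channels`, the random test data, and the use of a singular value decomposition are assumptions made for the example.

```python
import numpy as np

def pca_channels(data, name, num_components):
    """Project `data` (rows = events, columns = channels) onto its first
    `num_components` principal components.

    Returns a dict mapping "{name}_1" ... "{name}_n" to 1-D arrays.
    (Hypothetical helper, for illustration only.)
    """
    centered = data - data.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by
    # decreasing singular value (and therefore decreasing variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:num_components].T
    return {f"{name}_{i + 1}": scores[:, i] for i in range(num_components)}

rng = np.random.default_rng(0)
# Made-up data: 500 events measured in 3 correlated channels.
mixing = np.array([[2.0, 0.5, 0.1],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
data = rng.normal(size=(500, 3)) @ mixing

new_channels = pca_channels(data, name="PCA", num_components=2)
print(list(new_channels))  # ['PCA_1', 'PCA_2']
```

The first new channel carries the most variance, and the new channels are uncorrelated with one another.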

The same decomposition may not be appropriate for different subsets of the data set. If this is the case, you can use the By attribute to specify metadata by which to group the data before estimating (and applying) a model. The PCA parameters, such as the number of components and whether to whiten, are the same for every subset, though.
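The idea of fitting one model per subset while sharing the parameters can be sketched as follows. This is an assumption-laden illustration in plain NumPy, not the tool's code: `fit_pca`, `fit_by_group`, `apply_by_group`, and the (Time, Dox) label values are all hypothetical.

```python
import numpy as np

def fit_pca(data, num_components):
    """Return (mean, components) for a PCA model fitted to `data`."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:num_components]

def fit_by_group(data, labels, num_components):
    """Fit one model per unique label; num_components is shared by all."""
    models = {}
    for label in sorted(set(labels)):
        mask = np.array([l == label for l in labels])
        models[label] = fit_pca(data[mask], num_components)
    return models

def apply_by_group(data, labels, models):
    """Project each event with the model fitted to its own group."""
    num_components = next(iter(models.values()))[1].shape[0]
    scores = np.empty((data.shape[0], num_components))
    for label, (mean, components) in models.items():
        mask = np.array([l == label for l in labels])
        scores[mask] = (data[mask] - mean) @ components.T
    return scores

rng = np.random.default_rng(1)
# Made-up experiment: two conditions whose variance lies along different channels.
group_a = rng.normal(size=(200, 2)) * [5.0, 1.0]  # mostly along channel 0
group_b = rng.normal(size=(200, 2)) * [1.0, 5.0]  # mostly along channel 1
data = np.vstack([group_a, group_b])
# Each label is a hypothetical (Time, Dox) combination.
labels = [(0, 0.0)] * 200 + [(4, 10.0)] * 200

models = fit_by_group(data, labels, num_components=1)
scores = apply_by_group(data, labels, models)
for label, (_, components) in models.items():
    print(label, np.round(np.abs(components[0]), 2))
```

In this example the leading component of the first group points almost entirely along channel 0 and that of the second group along channel 1, which is why a single model fitted to the pooled data would describe neither subset well.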

Name: The operation name; determines the names of the new channels.

Channels: The channels to apply the decomposition to.

Scale: Re-scale the data in the specified channels before fitting.

Num components: How many components to fit to the data? Must be a positive integer.

By: A list of metadata attributes by which to group the data before estimating the model. For example, if the experiment has two pieces of metadata, Time and Dox, setting By to ["Time", "Dox"] will fit the model separately to each subset of the data with a unique combination of Time and Dox.

Whiten: Scale each component to unit variance? May be useful if you will be using unsupervised clustering (such as K-means).
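To make the effect of Whiten concrete, the sketch below compares component variances with and without whitening. It is a minimal NumPy illustration under assumed conventions (sample variance with n - 1 in the denominator); `pca_scores` and the test data are hypothetical and not part of this tool.

```python
import numpy as np

def pca_scores(data, num_components, whiten=False):
    """Project `data` onto its leading principal components,
    optionally rescaling each component to unit variance."""
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:num_components].T
    if whiten:
        # Each component's sample standard deviation is s / sqrt(n - 1);
        # dividing by it leaves every component with unit variance.
        scores = scores / (s[:num_components] / np.sqrt(data.shape[0] - 1))
    return scores

rng = np.random.default_rng(2)
# Made-up data: three channels with very different spreads.
data = rng.normal(size=(1000, 3)) * [10.0, 3.0, 0.5]

plain = pca_scores(data, 2)
white = pca_scores(data, 2, whiten=True)
print(plain.var(axis=0, ddof=1))  # unequal variances, largest first
print(white.var(axis=0, ddof=1))  # both approximately 1
```

Distance-based methods such as K-means weight every component equally only when the components share a common scale; without whitening, the first component dominates the distances.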