6.5. Unsupervised dimensionality reduction
If your number of features is high, it may be useful to reduce it with an unsupervised step prior to supervised steps. Many of the Unsupervised learning methods implement a transform
method that can be used to reduce the dimensionality. Below we discuss three specific examples of this pattern that are heavily used.
Pipelining
The unsupervised data reduction and the supervised estimator can be chained in one step. See Pipeline: chaining estimators.
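A minimal sketch of such a chain (the dataset, the number of components, and the classifier are illustrative choices, not prescribed by this guide):

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    X, y = load_digits(return_X_y=True)

    # Fit the unsupervised reduction on the data, then train the
    # supervised estimator on the reduced representation, in one step.
    pipe = Pipeline([
        ("reduce_dim", PCA(n_components=30)),
        ("classify", LogisticRegression(max_iter=1000)),
    ])
    pipe.fit(X, y)
    print(pipe.score(X, y))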
6.5.1. PCA: principal component analysis
decomposition.PCA
looks for a combination of features that captures the variance of the original features well. See Decomposing signals in components (matrix factorization problems).
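A minimal sketch of the transform in isolation (the iris data and the choice of two components are illustrative):

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)

    # Project the four original features onto the two principal
    # components that capture the most variance.
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X)  # shape (150, 4) -> (150, 2)

    print(X_reduced.shape)
    print(pca.explained_variance_ratio_)  # variance captured per component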
6.5.2. Random projections
The random_projection module
provides several tools for data reduction by random projections. See the relevant section of the documentation: Random Projection.
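A sketch using one of those tools, GaussianRandomProjection (the data shape and the eps value are arbitrary illustrative choices):

    import numpy as np
    from sklearn.random_projection import GaussianRandomProjection

    rng = np.random.RandomState(0)
    X = rng.rand(100, 10000)

    # With n_components="auto", the target dimensionality is derived
    # from the Johnson-Lindenstrauss bound for the given eps.
    transformer = GaussianRandomProjection(n_components="auto", eps=0.5,
                                           random_state=0)
    X_new = transformer.fit_transform(X)
    print(X_new.shape)  # far fewer than 10000 columns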
6.5.3. Feature agglomeration
cluster.FeatureAgglomeration
applies Hierarchical clustering to group together features that behave similarly.
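A minimal sketch on the digits data (the number of clusters is an illustrative choice):

    from sklearn.cluster import FeatureAgglomeration
    from sklearn.datasets import load_digits

    X, _ = load_digits(return_X_y=True)

    # Merge the 64 pixel features into 16 clusters of similar pixels;
    # each output feature pools the values of one cluster.
    agglo = FeatureAgglomeration(n_clusters=16)
    X_reduced = agglo.fit_transform(X)  # shape (1797, 64) -> (1797, 16)
    print(X_reduced.shape)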
Feature scaling
Note that if features have very different scaling or statistical properties, cluster.FeatureAgglomeration
may not be able to capture the links between related features. Using a preprocessing.StandardScaler
can be useful in these settings.
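A sketch of that combination, standardizing before agglomerating (the step names and cluster count are arbitrary):

    from sklearn.cluster import FeatureAgglomeration
    from sklearn.datasets import load_digits
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, _ = load_digits(return_X_y=True)

    # Scale each feature to zero mean and unit variance first, so the
    # agglomeration groups features by behaviour rather than magnitude.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("agglo", FeatureAgglomeration(n_clusters=16)),
    ])
    X_reduced = pipe.fit_transform(X)
    print(X_reduced.shape)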