When performing PCA on a dataset in Python with scikit-learn, explained_variance_ratio_ gives the fraction of the total variance explained by each principal component.
How do we know which column of the original dataset corresponds to which of the resulting variances?
Context: I'm working on a project and need to know which components account for 90% of the variance under PCA, so that we can perform stepwise feature selection later on.
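from sklearn.decomposition import PCA

# X_train and X_test are assumed to be feature matrices defined earlier
pcaObj = PCA(n_components=None)  # None keeps all components
X_train = pcaObj.fit_transform(X_train)  # fit on the training data only
X_test = pcaObj.transform(X_test)  # reuse the projection fitted above
components_variance = pcaObj.explained_variance_ratio_
print(components_variance.sum())
print(components_variance)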
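
For reference, here is a minimal sketch of how the 90% cutoff and the link back to the original columns might be read off. It assumes synthetic data and hypothetical column names (f0 to f4) in place of my real X_train. Each entry of explained_variance_ratio_ belongs to a principal component rather than to a single column, and the rows of components_ hold the weight of every original column in that component.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for X_train (hypothetical data): 200 samples, 5 columns
# with very different spreads.
rng = np.random.default_rng(0)
feature_names = ["f0", "f1", "f2", "f3", "f4"]  # hypothetical column names
X = rng.normal(size=(200, 5)) * np.array([10.0, 5.0, 1.0, 0.5, 0.1])

pca = PCA(n_components=None).fit(X)

# One ratio per principal component, sorted from largest to smallest.
ratios = pca.explained_variance_ratio_
cumulative = np.cumsum(ratios)

# Smallest number of components whose cumulative ratio reaches 90%.
n_keep = int(np.searchsorted(cumulative, 0.90) + 1)
print(f"components needed for 90% of the variance: {n_keep}")

# components_ has shape (n_components, n_features): row i holds the weight
# of each original column in component i.
for i in range(n_keep):
    top = int(np.argmax(np.abs(pca.components_[i])))
    print(f"PC{i + 1}: ratio={ratios[i]:.3f}, largest weight on {feature_names[top]}")
```

As far as I understand, passing a float such as PCA(n_components=0.90, svd_solver="full") asks scikit-learn to choose the number of components from the cumulative ratio itself, but the retained components are still linear combinations of all the original columns, not a subset of them.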