
I have a DataFrame and used a heatmap to find the correlations between its columns. I now want to drop the highly correlated columns. Is there a library or function I am unaware of that solves this problem?

import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(figsize=(30, 30))
sns.heatmap(dfnew.corr(), annot=True, vmin=-1, vmax=1, center=0, cmap='coolwarm', ax=ax)
desertnaut
Abhishek
  • Question has nothing to do with `machine-learning` or `random-forest` - kindly do not spam irrelevant tags (removed). – desertnaut Mar 22 '21 at 09:41

1 Answer

import numpy as np

# Create the absolute correlation matrix
corr_matrix = dfnew.corr().abs()

# Select the upper triangle of the correlation matrix (k=1 excludes the diagonal)
upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))

# Find the columns whose correlation with an earlier column is greater than 0.90
to_drop = [column for column in upper.columns if any(upper[column] > 0.90)]

# Drop those columns
dfnew = dfnew.drop(columns=to_drop)
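
For reuse, the same upper-triangle approach can be wrapped in a small helper. This is only a minimal sketch: the function name `drop_correlated_columns`, the `threshold` parameter and the toy data are made up for illustration, and it assumes that only numeric columns should be compared.

```python
import numpy as np
import pandas as pd


def drop_correlated_columns(df, threshold=0.90):
    """Return (reduced copy of df, list of dropped column names).

    Hypothetical helper, not part of pandas or seaborn.
    """
    # Absolute pairwise correlations between the numeric columns only.
    corr_matrix = df.select_dtypes(include="number").corr().abs()
    # Boolean mask for the strict upper triangle (k=1 excludes the diagonal),
    # so each pair of columns is considered exactly once.
    mask = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    upper = corr_matrix.where(mask)
    # Mark a column if it correlates above the threshold with any earlier column.
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop), to_drop


# Toy data (made up for illustration): "b" is an exact multiple of "a".
demo = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
})
reduced, dropped = drop_correlated_columns(demo)
print(dropped)
print(list(reduced.columns))
```

Of each correlated pair, the column that appears later in the DataFrame is the one dropped, so the column order decides which feature is kept.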
Abhishek