I have a DataFrame where some columns are correlated and some are not. I want to display only the uncorrelated columns as output. Can anyone help me solve this? I don't want to plot anything, just display the names of the uncorrelated columns.
- Does this answer your question? [Plot correlation matrix using pandas](https://stackoverflow.com/questions/29432629/plot-correlation-matrix-using-pandas) – I'mahdi Oct 01 '21 at 13:35
- I want to display the column names which are uncorrelated rather than plotting. – user17051608 Oct 01 '21 at 14:08
1 Answer
First, calculate the correlation matrix:

    import numpy as np
    import pandas as pd

    myDataFrame = pd.DataFrame(data)  # data: your own input
    correl = myDataFrame.corr()       # pairwise correlation of the columns
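To try this step without your own `data`, here is a minimal sketch on made-up numbers. The names `sample` and `sample_correl` and the columns `a`, `b`, `c` are assumptions for illustration only and are not part of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "b" is built from "a", "c" is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
sample = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.1, size=200),
    "c": rng.normal(size=200),
})

# corr() returns a square DataFrame indexed by column name on both axes
# (Pearson correlation by default).
sample_correl = sample.corr()
print(sample_correl.round(2))
```

The result is symmetric and has 1.0 on the diagonal, since every column is perfectly correlated with itself.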
Define what you mean by "uncorrelated". Here I treat a pair of columns as uncorrelated when the absolute value of its correlation coefficient is below 0.5:

    uncor_level = 0.5
The following code gives you the names of the pairs of columns that are uncorrelated. Because the correlation matrix is symmetric, each pair appears twice, once in each order:

    pairs = np.full([len(correl)**2, 2], None)  # empty array to store the results
    z = 0
    for x in range(len(correl)):      # loop over each row (index)
        for y in range(len(correl)):  # loop over each column
            if abs(correl.iloc[x, y]) < uncor_level:
                pair = [correl.index[x], correl.columns[y]]
                pairs[z] = pair
                z = z + 1
    pairs = pairs[:z]  # keep only the rows that were filled
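If you want single column names rather than pairs, one possible reading of the question is "columns that are uncorrelated with every other column". The sketch below follows that reading. The helper `uncorrelated_columns` and the `demo` data are hypothetical names, and it assumes all columns are numeric.

```python
import numpy as np
import pandas as pd

def uncorrelated_columns(df, uncor_level=0.5):
    """Names of columns whose |correlation| with every other column is below uncor_level."""
    abs_correl = df.corr().abs()
    # Blank out the diagonal: each column is perfectly correlated with itself.
    off_diag = abs_correl.mask(np.eye(len(abs_correl), dtype=bool))
    # Strongest correlation each column has with any other column (NaN is skipped).
    strongest = off_diag.max()
    return strongest[strongest < uncor_level].index.tolist()

# Hypothetical demo data: "y" depends on "x", "noise" is independent.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
demo = pd.DataFrame({
    "x": x,
    "y": -3 * x + rng.normal(scale=0.2, size=300),
    "noise": rng.normal(size=300),
})
print(uncorrelated_columns(demo))
```

This avoids the explicit double loop, but it answers a stricter question than the pair listing: a column is reported only if it stays below the threshold against all other columns.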

Anna Pas