I have a 2D NumPy array:
[[1 3 4 2]
[2 4 6 4]
[-1 6 8 -2]
[6 4 2 12]]
I want to remove highly correlated columns. The result should look like this:
[[1 3 4]
[2 4 6]
[-1 6 8]
[6 4 2]]
See? Column 4 is removed because it is highly correlated with column 1 (it is exactly twice column 1).
I can get the correlation matrix with:
np.corrcoef(numpy_array, rowvar=False)  # rowvar=False: treat columns, not rows, as the variables
The question is: how do I drop columns that are highly correlated with another column?
I've searched for a solution but only found ones that use a pandas DataFrame. For various reasons I don't want to use pandas; I want a solution that uses only NumPy.
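To make the goal concrete, here is a minimal NumPy-only sketch of the kind of thing I am after. The function name `drop_correlated_columns` and the 0.95 threshold are assumptions of mine for illustration, not an established API. It keeps the first column of each highly correlated pair and drops the later one, which is one possible rule among several.

```python
import numpy as np

def drop_correlated_columns(arr, threshold=0.95):
    """Drop every column whose absolute correlation with an earlier,
    still-kept column exceeds `threshold` (hypothetical helper)."""
    # rowvar=False makes corrcoef treat columns as the variables
    corr = np.abs(np.corrcoef(arr, rowvar=False))
    n_cols = corr.shape[0]
    keep = np.ones(n_cols, dtype=bool)
    for i in range(n_cols):
        if not keep[i]:
            continue  # column i was already dropped, so it cannot drop others
        for j in range(i + 1, n_cols):
            if keep[j] and corr[i, j] > threshold:
                keep[j] = False
    # boolean mask selects the surviving columns
    return arr[:, keep], np.flatnonzero(keep)

numpy_array = np.array([[1, 3, 4, 2],
                        [2, 4, 6, 4],
                        [-1, 6, 8, -2],
                        [6, 4, 2, 12]])

reduced, kept = drop_correlated_columns(numpy_array)
print(kept)     # indices of the columns that were kept
print(reduced)
```

On the array above this keeps columns 1 to 3 and drops column 4. The result depends on the threshold: columns 1 and 3 have an absolute correlation of roughly 0.88, so a lower threshold such as 0.8 would also drop column 3.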