My dataset X has 45 columns of data collected from a bioclimate database. I ran the following multicollinearity (VIF) test on it:
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# X is the DataFrame holding the 45 predictor columns
vif_data = pd.DataFrame()
vif_data["feature"] = X.columns

# Calculate the VIF for each feature
vif_data["VIF"] = [
    variance_inflation_factor(X.values, i)
    for i in range(len(X.columns))
]
-------------------------------------------------------------
/usr/local/lib/python3.7/dist-packages/statsmodels/stats/outliers_influence.py:193:
RuntimeWarning: divide by zero encountered in double_scalars
vif = 1. / (1. - r_squared_i)
vif_data
What I tried:
I've tried converting all variables to float and to int, but I still get infinite VIF values for every variable after running the multicollinearity test.
I couldn't find any reference material on tackling this problem, especially in Python. Please help me out; I am using this for species distribution modelling.
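
For anyone trying to reproduce this, here is a minimal sketch of a check for exact linear dependence among columns. That is one condition under which the `1. / (1. - r_squared_i)` line in the warning divides by zero, because `r_squared_i` equals 1. This is not the statsmodels routine. The names `collinearity_report`, `tol`, and the synthetic `demo` frame are hypothetical and only for illustration. The sketch also fits an intercept, which may differ from what `variance_inflation_factor` does on the raw `X`.

```python
import numpy as np
import pandas as pd


def collinearity_report(X: pd.DataFrame, tol: float = 1e-10) -> pd.DataFrame:
    """Regress each column on all the others and report the unexplained share.

    A column whose unexplained share (1 - R^2) is below `tol` is treated as an
    exact linear combination of the other columns (hypothetical threshold).
    """
    values = X.to_numpy(dtype=float)
    n_rows, n_cols = values.shape
    rows = []
    for i in range(n_cols):
        y = values[:, i]
        others = np.delete(values, i, axis=1)
        # Design matrix: intercept plus every other column
        design = np.column_stack([np.ones(n_rows), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        ssr = float(resid @ resid)
        tss = float(((y - y.mean()) ** 2).sum())
        # A constant column has tss == 0 and is collinear with the intercept
        unexplained = ssr / tss if tss > 0 else 0.0
        rows.append(
            {
                "feature": X.columns[i],
                "one_minus_r2": unexplained,
                "exact_dependence": unexplained < tol,
            }
        )
    return pd.DataFrame(rows)


# Synthetic data, not the real dataset: column "c" is exactly "a" + "b"
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "a": rng.normal(size=200),
        "b": rng.normal(size=200),
        "d": rng.normal(size=200),
    }
)
demo["c"] = demo["a"] + demo["b"]

report = collinearity_report(demo)
rank = np.linalg.matrix_rank(demo.to_numpy())
print(report)
print("columns:", demo.shape[1], "rank:", rank)
```

On the real data, calling `collinearity_report(X)` and comparing `np.linalg.matrix_rank(X.to_numpy())` with `X.shape[1]` would show whether some of the 45 columns are exact combinations of others. A rank lower than the column count would be consistent with the infinite values.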