I have a dataset of shape (1460, 76), currently in a pandas DataFrame. It contains several data types: int, float, and object. I'm trying to run the VIF function on this DataFrame to check how correlated my variables are, but it throws this error:

TypeError: '>=' not supported between instances of 'str' and 'int'

VIF Code:

from statsmodels.stats.outliers_influence import variance_inflation_factor

vif = [variance_inflation_factor(df.values, i) for i in range(df.shape[1])]
print(vif)

What could be the reason? Is it because I have strings in my data?

Jaskaran Singh Puri

1 Answer

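It sounds like some of your data is stored as strings (object dtype) rather than as numeric types. Try converting it with pandas.to_numeric. Note that to_numeric accepts one-dimensional input such as a Series, so it has to be applied column by column across the data frame, for example with df.apply.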

Example applying to_numeric to an entire data frame
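As a rough illustration of the suggestion above, here is a minimal sketch. The toy data frame and its column names (area, rooms, age, street) are hypothetical stand-ins for the real (1460, 76) data, and dropping non-numeric columns and rows with missing values is only one possible way to handle them, not necessarily the right choice for your data.

```python
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical toy data standing in for the real (1460, 76) frame.
df = pd.DataFrame({
    "area": ["1500", "2000", "1200", "1800", "2400", "1600"],    # numbers stored as strings
    "rooms": [3, 4, 2, 3, 5, 3],
    "age": [10.0, 5.0, 30.0, 12.0, 2.0, 20.0],
    "street": ["Pave", "Grvl", "Pave", "Pave", "Grvl", "Pave"],  # genuinely categorical
})

# to_numeric works on one column at a time, so apply it across the frame.
# errors="coerce" turns values that cannot be parsed into NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Assumption: drop columns that are entirely non-numeric, then any rows with NaN.
numeric = numeric.dropna(axis=1, how="all").dropna(axis=0)

# Every remaining column is numeric, so the comparison inside VIF no longer fails.
values = numeric.values.astype(float)
vif = pd.Series(
    [variance_inflation_factor(values, i) for i in range(values.shape[1])],
    index=numeric.columns,
)
print(vif)
```

Truly categorical columns such as the hypothetical street column carry no numeric meaning after coercion, so if they matter for the analysis they would need to be encoded (for example with pd.get_dummies) rather than dropped.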

Chris