I'm trying to fit a Random Forest on a pandas DataFrame. I know there are no nulls or infinities in the DataFrame, but I keep getting a ValueError when I fit the model. Presumably this is because I have float64 columns rather than float32; I also have a lot of columns of type bool and int. Is there a way to change all the float columns to float32?
I've tried rewriting the CSV and am relatively certain the problem isn't with that. I've never had problems running random forests on float64s before, so I'm not sure what's going wrong this time.
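For reference, the cast asked about above could be sketched roughly as follows. The small `df` here is a made-up stand-in for the real data, and the column names are hypothetical; only the float64 columns are converted, while the bool and int columns are left alone.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real DataFrame (mixed float, bool, int columns)
df = pd.DataFrame({
    "a": [1.5, 2.5, 3.5],       # float64
    "b": [True, False, True],   # bool
    "c": [1, 2, 3],             # int
})

# Pick out only the float64 columns by dtype
float_cols = df.select_dtypes(include="float64").columns

# Cast those columns to float32; all other columns keep their dtype
df = df.astype({col: np.float32 for col in float_cols})
```

Note that a cast like this does not remove values that are too large for float32; such values would become infinity during the conversion.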
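from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Separate the target column from the feature columns
labels = electric['electric_ratio']
electric = electric[[x for x in electric.columns if x != 'electric_ratio']]
electric_list = electric.columns
# Split into train and test sets, then fit the forest on the training split
first_train, first_test, train_labels, test_labels = train_test_split(electric, labels)
rf = RandomForestRegressor(n_estimators=1000, random_state=88)
rf_1 = rf.fit(first_train, train_labels)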
I expect this to fit the model, but instead I consistently get:
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
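The message names float32, which suggests the check runs after the data has been converted to that dtype. If so, a float64 value with a magnitude beyond roughly 3.4e38 would turn into infinity even though the original column contains no NaN or inf. A minimal sketch for checking this, assuming the numeric columns hold plain NumPy dtypes; the helper `find_float32_problems` and the `demo` frame are hypothetical names made up for illustration:

```python
import numpy as np
import pandas as pd

def find_float32_problems(df):
    """Count, per numeric column, the values that are not finite after a float32 cast."""
    # bool columns are excluded by np.number; int and float columns are kept
    numeric = df.select_dtypes(include=[np.number])
    # Suppress the overflow warning raised when a value exceeds the float32 range
    with np.errstate(over="ignore"):
        as32 = numeric.to_numpy(dtype=np.float64).astype(np.float32)
    bad = ~np.isfinite(as32)  # True for NaN, +inf, -inf
    counts = pd.Series(bad.sum(axis=0), index=numeric.columns)
    return counts[counts > 0]

# Made-up example: "huge" holds a value that fits float64 but not float32
demo = pd.DataFrame({
    "ok": [1.0, 2.0, 3.0],
    "huge": [1.0, 1e300, 2.0],
    "flag": [True, False, True],
})
problems = find_float32_problems(demo)
print(problems)
```

Running the same helper on `first_train` would list any columns whose values stop being finite once cast to float32.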