I am reading a DataFrame from an Azure Databricks cluster and converting it to a pandas DataFrame. Pandas reports the dtype as object for every column instead of int64.
The only solution I've found is to call astype and convert each column individually, but I have 122 columns...
pd_train = df_train.toPandas()
pd_test = df_test.toPandas()
pd_train.dtypes
pd_train is the pandas DataFrame for the training set and pd_test is the pandas DataFrame for the testing set. df_train and df_test are both Spark DataFrames.
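For what it's worth, astype does not have to be applied column by column: passing a single dtype casts every column at once. A minimal sketch of this, using a small hand-built DataFrame as a stand-in for what toPandas() returns (all columns arriving as object):

```python
import pandas as pd

# Stand-in for the result of df_train.toPandas(): integer values
# stored with dtype "object" in every column.
pd_train = pd.DataFrame({"col_a": [1, 2, 3], "col_b": [4, 5, 6]}, dtype="object")

# One call casts all 122 columns at once -- no per-column loop needed.
pd_train = pd_train.astype("int64")

print(pd_train.dtypes)
```

If some columns are not cleanly integer-valued, `pd_train.apply(pd.to_numeric)` is an alternative that infers a numeric dtype per column instead of forcing int64 everywhere.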