Running

import impyute
import pandas as pd

imputed_training = impyute.imputation.cs.em(X_train2.values, loops=50)
xtrain2_imputed = pd.DataFrame(imputed_training)
columns = ('interest-over-time', 'hash-rate', ...)  # very long list
xtrain2_imputed.columns = columns

returns a DataFrame containing completely different values from the original DataFrame (X_train2). How can I impute my NaNs using expectation maximization in a way that returns a DataFrame with the same columns, column order, and row order as my original df?

1 Answer

After imputing, you can assign the result back into the original DataFrame, which keeps its columns, column order, and index:

imputed_training = impyute.imputation.cs.em(X_train2.values, loops=50)
X_train2[:] = imputed_training
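If you would rather not overwrite X_train2, a similar effect can be had by wrapping the imputed array in a new DataFrame that reuses the original index and columns. One likely reason the values looked "completely different" in the question is that `pd.DataFrame(imputed_training)` gets a fresh 0..n-1 index, so rows no longer line up with X_train2 by label. The sketch below is only illustrative: `impute_keep_labels` and `column_mean_fill` are hypothetical helper names, the data is made up, and the column-mean fill is a stand-in so the example runs without impyute. It is not expectation maximization.

```python
import numpy as np
import pandas as pd


def impute_keep_labels(df, imputer):
    """Run an array-based imputer and rewrap the result with df's own labels."""
    filled = imputer(df.to_numpy(dtype=float))
    # Reusing index and columns preserves row order, column order, and names.
    return pd.DataFrame(filled, index=df.index, columns=df.columns)


def column_mean_fill(values):
    # Stand-in imputer (NOT expectation maximization): fill NaNs with column means.
    out = values.copy()
    col_means = np.nanmean(out, axis=0)
    rows, cols = np.where(np.isnan(out))
    out[rows, cols] = col_means[cols]
    return out


# Made-up data with a shuffled index, as a train/test split would produce.
X_train2 = pd.DataFrame(
    {"interest-over-time": [1.0, np.nan, 3.0], "hash-rate": [10.0, 20.0, np.nan]},
    index=[7, 2, 5],
)

X_imputed = impute_keep_labels(X_train2, column_mean_fill)
```

To use the imputer from the question instead, the same helper could be called as `impute_keep_labels(X_train2, lambda a: impyute.imputation.cs.em(a, loops=50))`.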
BENY
  • This works, but I get the following warning: C:\Users\Admin\anaconda3\lib\site-packages\ipykernel_launcher.py:19: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy – Sahelanthropus Jun 25 '20 at 01:53
  • @Sahelanthropus When you create X_train2, add .copy() to the selection. See https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas – BENY Jun 25 '20 at 01:57
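The `.copy()` suggestion in the comment above might look like the following. This is a minimal sketch under assumptions: the question does not show how X_train2 was built, so the parent frame `df`, its column names, and the zero-fill standing in for the EM output are all made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical parent frame; the question does not show how X_train2 was created.
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan], "target": [0, 1, 0]}
)

# An explicit .copy() makes X_train2 an independent DataFrame, so pandas
# no longer treats later assignments as writes to a slice of df.
X_train2 = df[["a", "b"]].copy()

# Stand-in for the imputed array (zero-fill, NOT expectation maximization).
imputed_training = np.nan_to_num(X_train2.to_numpy(), nan=0.0)

# Assign back in place; index and columns of X_train2 are untouched.
X_train2[:] = imputed_training
```

Because X_train2 is its own object here, the parent `df` keeps its NaNs and only the copy is filled.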