
I have sensor data in a pandas DataFrame with three columns (X, Y, Z). I thought about removing every row in which any column has a z-score above 2.5 in absolute value by doing

df = df[(np.abs(stats.zscore(df)) < 2.5).all(axis=1)]

But this discards the entire row, including the values in the other columns, which may be perfectly normal. What is the proper way to replace only these abnormally high (or low) values with the mean, or perhaps with a value closer to the mean?
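To make the goal concrete, here is a rough sketch of the per-cell replacement I have in mind. It is only one possible approach, not a confirmed solution. The function names `replace_outliers_with_mean` and `clip_outliers` are my own placeholders, and the code assumes the DataFrame is all numeric with no missing values. The z-score is computed directly in pandas with `ddof=0`, which should match the default of `stats.zscore`.

```python
import pandas as pd

def replace_outliers_with_mean(df, threshold=2.5):
    """Replace cells with |z| > threshold by the mean of the remaining column values.

    Hypothetical helper; assumes numeric columns without NaN.
    """
    # Column-wise z-score (population std, ddof=0, like scipy.stats.zscore by default)
    z = (df - df.mean()) / df.std(ddof=0)
    outliers = z.abs() > threshold
    # Blank out only the offending cells, then fill each column with the
    # mean of its remaining (non-outlier) values
    masked = df.mask(outliers)
    return masked.fillna(masked.mean())

def clip_outliers(df, threshold=2.5):
    """Alternative: pull extreme cells back to mean +/- threshold * std.

    Hypothetical helper; gives "a value closer to the mean" instead of the mean itself.
    """
    mean, std = df.mean(), df.std(ddof=0)
    # axis=1 aligns the per-column bounds with the DataFrame's columns
    return df.clip(lower=mean - threshold * std, upper=mean + threshold * std, axis=1)
```

Both versions keep every row and only touch the flagged cells. Note that the first one would also fill any NaN that was already in the data, which is why it assumes there are none.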

Wouter Vandenputte
  • Look at this https://stackoverflow.com/questions/13851535/delete-rows-from-a-pandas-dataframe-based-on-a-conditional-expression-involving – Sheri Apr 12 '20 at 12:35

0 Answers