
I'm trying to replace values in a column to NaN. I normally use

imputed_data_x = imputed_data_x.replace(0, np.nan)

But my problem is that my values are not exactly 0; some are 0.01111, etc. How can I replace all values in a data frame that are less than 1?

I tried imputed_data_x = imputed_data_x.replace(>1, np.nan)

But it didn't work. I'm curious to see if I can use replace to do this or do I need a different command for conditions?

Lostsoul
    np.where() may help here – Chris Jul 08 '20 at 18:07
    `imputed_data_x.mask(imputed_data_x.lt(1))` ? – anky Jul 08 '20 at 18:08
    Does this answer your question? [How to select rows from a DataFrame based on column values?](https://stackoverflow.com/questions/17071871/how-to-select-rows-from-a-dataframe-based-on-column-values) – Dan Jul 09 '20 at 09:58

2 Answers


Use standard boolean indexing:

imputed_data_x[imputed_data_x < 1] = np.nan
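For example, a minimal runnable sketch with invented data (the column names and values here are made up for illustration):

```python
import numpy as np
import pandas as pd

# Toy frame standing in for imputed_data_x
imputed_data_x = pd.DataFrame({"a": [0.0, 0.01111, 2.5],
                               "b": [3.0, 0.5, 1.0]})

# Boolean indexing: set every value below 1 to NaN
imputed_data_x[imputed_data_x < 1] = np.nan

print(imputed_data_x)
```

Note that assigning NaN upcasts integer columns to float, since NaN is a float value.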
Dan

DataFrame.replace only substitutes fixed values. In your case you want to replace values that are "close" to 0, which you can express as a predicate function. DataFrame.where keeps values where the condition is true and replaces them where it is false:

imputed_data_x = imputed_data_x.where(lambda x: x >= 1, np.nan)
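As a quick sketch with toy data (the frame here is invented for illustration):

```python
import numpy as np
import pandas as pd

# Invented example frame; any numeric DataFrame works the same way
imputed_data_x = pd.DataFrame({"x": [0.01111, 1.0, 7.2]})

# Keep values >= 1, replace everything else with NaN
imputed_data_x = imputed_data_x.where(lambda v: v >= 1, np.nan)

print(imputed_data_x)
```

DataFrame.mask is the inverse: `imputed_data_x.mask(imputed_data_x < 1)` replaces where the condition is true and gives the same result, since NaN is the default fill value.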
maow