I have a dataset with the column RH_1:

    RH_1
   --------
    36.999
    34.555
    36.777
    33.688
    38.999
    37.667
    ...

I want to replace every value that falls within a certain range with NaN. For example, all values in that column within the range 36-37 should become NaN.

So my expected output is:

    RH_1
   --------   
    NaN
    34.555
    NaN
    33.688
    38.999
    37.667

So I was using this code:

    train['RH_1']=train['RH_1'].apply(lambda x: np.NaN if(x in range(36,37)) else x)

But when I run train.isnull().sum(), it still reports no null values in that column, and the code executes without any error.

P.S. I would prefer a solution that uses np.where() inside a lambda function, since I'm practising that. Simpler alternative solutions are also welcome.

P.P.S. I checked out this answer; however, it replaces specific values rather than selecting the values that fall in a range(..).

Debadri Dutta
  • That question takes the whole dataframe into account, not a particular column. I tried to adapt it to a single column, but it didn't work. – Debadri Dutta Aug 14 '18 at 11:41

1 Answer

Use Series.between to build a boolean mask, then apply it with Series.mask or numpy.where. Note that between is inclusive on both ends by default:

    train['RH_1'] = train['RH_1'].mask(train['RH_1'].between(36,37))

Or:

    train['RH_1'] = np.where(train['RH_1'].between(36,37), np.nan, train['RH_1'])
    print(train)
         RH_1
    0     NaN
    1  34.555
    2     NaN
    3  33.688
    4  38.999
    5  37.667
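
As for why the code in the question produced no NaN values: range(36, 37) contains only the integer 36, so a float such as 36.999 is never "in" it. The following is a minimal sketch, assuming the six sample values from the question as the whole column; the variable names unchanged, with_lambda and with_mask are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (assumed to be the full column here).
train = pd.DataFrame({"RH_1": [36.999, 34.555, 36.777, 33.688, 38.999, 37.667]})

# range(36, 37) holds exactly one element, the integer 36.
print(36.999 in range(36, 37))  # False
print(36 in range(36, 37))      # True

# So the lambda from the question leaves every value untouched.
unchanged = train["RH_1"].apply(lambda x: np.nan if x in range(36, 37) else x)

# A chained comparison tests the numeric interval instead (inclusive on both ends).
with_lambda = train["RH_1"].apply(lambda x: np.nan if 36 <= x <= 37 else x)

# Vectorized equivalent using between and mask.
with_mask = train["RH_1"].mask(train["RH_1"].between(36, 37))

print(unchanged.isna().sum(), with_lambda.isna().sum(), with_mask.isna().sum())
```

If a lambda is wanted for practice, the chained comparison works element by element, but the vectorized between version is usually the simpler choice for a whole column.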
jezrael