How to replace a range of values of a Dataframe column with NaN using Numpy?

Question

I have a dataset with the column RH_1:

    RH_1
   --------
    36.999
    34.555
    36.777
    33.688
    38.999
    37.667
    ...

I want to replace the values that fall within a certain range with NaN. For example, I want all the values in that column within the range 36-37 to become NaN.

So my preferred output would be:

    RH_1
   --------   
    NaN
    34.555
    NaN
    33.688
    38.999
    37.667

So I was using this code:

train['RH_1']=train['RH_1'].apply(lambda x: np.NaN if(x in range(36,37)) else x)

But when I run train.isnull().sum(), it still shows no null values in that column, and the code executes without any error.

P.S. I'd prefer to do this using np.where() inside a lambda function, since I'm practising that. Alternative solutions using a simpler method are also welcome.

P.P.S. I checked out this answer; however, it replaces specific values rather than selecting values in a range(..).

That question takes the whole dataframe into account, not a particular column. I tried to adapt it for a single column, but it didn't work out – Debadri Dutta, Aug 14 '18 at 11:41

jezrael · Accepted Answer · 2018-08-14T11:55:11.537


Use between to build a boolean mask, then apply it with Series.mask or numpy.where:

train['RH_1'] = train['RH_1'].mask(train['RH_1'].between(36,37))
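
As a self-contained sketch of the mask approach: between returns a boolean Series (both endpoints are included by default), and mask puts NaN wherever that Series is True. The DataFrame below is rebuilt from the sample values in the question, so its construction is an assumption about what train looks like.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question; the real `train` frame is assumed to look like this.
train = pd.DataFrame({'RH_1': [36.999, 34.555, 36.777, 33.688, 38.999, 37.667]})

# True where 36 <= value <= 37 (both endpoints are included by default).
in_range = train['RH_1'].between(36, 37)

# mask() replaces the entries where the condition is True with NaN.
train['RH_1'] = train['RH_1'].mask(in_range)

# Count the NaN values that were introduced.
null_count = int(train['RH_1'].isnull().sum())
print(null_count)
```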

Or:

train['RH_1'] = np.where(train['RH_1'].between(36,37), np.nan, train['RH_1'])
print(train)
     RH_1
0     NaN
1  34.555
2     NaN
3  33.688
4  38.999
5  37.667
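
As for why the code in the question changes nothing: range(36, 37) contains only the integer 36, so a float such as 36.999 is never "in" it and the lambda always returns x unchanged. If an element-wise apply is still wanted for practice, a chained comparison works instead. The sketch below uses a standalone Series named s (a hypothetical name) built from the question's sample values, and np.nan rather than np.NaN, since the latter alias is not available in NumPy 2.0 and later.

```python
import numpy as np
import pandas as pd

# Standalone Series with the sample values from the question (name `s` is illustrative).
s = pd.Series([36.999, 34.555, 36.777, 33.688, 38.999, 37.667], name='RH_1')

# range(36, 37) holds a single integer, so the membership test fails for 36.999.
print(list(range(36, 37)))
print(36.999 in range(36, 37))

# Element-wise alternative: compare against the bounds directly.
result = s.apply(lambda x: np.nan if 36 <= x <= 37 else x)

# Number of values replaced by NaN.
print(int(result.isnull().sum()))
```

The vectorised between/mask or np.where versions above are usually preferable to apply on larger columns, since they avoid calling a Python function per row.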

edited Aug 14 '18 at 11:55

answered Aug 14 '18 at 11:35


` returned a result with an error set`. I'm getting an error. Also, how is your method filling in `NaN` instead? – Debadri Dutta Aug 14 '18 at 11:36
@DebadriDutta - Check edited answer. – jezrael Aug 14 '18 at 11:55
