I have a column that is part of a DataFrame and looks like this:

df.Days_Since_Earnings
Out[5]: 
0      21.0
2       1.0
4    1000.0
5     500.0
6     119.0
Name: Days_Since_Earnings, Length: 76, dtype: float64

I want to leave it as it is, except that values above 120 should become NaN, so it would look like this:

df.Days_Since_Earnings
Out[5]: 
0      21.0
2       1.0
4       NaN
5       NaN
6     119.0
Name: Days_Since_Earnings, Length: 76, dtype: float64

Thanks to anyone who helps!

orie
  • `df[df.Days_Since_Earnings.gt(120)] = np.nan` – yatu May 21 '20 at 13:28
  • Does this answer your question? [How to set a cell to NaN in a pandas dataframe](https://stackoverflow.com/questions/34794067/how-to-set-a-cell-to-nan-in-a-pandas-dataframe) – deadshot May 21 '20 at 13:30
  • `idx_days_over_120 = df["Days_Since_Earnings"] > 120` and then `df.loc[idx_days_over_120, "Days_Since_Earnings"] = np.nan` – Dan May 21 '20 at 13:49
  • this was helpful, thanks – orie May 21 '20 at 14:49

2 Answers

You can use `mask`, which replaces values with NaN wherever the condition is True:

df['Days_Since_Earnings'] = df.Days_Since_Earnings.mask(df.Days_Since_Earnings > 120)

or `where` with the reversed condition, which keeps values only where the condition is True:

df['Days_Since_Earnings'] = df.Days_Since_Earnings.where(df.Days_Since_Earnings <= 120)

or `loc` assignment, which modifies the column in place (with NumPy imported as `np`):

df.loc[df.Days_Since_Earnings > 120, 'Days_Since_Earnings'] = np.nan
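
As a rough check, the three options above can be run side by side. This is only a sketch: the sample values come from the question, while the `Ticker` column and the names `masked`, `kept`, and `by_loc` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the question; "Ticker" is an assumed extra column.
df = pd.DataFrame(
    {
        "Ticker": ["A", "B", "C", "D", "E"],
        "Days_Since_Earnings": [21.0, 1.0, 1000.0, 500.0, 119.0],
    },
    index=[0, 2, 4, 5, 6],
)

col = df["Days_Since_Earnings"]

# mask: replace with NaN where the condition is True.
masked = col.mask(col > 120)

# where: keep values where the condition is True, NaN elsewhere.
kept = col.where(col <= 120)

# loc: assign NaN in place, on a copy so the original stays untouched.
by_loc = df.copy()
by_loc.loc[by_loc["Days_Since_Earnings"] > 120, "Days_Since_Earnings"] = np.nan

# All three approaches should agree, and other columns are left alone.
assert masked.equals(kept)
assert masked.equals(by_loc["Days_Since_Earnings"])
print(masked)
```

`mask` and `where` return a new Series, so the result has to be assigned back to the column, whereas the `loc` form changes the DataFrame directly.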
Quang Hoang
df['days'] = df['days'].apply(lambda x: np.nan if x > 120 else x)
print(df)

Or, if `days` is the only column (this form sets every column of the matching rows to NaN):

df[df['days'] > 120] = np.nan

    days
0   21.0
1    1.0
2    NaN
3    NaN
4  119.0
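
One thing to watch with the second form: indexing the whole DataFrame with a boolean mask selects rows, so the assignment touches every column. A minimal sketch of the difference, assuming a hypothetical second column `price` that is not in the original post:

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame; "price" is an assumed extra column.
df = pd.DataFrame(
    {
        "days": [21.0, 1.0, 1000.0, 500.0, 119.0],
        "price": [10.0, 20.0, 30.0, 40.0, 50.0],
    }
)

# Boolean indexing on the frame selects whole rows,
# so both "days" and "price" become NaN in the matching rows.
whole_rows = df.copy()
whole_rows[whole_rows["days"] > 120] = np.nan

# Restricting the assignment to one column with loc leaves "price" intact.
one_column = df.copy()
one_column.loc[one_column["days"] > 120, "days"] = np.nan

print(whole_rows)
print(one_column)
```

With a single-column DataFrame, as in the output above, the two forms give the same result.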
NYC Coder
  • Using `apply` for this is likely to be very inefficient: https://stackoverflow.com/a/55557758/1011724 – Dan May 21 '20 at 13:32
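
To see the point about `apply` on your own machine, a small timing sketch like the following may help. The random data, the Series size, and the variable names are all assumptions for illustration; actual timings will vary by pandas version and hardware.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 100,000 random day counts between 0 and 999.
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 1000, size=100_000).astype(float))

# Element-wise Python function versus a vectorized mask.
via_apply = s.apply(lambda x: np.nan if x > 120 else x)
via_mask = s.mask(s > 120)

# Both approaches should produce the same values.
assert via_apply.equals(via_mask)

# Time each approach over a few repetitions.
t_apply = timeit.timeit(
    lambda: s.apply(lambda x: np.nan if x > 120 else x), number=3
)
t_mask = timeit.timeit(lambda: s.mask(s > 120), number=3)
print(f"apply: {t_apply:.4f}s, mask: {t_mask:.4f}s")
```

`apply` calls the lambda once per element in Python, while `mask` evaluates the comparison and replacement on the whole array at once.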