
I was trying to use Series.where() in pandas to replace the negative values in a DataFrame column with their absolute values.

The most obvious way to get rid of the negative values is to simply call .abs() on the column. So:

import pandas as pd

frame = pd.DataFrame([-1,-1,-3,-4,-5],columns=["amount"])

frame.amount = frame.amount.abs()

But I wanted to do the same thing using .where(). So I tried the following:

frame.amount = frame["amount"].where(frame["amount"] < 0, frame["amount"].abs(), inplace=True)

Which returns:

Python 3.6.1 (default, Dec 2015, 13:05:11)
[GCC 4.8.2] on linux
  amount
0   None
1   None
2   None
3   None
4   None

Two things confused me:

  • I was surprised that I had to assign each expression back (frame.amount = ...), because I thought that calling the operation would mutate the DataFrame in either case (isn't that what 'inplace' should do?), and
  • why does .where() return 'None'?
MikeB2019x
  • Use `frame.amount = frame["amount"].where(frame["amount"] > 0, frame["amount"].abs())` – jezrael May 27 '19 at 13:59
  • Just change the `<` to `>` and remove the `inplace` parameter. Note that the `DataFrame.where` documentation states: "Replace values where the condition is **False**." – Mohit Motwani May 27 '19 at 14:01
  • Not sure why they differed from the `np.where` logic. This is kind of counterintuitive, and the docs on the `.where` method are not very explicit. – Erfan May 27 '19 at 14:04

1 Answer


Try with:

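# where() keeps values where the condition is True, so negate the "< 0" test
frame["amount"].where(~(frame["amount"] < 0), frame["amount"].abs(), inplace=True)
# Equivalent alternative using mask(), which replaces where the condition is True:
# frame["amount"].mask(frame["amount"] < 0, frame["amount"].abs(), inplace=True)
print(frame)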

The Series.where() documentation says:

Where cond is True, keep the original value. Where False, replace with corresponding value from other.

So a value is replaced only where the condition is False. Series.mask() is the mirror image; its documentation says:

Replace values where the condition is True.

So mask() with the un-negated condition (frame["amount"] < 0) does the same job.
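
To make the where()/mask() relationship concrete, here is a minimal sketch. The mixed-sign sample data and the names `via_where` and `via_mask` are only illustrative and are not from the question. It assigns the result to new variables rather than using inplace, so nothing is mutated.

```python
import pandas as pd

# Illustrative data with both negative and positive values
frame = pd.DataFrame([-1, 2, -3, 4, -5], columns=["amount"])
amount = frame["amount"]

# where(): keep values where the condition is True, replace the rest
via_where = amount.where(amount >= 0, amount.abs())

# mask(): replace values where the condition is True, keep the rest
via_mask = amount.mask(amount < 0, amount.abs())

print(via_where.tolist())
print(via_where.equals(via_mask))
```

Both calls should give the same Series as amount.abs(); the only difference is which side of the condition gets replaced.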

Regarding inplace=True: when you use it, there is no need to assign the result back, because the operation modifies the Series in place. The effect is the same as leaving inplace out and assigning the result back. A call with inplace=True returns None, so if you assign that return value back to the column, you will be left with None.
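
A small sketch of that return-value behaviour, under a couple of assumptions: it uses a standalone Series (the names `standalone`, `result`, and `returned` are made up for illustration) instead of a column selected from the frame, because whether an in-place call on frame["amount"] propagates back to the frame may depend on the pandas version and its copy-on-write settings.

```python
import pandas as pd

amount = pd.Series([-1, 2, -3], name="amount")

# Without inplace: where() returns a new Series; the original is untouched
result = amount.where(amount >= 0, amount.abs())

# With inplace=True: the Series itself is modified and the call returns None
standalone = amount.copy()
returned = standalone.where(standalone >= 0, standalone.abs(), inplace=True)

print(result.tolist())
print(returned)  # None
print(standalone.tolist())
```

This is why `frame.amount = frame["amount"].where(..., inplace=True)` fills the column with None: the right-hand side evaluates to None before the assignment happens. Use either the assignment or inplace=True, not both.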

anky