I have an np.where problem using Pandas that is driving me crazy, and I can't seem to solve it through Google, the documentation, etc.

I'm hoping someone has insight. I'm sure it isn't complex.

I have a df where I'm checking the value in one column - and if that value is 'n/a' (as a string, not as in .isnull()), changing it to another value.

Full_Names_Test_2['MarketCap'] == 'n/a'

returns:

70      True
88     False
90      True
145     True
156     True
181     True
191     True
200     True
219     True
223    False
Name: MarketCap, dtype: bool

so that part works.

but this:

Full_Names_Test_2['NewColumn'] = np.where(Full_Names_Test_2['MarketCap'] == 'n/a', 7)

returns:

ValueError: either both or neither of x and y should be given

What is going on?

DSM
Windstorm1981

1 Answer

You need to pass the boolean mask and both value arguments: x (used where the mask is True) and y (used where it is False):

np.where(Full_Names_Test_2['MarketCap'] == 'n/a', 7)
# should be
np.where(Full_Names_Test_2['MarketCap'] == 'n/a', 7, Full_Names_Test_2['MarketCap'])

See the np.where docs.
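
As a minimal sketch of the three-argument form, assuming a small made-up frame (`df` and its values are hypothetical stand-ins, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Full_Names_Test_2; the values are made up.
df = pd.DataFrame({"MarketCap": ["n/a", "1.2B", "n/a", "350M"]})

mask = df["MarketCap"] == "n/a"

# np.where(condition, x, y): take x where the mask is True, y elsewhere.
# Here 'n/a' rows become 7 and every other row keeps its original value.
df["NewColumn"] = np.where(mask, 7, df["MarketCap"])

# With only the condition, np.where instead returns the positions
# of the True entries (as a tuple of index arrays).
true_positions = np.where(mask)[0]
```

Passing the condition plus exactly one value is the only combination that is rejected, which is what the ValueError in the question is reporting.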

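Alternatively, use the Series where method. It keeps the values where the condition is True and replaces the rest, so the condition is inverted here:

Full_Names_Test_2['MarketCap'].where(Full_Names_Test_2['MarketCap'] != 'n/a', 7)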
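
The direction of the condition in the Series method is easy to get backwards, so here is a small sketch on made-up data (the Series `s` is hypothetical): `Series.where` keeps values where the condition is True, while `Series.mask` replaces values where it is True.

```python
import pandas as pd

# Hypothetical data; object dtype is assumed, as for a column of mixed strings.
s = pd.Series(["n/a", "1.2B", "n/a", "350M"], name="MarketCap", dtype=object)
is_na = s == "n/a"

# where: keep entries where the condition holds, replace the others with 7.
kept = s.where(~is_na, 7)

# mask: replace entries where the condition holds, keep the others.
masked = s.mask(is_na, 7)
```

Both calls give the same result here; `mask` may read more naturally when the condition describes the rows to be replaced.
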
Andy Hayden
  • I'm such an idiot. I think I didn't grasp the basic syntax of the np.where method. Now I see clearly. thanks again! – Windstorm1981 Oct 21 '15 at 18:42
  • @Windstorm1981 fwiw, I think the docs on this method & this error message could be a LOT clearer. It's not obvious (enough, IMO) that the 2nd argument is required. – szeitlin Jan 27 '16 at 12:20
  • This error also arises if you put `x` and `y` in an iterable (i.e. `[x,y]`) even though the Docstring contains `numpy.where(condition, [x, y])` – johnDanger Jun 16 '20 at 18:17