
In pandas, I'd like to create a computed column that's a boolean operation on two other columns.

In pandas, it's easy to add two numerical columns together. I'd like to do something similar with the logical AND operator. Here's my first try:

In [1]: d = pandas.DataFrame([{'foo':True, 'bar':True}, {'foo':True, 'bar':False}, {'foo':False, 'bar':False}])

In [2]: d
Out[2]: 
     bar    foo
0   True   True
1  False   True
2  False  False

In [3]: d.bar and d.foo   ## can't
...
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I guess logical operators don't work quite the same way as numeric operators in pandas. I tried doing what the error message suggests and using bool():

In [258]: d.bar.bool() and d.foo.bool()  ## spoiler: this doesn't work either
...
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I found a way that works by casting the boolean columns to int, adding them together and evaluating as a boolean.

In [4]: (d.bar.apply(int) + d.foo.apply(int)) > 0  ## Logical OR
Out[4]: 
0     True
1     True
2    False
dtype: bool

In [5]: (d.bar.apply(int) + d.foo.apply(int)) > 1  ## Logical AND
Out[5]: 
0     True
1    False
2    False
dtype: bool

This is convoluted. Is there a better way?

dinosaur

3 Answers


Yes, there is a better way! Just use `&`, the element-wise logical AND operator:

d.bar & d.foo

0     True
1    False
2    False
dtype: bool
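
For the computed column the question asks about, the result can be assigned straight back to the frame. This is a minimal sketch, assuming the same data as `d` in the question; the column names `both`, `either`, and `not_bar` are made up for illustration.

```python
import pandas as pd

# Same data as the question's DataFrame
d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# Element-wise logical operators on boolean columns
d["both"] = d["bar"] & d["foo"]     # AND
d["either"] = d["bar"] | d["foo"]   # OR
d["not_bar"] = ~d["bar"]            # NOT

print(d)
```

The Python keywords `and`, `or`, and `not` ask for a single truth value of the whole Series, which is what triggers the `ValueError` in the question; `&`, `|`, and `~` work row by row instead.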
Kirell
  • @dinosaur Yes, there are examples of using `&` and `|` in [the boolean indexing section](http://pandas.pydata.org/pandas-docs/stable/indexing.html#boolean-indexing) – joelostblom May 31 '17 at 12:04
  • Beware this works strictly with boolean arrays. When dealing with other types like `Int32`, you get an error. – Ronan Paixão Apr 04 '22 at 18:30
  • A warning: `&` binds more tightly than comparison operators such as `==`, so in more complex expressions don't forget the parentheses (as I did...). E.g.: `(df.foo==0) & (df.bar==0)` – PiWi Oct 22 '22 at 00:41
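
To illustrate the precedence point from the comment above, here is a small sketch using the question's data. The variable names `neither` and `raised` are only for illustration.

```python
import pandas as pd

d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# With parentheses: each comparison is evaluated first, then combined.
neither = (d["foo"] == False) & (d["bar"] == False)
print(neither.tolist())

# Without parentheses, Python parses the expression as
#   d["foo"] == (False & d["bar"]) == False
# which is a chained comparison and needs the truth value of a Series.
try:
    d["foo"] == False & d["bar"] == False
    raised = False
except ValueError:
    raised = True
print(raised)
```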

There is also another way: you can just multiply for AND or add for OR, without the conversion and extra comparison you used.

AND operation:

d.foo * d.bar

OR operation:

d.foo + d.bar 
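
A quick way to check this against `&` and `|` is sketched below, assuming both columns have boolean dtype as in the question. The `.astype(bool)` call is a precaution added here, not part of the answer: with integer columns a sum can exceed 1, so the arithmetic form is not a general substitute for the logical operators.

```python
import pandas as pd

d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

and_arith = d["foo"] * d["bar"]   # multiply for AND
or_arith = d["foo"] + d["bar"]    # add for OR

# Compare against the logical operators, normalising the dtype first
same_and = and_arith.astype(bool).equals(d["foo"] & d["bar"])
same_or = or_arith.astype(bool).equals(d["foo"] | d["bar"])
print(same_and, same_or)
```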
Mbuthia
d[(d['bar']) & (d['foo'])] 
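
Note that this line uses the `&` result as a boolean mask to select rows, rather than storing it as a new column. A minimal sketch with the question's data, where the names `mask` and `rows` are only for illustration:

```python
import pandas as pd

d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# Element-wise AND gives one boolean per row
mask = d["bar"] & d["foo"]

# Indexing with the mask keeps only the rows where it is True
rows = d[mask]
print(rows)
```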
gre_gor
  • While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply. – gre_gor Feb 24 '22 at 08:13
  • @gre_gor the answer is pretty self explanatory. I liked it – Jasmeet Singh Nov 25 '22 at 11:29