5

I have a simple pandas data frame.

import pandas as pd
x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel', 'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']

df = pd.DataFrame(dict(x=x, y=y, users=users))

I want to keep certain rows of the data frame. Let's say all "rachels" and "jeffs". I tried df.query:

df=df.query('users=="rachel"' or 'users=="jeff"')

The result is a data frame only with users=="rachel". Is there a way to combine queries?

Rachel
  • `df.query('(users=="rachel") or (users=="jeff")')` or even `df.query('users=="rachel" or users=="jeff"')` will do the trick. Tested with `pandas==1.2.4`. – banderlog013 Dec 24 '21 at 11:13

2 Answers

21

The standard way is to use the bitwise OR operator `|`, which combines two boolean Series element-wise. For a clear explanation of why, I'd suggest checking out this answer. You also need parentheses around each condition, because `|` has higher precedence than `==` in Python.

df[(df.users == 'rachel') | (df.users == 'jeff')]
    users   x    y
3  rachel  30  200
4  rachel   5  300
5  rachel  10  300
6    jeff  20  400
7    jeff  30  400
8    jeff   5  500
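
As a small illustration of why the original attempt returned only the rachel rows: Python evaluates the `or` between the two string literals before `query` ever sees them, and `or` on two non-empty strings returns the first one. A minimal sketch using the data from the question (the names `expr`, `mask` and `result` are just for illustration):

```python
import pandas as pd

x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel',
         'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']
df = pd.DataFrame(dict(x=x, y=y, users=users))

# `or` between two non-empty strings yields the first string,
# so query() would only ever receive the rachel condition.
expr = 'users=="rachel"' or 'users=="jeff"'
print(expr)

# Element-wise OR on two boolean masks. The parentheses are needed
# because | binds more tightly than == in Python.
mask = (df.users == 'rachel') | (df.users == 'jeff')
result = df[mask]
print(result)
```
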

With query, you can use the same `|` operator inside the expression string (the keyword `or` works there too):

df.query("users=='rachel' | users=='jeff'")
    users   x    y
3  rachel  30  200
4  rachel   5  300
5  rachel  10  300
6    jeff  20  400
7    jeff  30  400
8    jeff   5  500
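
For comparison, here is a hedged sketch of a few spellings that should select the same rows: `|` and `or` inside the query string, the `in` operator that query expressions support, and `Series.isin` on the plain DataFrame. The variable names (`by_pipe`, `by_or`, `by_in`, `by_isin`) are made up for this example.

```python
import pandas as pd

x = [5, 10, 20, 30, 5, 10, 20, 30, 5, 10, 20, 30]
y = [100, 100, 200, 200, 300, 300, 400, 400, 500, 500, 600, 600]
users = ['mark', 'mark', 'mark', 'rachel', 'rachel', 'rachel',
         'jeff', 'jeff', 'jeff', 'lauren', 'lauren', 'lauren']
df = pd.DataFrame(dict(x=x, y=y, users=users))

# The whole condition lives inside ONE string, so pandas parses the OR itself.
by_pipe = df.query("users == 'rachel' | users == 'jeff'")
by_or = df.query("users == 'rachel' or users == 'jeff'")

# Membership tests scale better when there are many names to keep.
by_in = df.query("users in ['rachel', 'jeff']")
by_isin = df[df.users.isin(['rachel', 'jeff'])]

# Each comparison checks that the selection matches the first one.
print(by_pipe.equals(by_or), by_pipe.equals(by_in), by_pipe.equals(by_isin))
```

The membership forms are usually easier to maintain once the list of names grows beyond two or three.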
Nick Becker
  • No worries. @EdChum's comment is also a simple solution. – Nick Becker Jan 04 '17 at 16:31
  • How would you create logic to show only results where name is either rachel or jeff, AND hometown was Chicago? So all rachels from Chicago, and all jeffs from Chicago, but not steves from chicago, or rachels from Atlanta. Could you use "users =='rachel' | users=='jeff' & hometown=='chicago'", or would the AND only apply to the jeffs, and you need to include the " & hometown=='Chicago'" to both sides of the OR? – Korzak Jan 24 '18 at 18:54
  • @Korzak I don't know if you still need an answer, but I assume using () does the trick. The statement would be "(users=='rachel' | users=='jeff') & hometown=='chicago'". The inner expression filters the names, and the outer condition keeps only the rachels and jeffs from chicago. – Phil Jan 20 '23 at 06:54
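
To make the grouping question from the comments concrete, here is a minimal sketch. The `hometown` column and the small `people` frame are hypothetical (they are not part of the original question's data). Inside a query string, `&` and `|` take the precedence of `and` and `or`, so without parentheses the hometown condition attaches only to the jeff condition.

```python
import pandas as pd

# Hypothetical data: the hometown column is invented for this example.
people = pd.DataFrame({
    'users': ['rachel', 'rachel', 'jeff', 'jeff', 'steve'],
    'hometown': ['chicago', 'atlanta', 'chicago', 'atlanta', 'chicago'],
})

# Parentheses group the name test, then the hometown test applies to both names.
grouped = people.query("(users=='rachel' | users=='jeff') & hometown=='chicago'")

# Without parentheses this reads as: rachel OR (jeff AND chicago),
# so the rachel from atlanta is kept as well.
ungrouped = people.query("users=='rachel' | users=='jeff' & hometown=='chicago'")

print(grouped)
print(ungrouped)
```
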
-1

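Another way is to run the two queries separately and concatenate the results. DataFrame.append was removed in pandas 2.0, so this uses pd.concat instead:

df = pd.concat([df.query('users=="rachel"'), df.query('users=="jeff"')])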
Julien Marrec
Mahesh