
I have a dataframe that looks like the one below.

   text      country    language
-----------------------------------
  football     US         Eng
  baseball     JP         Jpn
  swimming     UK         Eng
  running      FR         Fra
  rugby        NZ         Eng
  Hockey       NL         Dut

In Python, I want to extract the rows whose 'text' column contains either of the strings 'ball' or 'ing', and build a new dataframe from those rows, like the one below.

   text      country    language
-----------------------------------
  football     US         Eng
  baseball     JP         Jpn
  swimming     UK         Eng
  running      FR         Fra
Edward M.

1 Answer


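Using pandas, you can filter rows with multiple conditions by combining boolean masks. Just watch out for the parentheses: each condition needs its own pair, because `|` and `&` bind more tightly than comparisons.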

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'fooing', 'barball'],
                   'B': [1, 2, 3, 4]})

# Keep the rows where column A contains 'ing' or 'ball'.
df_slice = df[(df.A.str.contains('ing')) | (df.A.str.contains('ball'))]

That should yield

df_slice
         A  B
2   fooing  3
3  barball  4

If your goal is to keep only the words that end in 'ing' or 'ball', use `str.endswith()` instead of `str.contains()` in the conditions; `contains()` matches the substring anywhere in the value. Hope it helps!
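Applied to the dataframe from the question, the same idea might look like the sketch below. The column names and values are copied from the question; the variable names `mask` and `result` are only illustrative, and `reset_index(drop=True)` is an optional extra for renumbering the rows of the new dataframe.

```python
import pandas as pd

# The dataframe from the question.
df = pd.DataFrame({
    'text': ['football', 'baseball', 'swimming', 'running', 'rugby', 'Hockey'],
    'country': ['US', 'JP', 'UK', 'FR', 'NZ', 'NL'],
    'language': ['Eng', 'Jpn', 'Eng', 'Fra', 'Eng', 'Dut'],
})

# Boolean mask: True where 'text' ends in 'ball' or in 'ing'.
mask = df['text'].str.endswith('ball') | df['text'].str.endswith('ing')

# Select the matching rows and renumber them from 0 (optional).
result = df[mask].reset_index(drop=True)
print(result)
```

For this particular data, `str.contains()` would select the same four rows, since none of the other words contain 'ball' or 'ing' in the middle.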

Ricardo
  • Thanks! Can I ask one more question? What if there are multiple conditions (let's say more than two)? Is it possible to do something like condition = [~,~,~,~], df.A.str.contains(condition)? – Edward M. Nov 20 '17 at 14:27
  • I am not sure I understand your question. `df.A.str.contains()` takes a single `str` as its pattern argument, not a list of conditions. You can, of course, chain conditions with `&` and `|`, wrapping each one in parentheses. The goal is to build a boolean series of `True`/`False` elements and use it to index the dataframe; that produces the requested slice. – Ricardo Nov 20 '17 at 17:57
  • I see. Thanks a lot! – Edward M. Nov 21 '17 at 15:01
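
On the follow-up about more than two conditions: one possible approach, not mentioned in the thread, relies on `str.contains()` treating its single string pattern as a regular expression by default, so a list of substrings can be joined with `|` into one pattern. The sketch below assumes a hypothetical list `patterns`; `re.escape` is used so that any regex metacharacters in the substrings are matched literally.

```python
import re

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'fooing', 'barball'],
                   'B': [1, 2, 3, 4]})

# Hypothetical list of substrings to look for; extend as needed.
patterns = ['ing', 'ball', 'xyz']

# Join the escaped substrings into one alternation pattern: 'ing|ball|xyz'.
regex = '|'.join(re.escape(p) for p in patterns)

# str.contains interprets the pattern as a regex by default (regex=True),
# so a row matches if it contains any one of the substrings.
df_slice = df[df['A'].str.contains(regex)]
print(df_slice)
```

If the column may hold missing values, passing `na=False` to `str.contains()` keeps the mask strictly boolean.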