6
import pandas as pd

df = pd.DataFrame({"A": ["one", "two", "three"], "B": ["fopur", "give", "six"]})

When I do:

df.B.str.contains("six").any()
Out[2]: True

But when I do:

df.B.str.contains("six)").any()

I get the following error:

C:\ProgramData\Anaconda3\lib\sre_parse.py in parse(str, flags, pattern)
    868     if source.next is not None:
    869         assert source.next == ")"
--> 870         raise source.error("unbalanced parenthesis")
    871 
    872     if flags & SRE_FLAG_DEBUG:

error: unbalanced parenthesis at position 3

Please help!

Nisarg Shah
Pyd

2 Answers

10

You can set regex=False in pandas.Series.str.contains so that the pattern is treated as a literal string:

df.B.str.contains("six)", regex=False).any()

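If you want to match irrespective of case:

df.B.str.contains("Six)", case=False, regex=False).any()

With the df from the question this returns False, because no value in B contains the literal text six) in any case. It returns True only when a value such as "six)" or "SIX)" is actually present in the column.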

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.contains.html

Info:

Parentheses are special characters in regular expressions and need to be escaped when you want to match them literally.
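
As a rough illustration of why the unescaped pattern fails, here is a minimal sketch using only the standard-library re module, which str.contains relies on when regex=True. The variable names (needle, compiled_ok, escaped, match) and the sample sentence are made up for this example.

```python
import re

needle = "six)"  # the literal text from the question

# Compiling the raw text as a pattern fails: ")" closes a group that was never opened.
try:
    re.compile(needle)
    compiled_ok = True
except re.error:
    compiled_ok = False

# re.escape puts a backslash in front of regex metacharacters such as ")".
escaped = re.escape(needle)

# The escaped pattern matches the literal text (the sample sentence is hypothetical).
match = re.search(escaped, "one two six) seven")

print(compiled_ok)
print(escaped)
print(match.group(0))
```

Passing regex=False sidesteps this entirely, since the text is then never compiled as a pattern.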

Joe
8

You need to escape ) with \ because it is a special regex character (use a raw string so that Python does not treat \) as an invalid escape sequence):

df.B.str.contains(r"six\)").any()

More general:

import re

df.B.str.contains(re.escape("six)")).any()
jezrael
  • Actually, I am passing the items of a list one by one to check whether each one exists, so I cannot hardcode the escape like this. My actual code is like `for item in mylist: if df.B.str.contains(item): print(item)` – Pyd Feb 09 '18 at 06:27
  • @pyd. How does that prevent you from properly escaping the strings? – Mad Physicist Feb 09 '18 at 06:33
  • Yes, there is a difference. Joe's solution does not escape anything, because with regex=False the pattern is not compiled as a regex (I hope that is good wording). So if you use some regex, my solution works and Joe's solution does not. – jezrael Feb 09 '18 at 06:34
  • But if you don't want to use regex, then you need Joe's solution; if you need regex (e.g. | to join strings), you need my solution. – jezrael Feb 09 '18 at 06:35
  • Thank you @jezrael – Pyd Feb 09 '18 at 06:38
  • You need `df.B.str.contains("Six", case=False).any()`; untested, because I am on my phone only. – jezrael Feb 16 '18 at 05:12
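
For the loop over a list discussed in the comments, a minimal sketch of both approaches might look like the following. The contents of mylist and the "six)" value placed in column B are assumptions made for illustration, not the asker's real data. Note that the check inside the loop also needs .any(), since a boolean Series cannot be used directly in an if.

```python
import re

import pandas as pd

# Assumed sample data: column B contains the literal text "six)".
df = pd.DataFrame({"A": ["one", "two", "three"], "B": ["fopur", "give", "six)"]})

# Hypothetical search terms, some containing regex metacharacters.
mylist = ["six)", "give", "(seven"]

# Approach 1: treat every item as plain text, so no escaping is needed.
found = [item for item in mylist if df.B.str.contains(item, regex=False).any()]

# Approach 2: escape every item, then join with | to test all of them in one regex.
pattern = "|".join(re.escape(item) for item in mylist)
any_match = bool(df.B.str.contains(pattern).any())

print(found)
print(any_match)
```

The first approach reports which items were found; the second only answers whether any of them occurs, but does so in a single pass over the column.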