I have the following data frame:

>>> df = pd.DataFrame(['as,df','as.df'])
>>> df
       0
0  as,df
1  as.df

I would like to filter the above data frame with a string that must match exactly, except for case. I tried the following, but it does not differentiate between . and ,

>>> df[0].str.match('^As.df+$', case=False)
0     True
1     True
Name: 0, dtype: bool

Can you please help me resolve this issue?

user26249

1 Answer

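Use a backslash to escape the dot, so that it matches a literal . instead of any character: '^as\.df$'. The trailing + in your pattern is not needed for an exact match.

>>> df[0].str.match(r'^as\.df$', case=False)
0    False
1     True
Name: 0, dtype: bool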

To see when (and how) to escape special characters in regex, see this question: What special characters must be escaped in regular expressions?


If the search string is not under your control, pass it through re.escape before adding your own regex characters (such as ^ and $). That way any dots, square brackets, or other special characters it contains are matched literally instead of being interpreted as regex syntax.
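As a rough sketch of that idea, using the kind of search strings mentioned in the comments: the helper name `exact_pattern` and the `candidates` list are made up for illustration, and plain `re` stands in for pandas here so the snippet runs on its own.

```python
import re

def exact_pattern(literal):
    """Build a regex that matches `literal` exactly (hypothetical helper)."""
    # re.escape backslash-escapes regex metacharacters such as '.',
    # so they match themselves instead of acting as wildcards.
    return '^' + re.escape(literal) + '$'

pattern = exact_pattern('42.1.1.p4')

# Illustrative inputs: same text in a different case, commas instead
# of dots, and an extra trailing character.
candidates = ['42.1.1.P4', '42,1,1.P4', '42.1.1.p4x']

# re.IGNORECASE plays the role of case=False in Series.str.match.
matches = [bool(re.match(pattern, s, flags=re.IGNORECASE)) for s in candidates]
print(matches)  # [True, False, False]
```

With the data frame from the question, passing the same kind of pattern to `df[0].str.match(exact_pattern('As.df'), case=False)` should then be True only for the row containing as.df.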

musically_ut
  • Thanks for the answer, it helps partially. But I do not have control over the string to be searched for, so I cannot manually edit it apart from adding regex characters at its beginning and end. Sometimes the search string could be 42.1.1.p4 or 42,1,1.P4, etc. To follow your suggestion, I would first have to replace every . in the string with \. – user26249 Sep 01 '15 at 11:43
  • @user26249 Updated the answer. Use `re.escape` before adding `^`, `+`, `$`, etc. – musically_ut Sep 01 '15 at 11:46
  • Thanks. That answers my question. – user26249 Sep 01 '15 at 11:51