
I'm struggling to debug a regex in my Python code in PyCharm.

The idea: I want to find every occurrence of 'here we are', with or without 'attention' in front of it. The word 'attention' may be separated from the rest by whitespace, a dot, a comma, or an exclamation mark.

I expected this expression to do the job:

r'(attention.{0,2})?here we are'

Online services like https://regex101.com/ and https://pythex.org/ confirm that my expression is correct, and I get the expected "attention! here we are".

However, when I run the code below in PyCharm, I get a result I did not expect:

import re

my_string_1 = 'attention! here we are!'
my_list = re.findall(r'(attention.{0,2})?here we are', my_string_1)
print(my_list)

>>> ['attention! ']

Could someone point me to the reason why the result in PyCharm is different? Thanks.

Mike

2 Answers


If the pattern contains capturing groups, re.findall returns the text captured by those groups instead of the entire match. Here the only group is (attention.{0,2}), so only the part it matched is returned.

To avoid that, use a non-capturing group instead:

r'(?:attention.{0,2})?here we are'
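
To see the difference side by side, here is a minimal sketch that runs both patterns on the string from the question (the variable names captured and whole are only for illustration):

```python
import re

my_string_1 = 'attention! here we are!'

# Capturing group: findall returns only what the group matched.
captured = re.findall(r'(attention.{0,2})?here we are', my_string_1)

# Non-capturing group: findall returns the whole match.
whole = re.findall(r'(?:attention.{0,2})?here we are', my_string_1)

print(captured)  # ['attention! ']
print(whole)     # ['attention! here we are']
```
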
Cubix48
  • Oh, many thanks, it was a feature of re.findall that I hadn't considered. Thanks a lot, it helped! @Cubix48 – Mike Mar 11 '22 at 18:50

The following gives the same result as the online tools:

import re

my_string_1 = 'attention! here we are!'
pat = r'(attention.{0,2})?here we are'
x = re.search(pat, my_string_1)
print(x.group())

>>> attention! here we are

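re.search and re.findall work differently. re.search returns a match object for the first match, and calling group() on it with no arguments gives the whole matched text. re.findall returns a list of strings, and when the pattern contains a capturing group, those strings are the group contents rather than the whole matches. That was not immediately obvious to me when I first started learning regex.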
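
If you need every match rather than just the first, and want to keep the capturing group, re.finditer might be worth a look: it yields match objects like the one re.search returns. A rough sketch, where my_string_2 is a made-up test string and not from the question:

```python
import re

# Hypothetical input with two occurrences, one without 'attention'.
my_string_2 = 'attention! here we are! And later: here we are.'
pat = re.compile(r'(attention.{0,2})?here we are')

# group(0) is the whole match; group(1) is the optional prefix,
# which is None when the group did not take part in the match.
whole_matches = [m.group(0) for m in pat.finditer(my_string_2)]
prefixes = [m.group(1) for m in pat.finditer(my_string_2)]

print(whole_matches)  # ['attention! here we are', 'here we are']
print(prefixes)       # ['attention! ', None]
```
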

Marcel Wilson