
I have a string:

a="12cdanfaw3i8hanjwofaef56ghah398hafadsf12cds;dkh38hfasdf56ghaldkshf12cdasdiuhf93f2asdf56gh"

I'm trying to extract every substring that sits between 12cd and 56gh. The expected values are anfaw3i8hanjwofaef, s;dkh38hfasdf, and asdiuhf93f2asdf.

The regex that I have is re.findall(r'12cd.*56gh', a).

But the delimiters 12cd and 56gh are included in the output.

How do I write the regex so that they are not included?

Thanks

Jan
Carol Ward

1 Answer


You need a non-greedy (lazy) quantifier to get all 3 matches, and a capturing group to leave the delimiters out of the result: when the pattern contains a single group, re.findall returns only what that group captured. So use 12cd(.*?)56gh

import re
print(re.findall(r'12cd(.*?)56gh', '12cdanfaw3i8hanjwofaef56ghah398hafadsf12cds;dkh38hfasdf56ghaldkshf12cdasdiuhf93f2asdf56gh'))

Output:

['anfaw3i8hanjwofaef', 's;dkh38hfasdf', 'asdiuhf93f2asdf']
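To see why both changes matter, here is a small side-by-side sketch of the original greedy pattern and the lazy, grouped one. The variable names greedy and lazy are only for illustration; the string is the one from the question.

```python
import re

a = "12cdanfaw3i8hanjwofaef56ghah398hafadsf12cds;dkh38hfasdf56ghaldkshf12cdasdiuhf93f2asdf56gh"

# Greedy: .* runs on to the LAST 56gh, so there is a single match
# spanning the whole string, delimiters included.
greedy = re.findall(r'12cd.*56gh', a)

# Lazy + capturing group: .*? stops at the NEAREST 56gh, and findall
# returns only the text captured by the group.
lazy = re.findall(r'12cd(.*?)56gh', a)

print(len(greedy))  # 1
print(lazy)         # ['anfaw3i8hanjwofaef', 's;dkh38hfasdf', 'asdiuhf93f2asdf']
```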

Explanation

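12cd              // matches the literal text 12cd
    (             // start of capturing group 1
      .*?         // matches any character except a newline, zero or more times, as few as possible (lazy)
    )             // end of capturing group 1
56gh              // matches the literal text 56gh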
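One caveat that may or may not apply to your data: by default . does not match a newline, so a value that spans several lines would be missed. A minimal sketch, using a made-up two-line string (not from the question), of how the re.DOTALL flag changes that:

```python
import re

# Hypothetical input where the wanted text contains a line break.
multiline = "12cdfoo\nbar56gh"

# Default behaviour: . stops at the newline, so nothing matches.
without_flag = re.findall(r'12cd(.*?)56gh', multiline)

# With re.DOTALL, . also matches newlines.
with_flag = re.findall(r'12cd(.*?)56gh', multiline, re.DOTALL)

print(without_flag)  # []
print(with_flag)     # ['foo\nbar']
```

If your input is always a single line, as in the question, the flag is not needed.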
user3483203