
I am trying to build a list of the numbers stated in a string with a specific format:

Paragraph 1, 2, 3 and 4 refer to blah-blah. Paragraph 5 refers to blah-blah.

The expected output should be two lists [1,2,3,4] and [5].

I've tried the following regex: `r"(?:Paragraph)\s+(\d)((?:,\s*\d)*)(?:\s*and\s*)?(\d)*"`, but I am stuck on the pattern for the repeated match, `((?:,\s*\d)*)`, which returns Group 2 as `, 2, 3`:

Match 1: 1
Group 1: 1
Group 2: , 2, 3
Group 3: 4

Match 2: 1
Group 1: 5
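For reference, the result above can be reproduced with Python's standard `re` module. This is a minimal sketch, assuming `Paragraph` as the leading keyword so that the pattern matches the sample string; the variable names are only illustrative. Groups that match nothing show up as an empty string or `None` rather than being omitted.

```python
import re

text = "Paragraph 1, 2, 3 and 4 refer to blah-blah. Paragraph 5 refers to blah-blah."

# Same pattern as in the question, with "Paragraph" as the keyword (assumption).
pattern = re.compile(r"(?:Paragraph)\s+(\d)((?:,\s*\d)*)(?:\s*and\s*)?(\d)*")

# Collect the capture groups of every match.
groups_per_match = [m.groups() for m in pattern.finditer(text)]

for groups in groups_per_match:
    print(groups)
# ('1', ', 2, 3', '4')
# ('5', '', None)
```

Group 2 wraps the whole repetition, so it holds the complete `, 2, 3` substring instead of one number per group.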

What I'd like to get is:

Match 1: 1
Group 1: 1
Group 2: 2
Group 3: 3
Group 4: 4

Match 2: 1
Group 1: 5

Could someone please give me a hint?

Thank you.

Pimmy
  • It is correct; this is how it works. To access the individual captures, use the PyPI `regex` module and `match.captures(groupID)`. – Wiktor Stribiżew Dec 24 '21 at 17:04
  • How about just `Regex.Replace("[^P0-9 ]", "")`, which will reduce the string to something like `"P 1 2 3 4 P 5"`? Then you can split on P, then on spaces. – Caius Jard Dec 24 '21 at 17:12
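One way to get the two lists with only the standard `re` module is a two-step approach: first match each whole number list after the keyword, then pull the individual numbers out of that match. This is a minimal sketch, not the only option; it assumes the keyword is `Paragraph` and that the numbers are separated by commas with an optional final `and`. The function name `extract_number_lists` is hypothetical.

```python
import re

text = "Paragraph 1, 2, 3 and 4 refer to blah-blah. Paragraph 5 refers to blah-blah."

# Step 1: capture the whole list, e.g. "1, 2, 3 and 4", in a single group.
list_pattern = re.compile(r"Paragraph\s+(\d+(?:\s*,\s*\d+)*(?:\s*and\s*\d+)?)")


def extract_number_lists(s):
    """Return one list of ints per 'Paragraph ...' reference found in s."""
    return [
        # Step 2: split the captured list into its individual numbers.
        [int(n) for n in re.findall(r"\d+", m.group(1))]
        for m in list_pattern.finditer(s)
    ]


result = extract_number_lists(text)
print(result)  # [[1, 2, 3, 4], [5]]
```

A single pattern cannot produce a variable number of numbered groups in `re`, which is why the second `findall` pass is used here; the `captures` method of the third-party `regex` module mentioned in the comment above is the alternative if every repetition of one group is needed.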
