Regex to extract multiple numbers with decimal

Question

I have the following group of numbers:

SalesCost% Margin
2,836,433.182,201,355.6422.39

Expected Result:

I want to separate this and extract the numbers such that I get the result as shown below:

2,836,433.18
2,201,355.64
22.39

Attempt

I tried the regex (\d+)(?:\.(\d{1,2}))?, but it only extracts the number up to the first decimal part, i.e. I only get 2,836,433.18.

Question

Is there a way to extract the numbers using regex (or, alternatively, some other way in Python) to get the results shown above?

score 2 · Accepted Answer · answered Dec 14 '21 at 19:18

You can use

re.findall(r'\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?', text)
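re.findall(r'(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{1,2})?', text)

Details:

The first pattern matches one to three digits, then any number of comma-separated three-digit groups, then an optional fractional part of one or two digits. The (?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{1,2})? variation also supports numbers like 123456.12, i.e. integer parts without a digit grouping symbol. Its grouped alternative requires at least one ,ddd group (+ rather than *); with * it would match just 123 and never fall through to \d+.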
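As a quick check, the patterns can be run against the sample row from the question. This is only a minimal sketch: the variable names are illustrative, and the second pattern here requires at least one ,ddd group in its grouped alternative so that an ungrouped integer part such as 123456 falls through to \d+.

```python
import re

# Sample row from the question, with the three numbers fused together.
text = "2,836,433.182,201,355.6422.39"

# Comma-grouped integer part, optional fractional part of 1-2 digits.
grouped = r'\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?'
grouped_result = re.findall(grouped, text)
print(grouped_result)  # ['2,836,433.18', '2,201,355.64', '22.39']

# Variant (illustrative) that also accepts integer parts without commas.
flexible = r'(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{1,2})?'
flexible_result = re.findall(flexible, text)
plain_result = re.findall(flexible, "123456.12")
print(flexible_result)  # ['2,836,433.18', '2,201,355.64', '22.39']
print(plain_result)     # ['123456.12']
```

The split works on this input because the fractional part is capped at two digits, so each match stops right where the next number begins.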

Wiktor Stribiżew

Dear Wiktor, may I ask what the purpose of using a non-capturing group is here? – Anoushiravan R Dec 15 '21 at 21:20

@AnoushiravanR If a regex pattern contains capturing groups, `re.findall` returns the captured group contents (a list of strings for one group, a list of tuples for several) rather than the whole matches. Non-capturing groups are a common way to avoid that. See [re.findall behaves weird](https://stackoverflow.com/a/31915134/3832970). You may also be interested in [R's equivalent of Python's re.findall](https://stackoverflow.com/a/43401685/3832970). – Wiktor Stribiżew Dec 15 '21 at 22:08
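To illustrate the point about `re.findall`, here is a small sketch comparing the answer's first pattern with a copy in which each `(?:` is changed to a plain `(`. The variable names are only for illustration.

```python
import re

text = "2,836,433.182,201,355.6422.39"

# Capturing groups: findall returns a tuple of group contents per match.
# A repeated group keeps only its last iteration; an unmatched group is ''.
capturing = re.findall(r'\d{1,3}(,\d{3})*(\.\d{1,2})?', text)
print(capturing)  # [(',433', '.18'), (',355', '.64'), ('', '.39')]

# Non-capturing groups: findall returns each whole match as a string.
non_capturing = re.findall(r'\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?', text)
print(non_capturing)  # ['2,836,433.18', '2,201,355.64', '22.39']
```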
