regex finding number after first occurrence of substring

Question

I have a sentence:

"Fourth-quarter 2021 net earnings per share (EPS) of $1.26, compared with 2020 EPS of $1.01; Fourth-quarter 2021 adjusted EPS of $1.11, down 25.5 percent compared with 2020 adjusted EPS of $1.49"

and would like to get the number $1.11 that follows the first occurrence of the substring "adjusted EPS".

The best regex I could come up with is:

re.search("^.*Adjusted EPS.*?(\$\d+.\d+).*", text,re.IGNORECASE).group(1)

but this gives me the number $1.49, which follows the second occurrence of "adjusted EPS".

How can I modify the search so I get the number $1.11?

score 0 · Answer 1 · answered Mar 28 '22 at 17:07

This regex should work: /adjusted EPS of ?(\$\d+\.\d+)/g

Input:

Fourth-quarter 2021 net earnings per share (EPS) of $1.26, compared with 2020 
EPS of $1.01; Fourth-quarter 2021 adjusted EPS of $1.11, down 25.5 percent 
compared with 2020 adjusted EPS of $1.49

Output: adjusted EPS of $1.11, adjusted EPS of $1.49

Edit: Remove the g at the end of the Regex string to only find one match.
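
Since the question uses Python rather than the /.../g literal syntax, here is a minimal sketch of how the same idea might look with the re module. The variable names are illustrative, and the dot is escaped so that it only matches a literal decimal point. re.search plays the role of the pattern without g, and re.findall the role of the pattern with g.

```python
import re

text = (
    "Fourth-quarter 2021 net earnings per share (EPS) of $1.26, compared with 2020 "
    "EPS of $1.01; Fourth-quarter 2021 adjusted EPS of $1.11, down 25.5 percent "
    "compared with 2020 adjusted EPS of $1.49"
)

pattern = re.compile(r"adjusted EPS of ?(\$\d+\.\d+)", re.IGNORECASE)

# search() stops at the first match, like the pattern without the g flag.
first = pattern.search(text).group(1)

# findall() returns the captured group of every match, like the pattern with g.
all_matches = pattern.findall(text)

print(first)        # $1.11
print(all_matches)  # ['$1.11', '$1.49']
```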

score 0 · Answer 2 · answered Mar 28 '22 at 17:18

You could use this pattern which looks for "adjusted EPS" and only allows one "$" between it and the end of the line.

/adjusted EPS[^\$]+(\$\d+\.\d+)[^\$]+$/gm

The pattern without the delimiters and flags is:

adjusted EPS[^\$]+(\$\d+\.\d+)[^\$]+$

score -1 · Accepted Answer · answered Mar 28 '22 at 17:09

The problem is the greedy quantifier at the very start of your pattern:

^.*Adj ...

^ anchors the match at the start of the string. Because .* is greedy, it "eats" as many characters as possible, so the match for "adjusted EPS" is found at its last occurrence rather than its first.

There are two solutions: either make it non-greedy (i.e. lazy) with ^.*?Adj ..., or remove ^.* completely, since it serves no purpose here.
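
As a rough illustration, the three variants can be compared side by side on the sentence from the question. This is only a sketch: the variable names are made up, raw strings are used for the patterns, and the decimal point is escaped as \. (a small tightening of the original pattern that does not change the result here).

```python
import re

text = (
    "Fourth-quarter 2021 net earnings per share (EPS) of $1.26, compared with 2020 "
    "EPS of $1.01; Fourth-quarter 2021 adjusted EPS of $1.11, down 25.5 percent "
    "compared with 2020 adjusted EPS of $1.49"
)

# Original: the greedy ^.* consumes everything, then backtracks only as far
# as needed, which lands on the LAST "adjusted EPS".
greedy = re.search(r"^.*Adjusted EPS.*?(\$\d+\.\d+).*", text, re.IGNORECASE).group(1)

# Fix 1: the lazy ^.*? expands one character at a time, so it stops at the
# FIRST "adjusted EPS".
lazy = re.search(r"^.*?Adjusted EPS.*?(\$\d+\.\d+)", text, re.IGNORECASE).group(1)

# Fix 2: drop ^.* entirely; re.search already scans from left to right.
bare = re.search(r"Adjusted EPS.*?(\$\d+\.\d+)", text, re.IGNORECASE).group(1)

print(greedy)  # $1.49
print(lazy)    # $1.11
print(bare)    # $1.11
```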

nicael

Note it's `^.*adj...`. Nor does `.*` at the end serve a purpose. Perhaps `\badjusted EPS\b.*?(\$\d+\.\d{2})`, the word boundaries to avoid matching, for example, `"readjusted EPS"` (probably not needed but does no harm). [Demo](https://regex101.com/r/8dDH5J/1) – Cary Swoveland Mar 28 '22 at 18:58
