Python: find a string between the first occurrence of two substrings

Question

I have a text like this:

text='gn="right" headers="gr-Y10 gr-eps i36">121.11<\\/td><\\/tr><tr class="hr"><td colspan="12"><\\/td><\\/tr><tr>'

I would like to extract the value 121.11 from it using a regex. So I did this:

import re
b=re.search('gr-Y10 gr-eps i36">(.*)<\\\\/td', text)
b.group(1)

and I got this as output:

'121.11<\\/td><\\/tr><tr class="hr"><td colspan="12">'

How can I get what I am really looking for, which is 121.11 instead of the line above?

Judging from the last question you asked, I suspect the input is HTML, correct? In that case you need an HTML parser; it is better not to parse it with regexes. – alecxe Mar 11 '15 at 05:45

score 8 · Accepted Answer · answered Mar 11 '15 at 05:43

gr-Y10 gr-eps i36">(.*?)<\\\\/td
                     ^^

Make your `*` non-greedy by appending `?`. A non-greedy quantifier stops at the first occurrence of `<\\\\/td`; the greedy version captures everything up to the last `<\\\\/td`.
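
As a quick sketch of the difference, the snippet below runs both patterns against the sample string from the question. The variable names `greedy` and `lazy` are only illustrative.

```python
import re

# Sample input from the question; the string contains literal "<\/td" sequences.
text = 'gn="right" headers="gr-Y10 gr-eps i36">121.11<\\/td><\\/tr><tr class="hr"><td colspan="12"><\\/td><\\/tr><tr>'

# Greedy: .* consumes as much as possible, then backtracks to the LAST "<\/td".
greedy = re.search('gr-Y10 gr-eps i36">(.*)<\\\\/td', text)

# Non-greedy: .*? expands only as far as the FIRST "<\/td".
lazy = re.search('gr-Y10 gr-eps i36">(.*?)<\\\\/td', text)

print(greedy.group(1))  # the over-long capture shown in the question
print(lazy.group(1))    # 121.11
```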

See demo.

https://regex101.com/r/iS6jF6/2#python

vks

Thanks, that worked. Also, is there anything I can use instead of \\\\ to match \\? – TJ1 Mar 11 '15 at 05:45

@TJ1 `gr-Y10 gr-eps i36">(.*?)<\\+/td` or use a raw string (`r'...'`) – vks Mar 11 '15 at 05:48

@TJ1 `print(re.findall(r'gr-Y10 gr-eps i36">(.*?)<\\+/td', x))` where x is your test string – vks Mar 11 '15 at 05:50
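
To illustrate the raw-string suggestion from these comments, here is a minimal sketch comparing the three spellings on a shortened version of the question's input. The names `plain`, `raw`, and `one_or_more` are made up for this example.

```python
import re

# Shortened sample; the actual string contains "<\/td" (one backslash).
text = 'headers="gr-Y10 gr-eps i36">121.11<\\/td><\\/tr>'

# Normal string: four backslashes in the source become two in the regex,
# which match one literal backslash.
plain = re.findall('gr-Y10 gr-eps i36">(.*?)<\\\\/td', text)

# Raw string: the two backslashes reach the regex engine unchanged.
raw = re.findall(r'gr-Y10 gr-eps i36">(.*?)<\\/td', text)

# \\+ in a raw string matches one or more literal backslashes.
one_or_more = re.findall(r'gr-Y10 gr-eps i36">(.*?)<\\+/td', text)

print(plain, raw, one_or_more)  # ['121.11'] ['121.11'] ['121.11']
```

Note that the `\\+` form only behaves this way inside a raw string; in a normal string literal it would reach the regex engine as `\+`, which matches a literal plus sign.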

score 5 · Answer 2 · edited May 23 '17 at 12:28

Knowing the source of the input data and taking into account it is HTML, here is a solution involving an HTML Parser, BeautifulSoup:

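import re
from bs4 import BeautifulSoup

# input_data holds the HTML source of the page
soup = BeautifulSoup(input_data, 'html.parser')

for row in soup.select('div#tab-growth table tr'):
    for td in row.find_all('td', headers=re.compile(r'gr-eps')):
        print(td.text)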

Basically, for every row in the "growth" table, we are finding the cells with gr-eps in headers ("EPS %" part of the table). It prints:

60.00
—
—
—
—
42.22
3.13
—
—
—
-498.46
...

This is a good read also.
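
If installing BeautifulSoup is not an option, the same idea can be sketched with the standard library's `html.parser` instead. This is a swapped-in technique, not the code from this answer, and it assumes the markup has already been unescaped into plain HTML. The `EpsCellParser` class, the sample markup, and the `gr-Y9`, `gr-rev`, and `i37` header values are hypothetical.

```python
from html.parser import HTMLParser


class EpsCellParser(HTMLParser):
    """Collect the text of <td> cells whose headers attribute lists gr-eps."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        headers = dict(attrs).get("headers") or ""
        if tag == "td" and "gr-eps" in headers.split():
            self._in_cell = True
            self.values.append("")

    def handle_data(self, data):
        if self._in_cell:
            self.values[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False


# Hypothetical, simplified markup standing in for the real page.
html = (
    '<table><tr>'
    '<td align="right" headers="gr-Y9 gr-eps i36">42.22</td>'
    '<td align="right" headers="gr-Y10 gr-eps i36">121.11</td>'
    '<td align="right" headers="gr-Y10 gr-rev i37">7.50</td>'
    '</tr></table>'
)

parser = EpsCellParser()
parser.feed(html)
parser.close()
print(parser.values)  # ['42.22', '121.11']
```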

answered Mar 11 '15 at 05:54

alecxe

@TJ1 thanks for getting me closer to the regex silver badge through providing non-regex answers to regex topics :) – alecxe Mar 11 '15 at 06:15
Hahaha! Seems like somebody has exploited a loophole :P – vks Mar 11 '15 at 12:02

@vks exactly, completely unintentionally :) – alecxe Mar 11 '15 at 13:50
