I need to grab the numbers inside tags that look like this:

<div class="gridbarvalue color_blue">79</div> 

and

<div class="gridbarvalue color_red">79</div> 

Is there a way to do something like findAll('div', text=re.compile('<>')) that would find the tags whose class is gridbarvalue color_<red or blue>?

I'm using beautifulsoup.

Also sorry if I'm not making my question clear, I'm pretty inexperienced with this.

nhahtdh

2 Answers

class is a reserved word in Python, so BeautifulSoup expects a trailing underscore (class_) when you pass it as a keyword argument:

>>> soup.find_all('div', class_=re.compile(r'color_(?:red|blue)'))
[<div class="gridbarvalue color_blue">79</div>, <div class="gridbarvalue color_red">79</div>]

To also match on the text, use:

>>> soup.find_all('div', class_=re.compile(r'color_(?:red|blue)'), text='79')
[<div class="gridbarvalue color_blue">79</div>, <div class="gridbarvalue color_red">79</div>]
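
To get the numbers themselves rather than the tags, the same search can be followed by a text extraction step. This is only a minimal sketch: it assumes BeautifulSoup 4 is installed as bs4, and the sample markup (including the values 42 and 13 and the color_green div) is made up for illustration.

```python
import re
from bs4 import BeautifulSoup

# Sample markup modeled on the question; the extra values and the
# color_green div are illustrative assumptions, not from the original page.
html = """
<div class="gridbarvalue color_blue">79</div>
<div class="gridbarvalue color_red">42</div>
<div class="gridbarvalue color_green">13</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ is checked against each individual class on the tag, so the
# pattern only has to describe the color_red / color_blue part.
divs = soup.find_all("div", class_=re.compile(r"color_(?:red|blue)"))

# get_text(strip=True) returns the tag's text without surrounding whitespace.
numbers = [int(div.get_text(strip=True)) for div in divs]
print(numbers)
```

Converting with int() assumes every matched div contains only digits; if that might not hold, keep the values as strings instead.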
Peter Gibson
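import re

# Match any element whose class contains color_blue or color_red
elems = soup.findAll(attrs={'class': re.compile(r"color_(blue|red)")})
for e in elems:
    # Pull the digits between the opening and closing tags
    m = re.search(r">(\d+)<", str(e))
    if m:
        print("The number is %s" % m.group(1))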
Andrew Luo