I'm struggling to match expressions with the re module while crawling websites. I've crawled several websites with Python and noticed that re.findall only returns results for some tags (for example, spans identified by a class attribute). Is there any way to return the string inside a tag like the one below (a stock price from cnn.com)? When I tried, I only got an empty list.
<span stream="last_36276" streamformat="ToHundredth" streamfeed="SunGard">109.95</span>
Here's my code for crawling CNN Money for Apple's stock price, using Python 3.5.1.
Any help is really appreciated:
import urllib.request
import re

# Fetch the quote page; read() returns bytes, so the pattern below is a bytes pattern too
with urllib.request.urlopen("http://money.cnn.com/quote/quote.html?symb=AAPL") as url:
    s = url.read()

# Capture the text inside the stock-price span
pattern = re.compile(b'<span stream="last_205778" streamformat="ToHundredth" streamfeed="SunGard">(.+?)</span>')
price = pattern.findall(s)
print(price)
# Searching for the first two expressions works, but the last one returns an empty list
#<span title="2010-10-19 14:59:01Z" class="relativetime">Oct 19 10 at 14:59</span>
#<span itemprop="upvoteCount" class="vote-count-post ">45</span>
#<span stream="last_205778" streamformat="ToHundredth" streamfeed="SunGard">60.64</span>
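
One way to narrow this down might be to run the same kind of pattern against a hard-coded copy of the tag, so the network fetch is out of the picture. The sketch below is only an illustration, not a confirmed fix: `sample`, `STREAM_SPAN` and `extract_stream_values` are hypothetical names, and the pattern accepts any `last_<digits>` stream id instead of one fixed number, on the assumption that the id can vary (the two snippets above already show `last_36276` and `last_205778`).

```python
import re

# Hypothetical sample: a hard-coded copy of the tag from the question,
# used here instead of the live page.
sample = b'<span stream="last_36276" streamformat="ToHundredth" streamfeed="SunGard">109.95</span>'

# Match any numeric stream id rather than one fixed value (assumption:
# the id is not guaranteed to be the same in every response).
STREAM_SPAN = re.compile(
    rb'<span stream="last_\d+" streamformat="ToHundredth" streamfeed="SunGard">(.+?)</span>'
)


def extract_stream_values(html):
    """Return the text of every matching span in `html` (bytes) as a list of str."""
    return [match.decode("utf-8") for match in STREAM_SPAN.findall(html)]


print(extract_stream_values(sample))  # prints ['109.95']
```

If this finds the value in the sample but the live page still gives an empty list, the tag is probably not present in that exact form in the bytes returned by `url.read()`, which printing `s` and searching it for `streamformat` should confirm.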