I am trying to create a list of the top 10 news articles from the BBC's "most read" section. The code I have so far is below:
from bs4 import BeautifulSoup, SoupStrainer
import urllib2
import re
opener = urllib2.build_opener()
url = 'http://www.bbc.co.uk/news/popular/read'
soup = BeautifulSoup(opener.open(url), "lxml")
titleTag = soup.html.head.title
print(titleTag.string)
# Collect every <span> on the page and print its class attribute
tagSpan = soup.find_all("span")
for tag in tagSpan:
    print(tag.get("class"))
What I am looking for is the text between <span class="most-popular-page-list-item__headline"> and </span>.
How do I extract that text from each matching span and collect the results into a list?
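
A minimal sketch of one way this could work, assuming the headline spans really do carry the class named above. To keep it self-contained, it parses a small hypothetical HTML string (`SAMPLE_HTML`, invented for illustration) instead of fetching the live page, and it uses the built-in "html.parser" rather than "lxml". The helper name `extract_headlines` is also just an assumption for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the fetched page. The real page
# structure may differ; only the headline class name comes from the question.
SAMPLE_HTML = """
<html><head><title>Most read</title></head><body>
<ol>
  <li><span class="most-popular-page-list-item__rank">1</span>
      <span class="most-popular-page-list-item__headline">First headline</span></li>
  <li><span class="most-popular-page-list-item__rank">2</span>
      <span class="most-popular-page-list-item__headline">Second headline</span></li>
</ol>
</body></html>
"""

HEADLINE_CLASS = "most-popular-page-list-item__headline"


def extract_headlines(markup, limit=10):
    """Return the text of up to `limit` headline spans as a list of strings."""
    soup = BeautifulSoup(markup, "html.parser")
    # class_ filters on the CSS class; limit stops after that many matches
    spans = soup.find_all("span", class_=HEADLINE_CLASS, limit=limit)
    # get_text(strip=True) returns the span's text without surrounding whitespace
    return [span.get_text(strip=True) for span in spans]


headlines = extract_headlines(SAMPLE_HTML)
print(headlines)
```

With the code from the question, the same function could presumably be given the response from opener.open(url) in place of `SAMPLE_HTML`, since BeautifulSoup accepts either a string or an open file-like object.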