I'm trying to get BeautifulSoup to capture a list of all the location names through scraping. I used to use the following:
locs = LOOPED.findAll("td", {"class": "max use"})
This used to work for the HTML:
<td class="max use" style="">London</td>
However, the HTML has changed to the following, and it's no longer returning London:
<td class="max use" style="">
<div class="notranslate">
<span><a data-title="View Location" href="/location/uk/gb/london/">London</a></span> <span class="extra hidden">(DEFAULT)</span>
</div>
</td>
Edit: If I print locs, I get a list like:
[<td class="max use" style="">\n<div class="notranslate">\n<span><a data-title="View Location" href="/location/uk/gb/london/">London</a></span> <span class="extra hidden">(DEFAULT)</span>\n</div>\n</td>, <td class="max use" style="">\n<div class="notranslate">\n<span><a data-title="View Location" href="/location/uk/gb/manchester/">Manchester</a></span> <span class="extra hidden">(DEFAULT)</span>\n</div>\n</td>, <td class="max use" style="">\n<div class="notranslate">\n<span><a data-title="View Location" href="/location/uk/gb/liverpool/">Liverpool</a></span> <span class="extra hidden">(NA)</span>\n</div>\n</td>]
As you can see, this contains 3 different locations. From the above, I would expect to get a list of [London, Manchester, Liverpool].
I thought that I should be using something like:
locs = LOOPED.findAll("td", {"class": "max use"})
locs = locs.findAll('a')[1]
print locs.text
But this only returns:
AttributeError: 'ResultSet' object has no attribute 'findAll'
I can't work out how to get BeautifulSoup to search again inside each matched td for the text of the a (hyperlink) tag...
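The error arises because findAll returns a ResultSet, which behaves like a list of tags rather than a single tag, so a further search has to run on each element in turn. Below is a minimal sketch of that idea, not a definitive fix. It assumes the bs4 package under Python 3 (where find_all is the current spelling of findAll), and the names html, soup, cells and names are made up for illustration; soup stands in for LOOPED. The sample markup is copied from the printed output above and wrapped in a table so it parses as a standalone fragment.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question; the <table>/<tr> wrapper is an
# assumption added only so the fragment stands alone.
html = """
<table><tr>
<td class="max use" style="">
<div class="notranslate">
<span><a data-title="View Location" href="/location/uk/gb/london/">London</a></span> <span class="extra hidden">(DEFAULT)</span>
</div>
</td>
<td class="max use" style="">
<div class="notranslate">
<span><a data-title="View Location" href="/location/uk/gb/manchester/">Manchester</a></span> <span class="extra hidden">(DEFAULT)</span>
</div>
</td>
<td class="max use" style="">
<div class="notranslate">
<span><a data-title="View Location" href="/location/uk/gb/liverpool/">Liverpool</a></span> <span class="extra hidden">(NA)</span>
</div>
</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")  # stand-in for LOOPED

# find_all returns a ResultSet (list-like), so search inside each cell
# individually instead of calling find_all on the ResultSet itself.
cells = soup.find_all("td", {"class": "max use"})

names = []
for cell in cells:
    link = cell.find("a")  # first <a> inside this cell, or None
    if link is not None:
        names.append(link.get_text(strip=True))

print(names)  # ['London', 'Manchester', 'Liverpool']
```

Each cell here holds only one a tag, so indexing the per-cell result with [1] as in the attempt above would fail even once the loop is in place; find("a") (or index 0) picks up the single link.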