I am scraping a website using Python and BeautifulSoup. I was able to find all the td elements on the page with the command:
data = soup.find_all('td')
Then I select the first individual td that I need:
td = data[19]
If I print this td the output is:
<td data-geoid="0617568" data-isnumeric="1" data-srcnote="true" data-value="18.8">
<span data-title="Culver City city, California"></span><div class="qf-sourcenote">
<span></span><a title="Source: 2018 American Community Survey (ACS), 5-year estimates. Estimates are not comparable to other geographic levels due to methodology differences that may exist between different data sources."></a>
</div>18.8%</td>
Now I want to extract the text that sits between the end of the div and the end of the td, that is, the 18.8%. Following this post, I tried to extract it with the following code:
m = re.search('</div>(.+?)</td>', td)
This gives me the following error:
Traceback (most recent call last):
  File "/Users/Alfie/PycharmProjects/474scrape/srape.py", line 18, in <module>
    m = re.search('</div>(.+?)</td>', td)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/re.py", line 183, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object
I think the problem is with escape characters or something similar in the markers I am using. Any help is appreciated.
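
For reference, the TypeError suggests that re.search received something other than a string: each item returned by find_all is a BeautifulSoup Tag object, not a str, so the markers themselves may not be the issue. Below is a minimal sketch under that assumption, using only the standard library. The html string stands in for what str(td) would produce (with the long title attribute shortened), and text_after_div is a hypothetical helper name, not part of any library.

```python
import re

# Stand-in for str(td): the markup printed above as a plain string.
# The title attribute is shortened here for readability (assumption).
html = (
    '<td data-geoid="0617568" data-isnumeric="1" data-srcnote="true" data-value="18.8">\n'
    '<span data-title="Culver City city, California"></span><div class="qf-sourcenote">\n'
    '<span></span><a title="Source: 2018 American Community Survey (ACS), 5-year estimates."></a>\n'
    '</div>18.8%</td>'
)


def text_after_div(markup):
    """Return the text between the closing </div> and </td>, or None if absent.

    Hypothetical helper for illustration; it expects a str, which is why a
    Tag object would need to be converted with str() before being passed in.
    """
    match = re.search(r'</div>(.+?)</td>', markup)
    return match.group(1).strip() if match else None


value = text_after_div(html)
print(value)
```

In the original script this would correspond to calling re.search on str(td) instead of td. If a regex is not required, BeautifulSoup's own accessors, such as td.get_text(strip=True) or td['data-value'], may be simpler; for the td shown above, the latter would give the numeric string without the percent sign.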