In a new Python script I've started, BeautifulSoup is suddenly unable to find any of my tags by their content. I have been using BeautifulSoup for about a year now and have never encountered this problem.
I can successfully load a JSON payload in Python with ".json()", pass it to BeautifulSoup using html.parser, and it works every time.
I am now trying to read a MySQL field that contains raw HTML, feed it into Python as a text string, and parse and manipulate it with BeautifulSoup, without any success.
I have pared this down to simply loading a text string, as in this example, with the same negative result: I cannot find a tag by searching for its text (BeautifulSoup always returns "None").
from bs4 import BeautifulSoup
text_field = '<td><p></p><p></p><td><p>HELP text here 1<a href="some_URL_here"><ac:image ac:align="center" ac:layout="center" ac:original-height="153" ac:original-width="200"><ri:attachment ri:filename="image.png" ri:version-at-save="1"></ri:attachment></ac:image></a></p></td><p /><h2 style="text-align: center;"><a href="{some_URL_here}"><em><strong>Click here…</strong></em></a></h2></td>'
soup = BeautifulSoup(text_field, 'html.parser')
print(soup)
print(soup.prettify())
# Search for a <td> by its text; this is the call that prints None
test = soup.find('td', text="HELP")
print(test)
The output from "prettify" shows that BeautifulSoup parses the string properly:
<td>
<p>
</p>
<p>
</p>
<td>
<p>
HELP text here 1
<a href="some_URL_here">
<ac:image ac:align="center" ac:layout="center" ac:original-height="153" ac:original-width="200">
<ri:attachment ri:filename="image.png" ri:version-at-save="1">
</ri:attachment>
</ac:image>
</a>
</p>
</td>
<p>
</p>
<h2 style="text-align: center;">
<a href="{some_URL_here}">
<em>
<strong>
Click here…
</strong>
</em>
</a>
</h2>
</td>
But no matter what I try, BeautifulSoup is ALWAYS returning "None" from any find request.
Am I missing something obvious here?
In this example I am searching for "HELP text here 1...", but I've tried with many other tags and values.
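For anyone reproducing this, here is a minimal sketch of what I suspect is happening. As far as I understand it, when a tag name is combined with a text/string argument, BeautifulSoup compares the argument against the tag's ".string", and ".string" is None for a tag that holds more than one child. The shortened HTML and the variable names below are my own assumptions for illustration, not the real MySQL data. Searching for the text node itself with a regular expression and then walking up to the enclosing tag seems to avoid the problem.

```python
import re
from bs4 import BeautifulSoup

# Shortened stand-in for the string above (assumption: same nesting, fewer attributes)
html = ('<td><p></p><td><p>HELP text here 1<a href="some_URL_here"></a></p></td>'
        '<h2><a href="{some_URL_here}">Click here</a></h2></td>')
soup = BeautifulSoup(html, "html.parser")

# Tag name plus string: matched against each <td>'s .string, which is None
# here because the text sits next to other tags rather than alone in the <td>.
exact = soup.find("td", string="HELP")
print(exact)

# Find the text node by partial match, then climb to the nearest enclosing <td>.
node = soup.find(string=re.compile("HELP"))
td = node.find_parent("td")
print(repr(node))
print(td.name)
```

Note that "string=" is the newer name for the "text=" argument; both appear to behave the same way in this respect.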