I've run into a problem. It is probably very easy, but I couldn't find it in the documentation.
Here is the target HTML structure, which is very simple:
<h3>Top
<em>Mid</em>
<span>Down</span>
</h3>
I want to get only the "Top" text that sits directly inside the h3 tag, so I wrote this:
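from bs4 import BeautifulSoup
html = "<h3>Top <em>Mid </em><span>Down</span></h3>"
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, "html.parser")
# .text concatenates the text of the tag and all of its descendants
print(soup.select("h3")[0].text)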
But this returns "Top Mid Down". How do I modify it so that it returns only "Top"?
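
One possible approach, offered as a minimal sketch rather than the definitive way: keep only the h3 tag's direct children that are plain strings and skip nested tags such as em and span. The helper name `direct_text` is hypothetical and not part of bs4. The sketch assumes the built-in "html.parser" backend and that joining and stripping the direct strings is the desired result.

```python
from bs4 import BeautifulSoup, NavigableString


def direct_text(tag):
    """Return only the text that is a direct child of `tag` (hypothetical helper)."""
    # .children yields direct children only; nested tags are Tag objects,
    # while bare text nodes are NavigableString instances.
    pieces = [child for child in tag.children if isinstance(child, NavigableString)]
    return "".join(pieces).strip()


html = "<h3>Top <em>Mid </em><span>Down</span></h3>"
soup = BeautifulSoup(html, "html.parser")
top_text = direct_text(soup.select("h3")[0])
print(top_text)
```

Note that if the h3 contained text both before and after the nested tags, this sketch would join all of those direct pieces together. To get only the first one, `tag.find(string=True, recursive=False)` may be a better fit (older bs4 releases spell that argument `text=True`).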