How to get HTML from a Beautiful Soup object

Question

I have the following bs4 object listing:

>>> listing
<div class="listingHeader">
<h2>
....


>>> type(listing)
<class 'bs4.element.Tag'>

I want to extract the raw HTML as a string. I've tried:

>>> a = listing.contents
>>> type(a)
<type 'list'>

So this does not work. How can I do this?

score 174 · Accepted Answer · answered Sep 08 '14 at 17:16

html_content = str(listing)

This gives the tag's markup, including its children, as a non-prettified string.

If you want a prettified version, use the prettify() method:

html_content = listing.prettify()
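
For illustration, here is a minimal sketch of both calls. It assumes the bs4 package is installed and uses a made-up snippet in place of the asker's page; the h2 text and the variable names are hypothetical. It also shows decode_contents(), which returns only the inner markup, and how that relates to the .contents list from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page the question was scraping.
html = '<div class="listingHeader"><h2>Title</h2></div>'
soup = BeautifulSoup(html, "html.parser")
listing = soup.find("div", class_="listingHeader")

# Outer HTML: the tag itself plus everything inside it.
outer_html = str(listing)

# Inner HTML: the children only, without the enclosing <div>.
inner_html = listing.decode_contents()

# .contents is a list of child nodes; joining their string forms
# gives the same inner markup.
joined_contents = "".join(str(child) for child in listing.contents)

# Indented, one node per line.
pretty_html = listing.prettify()

print(outer_html)
print(inner_html)
print(pretty_html)
```

On Python 3, str() already returns a Unicode string, so no separate conversion step is needed.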

alecxe

1

Is there a way to turn it into a unicode string? I'm getting an error: "WebDriverException: Message: u'missing ; before statement'" – user1592380 Sep 08 '14 at 17:34

4

I was struggling with special characters like the umlauts ä, ö, ü. One might want to use `soup.prettify(formatter="html")`; compare https://www.crummy.com/software/BeautifulSoup/bs4/doc/#output-formatters – BadAtLaTeX Sep 14 '18 at 14:31

I am getting \n\t\r when I typecast a tag object to str. – raviraj Oct 11 '21 at 13:09
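
The two comments about special characters and stray whitespace can be illustrated with a small sketch. It assumes bs4 is installed; the sample paragraph is invented for the example. The formatter="html" argument is the one from the linked output-formatters documentation, passed here to decode(), which accepts the same formatter names as prettify(). Whitespace that appears in the str() result comes from the source markup itself, since text nodes are kept as they were parsed.

```python
from bs4 import BeautifulSoup

# Invented sample: umlauts plus leading and trailing whitespace in the text node.
soup = BeautifulSoup("<p>\n\tGrüße aus Köln\n</p>", "html.parser")
p = soup.p

# Default output keeps the characters and the whitespace as parsed.
default_html = str(p)

# formatter="html" replaces characters with named HTML entities where one exists.
entity_html = p.decode(formatter="html")

# If only the text is wanted, get_text(strip=True) trims the surrounding whitespace.
text_only = p.get_text(strip=True)

print(repr(default_html))
print(repr(entity_html))
print(repr(text_only))
```

Whether the entity form or the plain Unicode form is preferable depends on what consumes the string afterwards.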
