95

I have the following bs4 object, listing:

>>> listing
<div class="listingHeader">
<h2>
....


>>> type(listing)
<class 'bs4.element.Tag'>

I want to extract the raw HTML as a string. I've tried:

>>> a = listing.contents
>>> type(a)
<type 'list'>

This gives me a list rather than a string, so it does not work. How can I get the HTML as a string?

alecxe
user1592380

1 Answer

174

Just get the string representation:

html_content = str(listing)

This gives the non-prettified markup, with no added indentation.

If you want a prettified version, use the prettify() method:

html_content = listing.prettify()
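
As a minimal sketch of the two calls side by side (the sample markup below is made up for illustration, and the html.parser backend is an assumption; the original listing was not shown in full), this also includes decode_contents(), which returns only the inner HTML without the enclosing tag:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
html = '<div class="listingHeader"><h2>Title</h2><p>Body</p></div>'
soup = BeautifulSoup(html, "html.parser")
listing = soup.find("div", class_="listingHeader")

# Outer HTML: the tag itself plus everything inside it.
outer_html = str(listing)

# Inner HTML: only the children, serialized as one string
# (unlike listing.contents, which is a list of nodes).
inner_html = listing.decode_contents()

# Indented, one-node-per-line version of the outer HTML.
pretty_html = listing.prettify()

print(outer_html)
print(inner_html)
print(pretty_html)
```

All three results are plain str objects, so they can be written to a file or passed to other code directly.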
alecxe
  • 1
    Is there a way to turn it into a unicode string? I'm getting an error: "WebDriverException: Message: u'missing ; before statement' " – user1592380 Sep 08 '14 at 17:34
  • 4
    I was struggling with special characters like umlaut ä,ö,ü. One might want to use `soup.prettify( formatter="html" )` - compare https://www.crummy.com/software/BeautifulSoup/bs4/doc/#output-formatters – BadAtLaTeX Sep 14 '18 at 14:31
  • I am getting \n\t\r when I cast the tag object to str. – raviraj Oct 11 '21 at 13:09
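
The two comments about special characters and stray whitespace can be illustrated with a short sketch. The sample strings are invented, and collapsing whitespace with split/join is just one possible approach (it is an assumption that this is acceptable for your markup, since it also changes whitespace inside elements such as pre):

```python
from bs4 import BeautifulSoup

# Special characters: the default output keeps them as Unicode,
# while formatter="html" converts them to named HTML entities.
soup = BeautifulSoup("<p>ä ö ü</p>", "html.parser")
tag = soup.p
as_unicode = str(tag)
as_entities = tag.decode(formatter="html")

# Whitespace: str() reproduces newlines and tabs that were in the source.
# Splitting on whitespace and re-joining collapses each run to one space.
soup2 = BeautifulSoup("<p>\n\tHello\n</p>", "html.parser")
raw = str(soup2.p)
collapsed = " ".join(raw.split())

print(as_unicode)
print(as_entities)
print(repr(raw))
print(collapsed)
```

The same formatter argument is accepted by prettify(), as the comment above notes.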