I am trying to insert a comment into HTML using Beautiful Soup. I want to insert it just before the closing </head> tag, and I am trying something like this:

soup.head.insert(-1,"<!-- #mycomment -->")

It inserts before </head>, but the value gets entity-encoded as &lt;!-- #mycomment --&gt;. The Beautiful Soup documentation covers inserting a tag, but how do I insert a comment as-is?

alecxe
DevC

1 Answer

Instantiate a Comment object and pass it to insert().

Demo:

from bs4 import BeautifulSoup, Comment


data = """<html>
<head>
    <test1/>
    <test2/>
</head>
<body>
    test
</body>
</html>"""

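# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(data, "html.parser")
# A Comment node is serialized with <!-- and --> added around its text
comment = Comment(' #mycomment ')
# Position -1 places the node before the last child of <head>
soup.head.insert(-1, comment)

print(soup.prettify())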

prints:

<html>
 <head>
  <test1>
  </test1>
  <test2>
  </test2>
  <!-- #mycomment -->
 </head>
 <body>
  test
 </body>
</html>
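
To see why the plain string was escaped while the Comment was not, here is a minimal side-by-side sketch. It assumes bs4 with the built-in "html.parser"; the sample markup and the variable names are made up for illustration, and append() is used instead of insert(-1, ...) only to keep the example short.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical sample document, not taken from the question
html = "<html><head><title>Demo</title></head><body></body></html>"

# Case 1: a plain str becomes a text node, so "<" and ">" are escaped on output
soup_text = BeautifulSoup(html, "html.parser")
soup_text.head.append("<!-- #mycomment -->")
as_text = str(soup_text.head)

# Case 2: a Comment node is written out with the comment delimiters around it
soup_comment = BeautifulSoup(html, "html.parser")
soup_comment.head.append(Comment(" #mycomment "))
as_comment = str(soup_comment.head)

print(as_text)
print(as_comment)
```

Note that the Comment is built from the comment text only; the delimiters are added on output, so they should not be part of the string passed in.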
alecxe
  • Perfect, thanks. One quick question: is there any way to insert something before a tag's start, like .. { "insert here"..... } – DevC Mar 14 '14 at 13:02
  • @DevC try to insert it into the parent: `soup.head.parent.insert(0, comment)`. – alecxe Mar 14 '14 at 13:16
  • Thanks, but soup.prettify() removes closures like . It removes the "/>", similarly for – DevC Mar 14 '14 at 13:18
  • Gotcha, see http://stackoverflow.com/questions/14961497/how-to-get-beautifulsoup-4-to-respect-a-self-closing-tag. – alecxe Mar 14 '14 at 13:20
  • OK, this worked: selfClosingTags=['link','meta'], but I have so many files to deal with. Let's see. Thanks, by the way. – DevC Mar 14 '14 at 13:27