1

I have some code to remove the content of the head tag. soup is the parsed HTML of a website:

    for link in soup.findAll('head'):
        link.replaceWith("")

I am trying to replace the entire content with "", but this is not working. How can I remove everything between the head tags from soup completely?

jwarner112
user2878953

2 Answers

1

Try this:

    # extract() detaches each head element from the tree
    for head in soup.findAll('head'):
        head.extract()

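
For a self-contained check of this approach, here is a minimal sketch. It assumes Beautiful Soup 4 (the `bs4` package) with the built-in `html.parser`; the sample HTML string is made up for illustration. It uses `decompose()`, which removes the element and discards it, where `extract()` removes it and returns it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for illustration
html = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# find_all returns a list of matches up front, so removing
# elements inside the loop does not disturb the iteration
for head in soup.find_all("head"):
    head.decompose()

print(soup)
```

If the removed element is still needed afterwards, `extract()` is the better fit, since it hands the detached tag back to the caller.
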
sdanzig
0

You need to use """ (3 quotes), where you appear to be using only two.

Example:

"""
This block
is commented out
"""

Happy coding!

EDIT: This is not what the user was asking; my apologies.

I'm not experienced with Beautiful Soup, but I found a snippet of code on SO that might work for you (source):

    from bs4 import BeautifulSoup

    # source is the page's HTML as a string
    soup = BeautifulSoup(source, 'html.parser')
    # Change the tag name to pick what gets removed: 'a' for links, 'head' for the head element
    to_extract = soup.findAll('a')
    for item in to_extract:
        item.extract()

As written, it removes every link on your page, though; pass 'head' instead of 'a' to remove the head element.
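
If the goal is to empty the head but keep the tag itself, the following sketch may be closer to what was asked. It assumes Beautiful Soup 4 (`bs4`) and its `Tag.clear()` method; the sample HTML is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for illustration
html = (
    "<html><head><title>Demo</title><meta charset='utf-8'></head>"
    "<body><p>Hello</p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# clear() removes everything inside each matching tag
# but leaves the (now empty) tag in the tree
for head in soup.find_all("head"):
    head.clear()

print(soup.head)
```

The difference from `extract()` is that the `<head>` element stays in the document, which can matter if later code expects it to exist.
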

I'm sorry if this doesn't help you more!

Community
jwarner112