57

I want to delete a specific div from a soup object.
I am using Python 2.7 and bs4.

According to the documentation, we can use div.decompose().

But that would delete all the divs. How can I delete a div with a specific class?

evandrix
Riken Shah

4 Answers

111

Sure, you can just select, find, or find_all the divs of interest in the usual way, and then call decompose() on those divs.

For instance, if you want to remove all divs with class sidebar, you could do that with

# replace with `soup.findAll` if you are using BeautifulSoup 3
for div in soup.find_all("div", {'class': 'sidebar'}):
    div.decompose()

If you want to remove a div with a specific id, say main-content, you can do that with

soup.find('div', id="main-content").decompose()
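
To see both approaches on a complete document, here is a minimal sketch. The markup, the class names and the `remaining` variable are made up for illustration. It also guards against `find` returning `None`, which happens when no element matches and would otherwise make the `.decompose()` call fail.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this example
html = """
<div id="main-content">main content</div>
<div class="sidebar">first sidebar</div>
<div class="sidebar wide">second sidebar</div>
<div class="footer">footer</div>
"""
soup = BeautifulSoup(html, "html.parser")

# class_ matches any div whose class list contains "sidebar",
# so the "sidebar wide" div is removed as well
for div in soup.find_all("div", class_="sidebar"):
    div.decompose()

# find() returns None when nothing matches, so check before decomposing
main = soup.find("div", id="main-content")
if main is not None:
    main.decompose()

# Collect the class lists of the divs that are still in the tree
remaining = [d.get("class") for d in soup.find_all("div")]
print(remaining)
```

Only the divs that were selected are removed; every other div stays in the tree.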
lemonhead
  • Short and precise. Thanks! – mr_mo Apr 04 '19 at 13:18
  • @lemonhead Do you by any chance know how to put a replacement text at the decomposed location? – CodeGuru Apr 27 '19 at 14:18
  • @CodeGuru in that case you wouldn't decompose, you would select/find the element and then call [`elem.string.replace_with`](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#replace-with). See also [this answer](https://stackoverflow.com/questions/15056633/python-find-text-using-beautifulsoup-then-replace-in-original-soup-variable) – lemonhead Apr 30 '19 at 07:06
  • d=div.extract() if you want to get the removed element as d, and do something further. – nikhil swami Sep 20 '20 at 16:19
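
The comments above mention `extract()` and `replace_with()` as alternatives to `decompose()`. A rough sketch of how they differ, using made-up markup and class names (`ad`, `note`) and calling `replace_with()` on the tag itself so the whole element is swapped for text:

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this example
soup = BeautifulSoup(
    '<p>before</p><div class="ad">banner</div><div class="note">old</div>',
    "html.parser",
)

# extract() detaches the tag from the tree and returns it for further use
removed = soup.find("div", class_="ad").extract()

# replace_with() swaps the whole tag for a string (another tag also works)
soup.find("div", class_="note").replace_with("[note removed]")

print(removed)
print(soup)
```

Use `decompose()` when the element is no longer needed, `extract()` when you want to keep working with it, and `replace_with()` when something should take its place.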
13

This will help you:

from bs4 import BeautifulSoup

markup = '<a>This is not div <div class="1">This is div 1</div><div class="2">This is div 2</div></a>'
soup = BeautifulSoup(markup,"html.parser")
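# find() returns the first div whose class is "2"; decompose() removes it from the tree
soup.find('div', class_='2').decompose()

# print(...) with a single argument works on both Python 2.7 and Python 3
print(soup)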

Output:

<a>This is not div <div class="1">This is div 1</div></a>

Let me know if it helps

Vineet Kumar Doshi
12

Hope it helps:

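from bs4 import BeautifulSoup

markup = '<a>This is not div <div class="1">This is div 1</div><div class="2">This is div 2</div></a>'
soup = BeautifulSoup(markup, "html.parser")

# A class name that starts with a digit is not a valid CSS identifier, so
# newer bs4 releases may reject 'div.1'; an attribute selector avoids that.
for tag in soup.select('div[class~="1"]'):
    tag.decompose()

print(soup)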
david euler
-1
    >>> from BeautifulSoup import BeautifulSoup
    >>> soup = BeautifulSoup('<body><div>1</div><div class="comment"><strong>2</strong></div></body>')
    >>> for div in soup.findAll('div', 'comment'):
    ...   div.extract()
    ... 
    <div class="comment"><strong>2</strong></div>
    >>> soup
    <body><div>1</div></body>
3ppps