I would like to use BeautifulSoup in Python to convert HTML like this:
<p><b>Background</b><br />x0</p><p>x1</p>
<p><b>Innovation</b><br />x2</p><p>x3</p><p>x4</p>
<p><b>Activities</b><br />x5</p><p>x6</p>
to this result:
Background: x0, x1
Innovation: x2, x3, x4
Activities: x5, x6
I have tried the Python script below:
from bs4 import BeautifulSoup

htmltext = """<p><b>Background</b><br />x0</p><p>x1</p>
<p><b>Innovation</b><br />x2</p><p>x3</p><p>x4</p>
<p><b>Activities</b><br />x5</p><p>x6</p>"""
html = BeautifulSoup(htmltext, "html.parser")
for n in html.find_all('b'):
    title_name = n.next_element  # the text inside <b>
    title_content = n.next_sibling.next_sibling  # the text node right after <br />
    print("%s: %s" % (title_name, title_content))
However, I can only get this:
Background: x0
Innovation: x2
Activities: x5
Your comments are welcome and your suggestions will be appreciated.
Do you want every `<p>` element in between successive `<b>` elements?
– Patrick Collins Aug 23 '13 at 21:20
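One possible way to get the grouping described in the question is sketched below. It assumes that every section starts with a `<p>` containing a `<b>` title and that every following `<p>` without a `<b>` belongs to that section. The helper name `group_sections` is made up for this illustration, and `html.parser` is chosen only so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

htmltext = """<p><b>Background</b><br />x0</p><p>x1</p>
<p><b>Innovation</b><br />x2</p><p>x3</p><p>x4</p>
<p><b>Activities</b><br />x5</p><p>x6</p>"""


def group_sections(markup):
    """Return a list of (title, [paragraph texts]) pairs (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    sections = []
    for p in soup.find_all("p"):
        title = p.find("b")
        if title is not None:
            # A <p> holding a <b> opens a new section.
            name = title.get_text(strip=True)
            # Detach the <b> so only the paragraph's remaining text is left.
            title.extract()
            sections.append((name, []))
        text = p.get_text(strip=True)
        # Ignore text that appears before the first title (assumption).
        if sections and text:
            sections[-1][1].append(text)
    return sections


for name, items in group_sections(htmltext):
    print("%s: %s" % (name, ", ".join(items)))
```

Walking over the `<p>` elements instead of the `<b>` elements is what lets the loop pick up `x1`, `x3`, `x4` and `x6`, which are not siblings of any `<b>` tag.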