How to parse the contents of an HTML tag nested inside another HTML tag with BeautifulSoup?

Question

I came across an unusual HTML document on the web that has multiple <html> tags nested inside the parent <html> tag. I want to parse the contents of the nested <html> tags. Can anyone point me in the right direction?

Thanks in advance.

Edit 1: Using BeautifulSoup

soup = BeautifulSoup(html, "lxml")

gives only the parent <html> element and the tags present within it.

However, I am assuming that if a browser is able to render the HTML, BeautifulSoup should be able to parse it. Is that assumption correct?

Edit 2: The HTML is actually malformed (I am assuming here). Below is the HTML I am parsing with BeautifulSoup; somehow I am only getting the tables of the first (outermost) <html>. If I manually remove the extra <html> tags and keep only one, I am able to parse the table in BeautifulSoup. So the question is: "Is there any way to parse the HTML below and get the data from the innermost table, or from all tables in the file?"

<!DOCTYPE html>
<html>
<head>
    <title>Some Title</title>
</head>
<body>
    some html to display the tables.
    <html>
        <head></head>
        <title>Some other title</title>
        <body>
            some html to display even more tables.
        </body>
    </html>
</body>
</html>

It would help if you gave the URL and explained what you are trying to extract from it. — Martin Evans, Jun 05 '17 at 07:38

score 0 · Answer 1 · answered Jun 05 '17 at 07:20

Here is some sample code you can use to find the text inside a particular kind of HTML tag:

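from bs4 import BeautifulSoup
soup2 = BeautifulSoup(x, 'html.parser')  # x holds the HTML source as a string
for i in soup2.find_all('ul', attrs={'class': 'results-base'}):
    for j in i.find_all('li'):
        print(j.get_text())  # text of each <li> inside the matching <ul>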
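
For the nested <html> case described in the updated question, one thing worth trying is the built-in html.parser backend, which does not try to restructure the markup into a well-formed document. The following is only a minimal sketch under assumptions: NESTED_HTML is a made-up sample that mirrors the structure in the question with a small table added to each level, and extract_table_cells is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the question's structure,
# with one small table added at each <html> level.
NESTED_HTML = """<!DOCTYPE html>
<html>
<head><title>Some Title</title></head>
<body>
<table><tr><td>outer</td></tr></table>
<html>
<head></head>
<title>Some other title</title>
<body>
<table><tr><td>inner</td></tr></table>
</body>
</html>
</body>
</html>"""


def extract_table_cells(markup):
    """Return one list of cell texts per <table> found anywhere in the markup."""
    # html.parser keeps the tags as written, including the nested <html>.
    soup = BeautifulSoup(markup, "html.parser")
    return [
        [td.get_text(strip=True) for td in table.find_all("td")]
        for table in soup.find_all("table")
    ]


if __name__ == "__main__":
    for cells in extract_table_cells(NESTED_HTML):
        print(cells)
```

If the result still misses the inner tables on the real page, comparing the output of the different parsers on the same input is a reasonable next step, since each backend repairs malformed markup differently.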

nishant kumar

I have updated the question with more details. Could you please comment on that? Thanks in advance. – Kaustubh Jun 26 '17 at 08:45

score 0 · Answer 2 · answered Jun 06 '17 at 13:12

Here are some sites that are relevant to your question; I think you can find a good answer there for what you're looking for.

Mika Wolf

Can you add a code example showing how to solve the problem? – a3.14_Infinity Jun 06 '17 at 13:25
