How should I go about finding the depth of this block:
<div>
  <div>
    <div>
    </div>
  </div>
</div>
In this case, it should be 3. Any clue/code will help.
There are many ways to do this. I would not recommend using a regex to parse HTML or XML, though.
One way is to use the HTMLParser class that comes standard with Python:
from html.parser import HTMLParser  # Python 3; the Python 2 module was HTMLParser


class MyHTMLParser(HTMLParser):
    def __init__(self):
        super().__init__()
        # Nesting level the next opening tag will have (the outermost tag is 1).
        self.depth = 1

    def handle_starttag(self, tag, attrs):
        print('Encountered %s at depth %d.' % (tag, self.depth))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


if __name__ == '__main__':
    html = '''
<div>
<div>
<div>
</div>
</div>
</div>
'''
    MyHTMLParser().feed(html)
Running this script produces:
Encountered div at depth 1.
Encountered div at depth 2.
Encountered div at depth 3.
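If you want the depth as a single number rather than one line per tag, you can track the largest level reached while parsing. The following is a minimal sketch, not the only way to do it. The names `DepthParser`, `max_depth` and `VOID_TAGS` are made up for this example, and it assumes Python 3 and markup in which every non-void tag is explicitly closed. Void elements such as `<br>` are skipped because they never get a closing tag and would otherwise inflate the count.

```python
from html.parser import HTMLParser

# Hypothetical helper set: HTML void elements have no closing tag,
# so they must not change the nesting level.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DepthParser(HTMLParser):
    """Records the deepest nesting level seen while parsing."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of currently open elements
        self.max_depth = 0  # largest value self.depth has reached

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        self.depth -= 1


def max_depth(html):
    """Return the maximum element nesting depth of an HTML string.

    Assumes every non-void tag is closed; unclosed tags such as a bare
    <p> would be counted as still open.
    """
    parser = DepthParser()
    parser.feed(html)
    parser.close()
    return parser.max_depth


if __name__ == '__main__':
    print(max_depth('<div><div><div></div></div></div>'))
```

For the block in the question this prints 3. Real-world pages often omit closing tags, so for anything beyond simple, well-formed snippets a more forgiving parser may be a better fit.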