lxml.html is a dedicated Python package for dealing with HTML. It is based on lxml's HTML parser, but provides a special Element API for HTML elements, as well as a number of utilities for common HTML processing tasks.
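
As a rough illustration of what the HTML-specific Element API and utilities look like in practice, here is a minimal sketch. The markup, the `intro` class name, and the `https://example.com/` base URL are made up for this example; the calls used (`fromstring`, `find_class`, `text_content`, `make_links_absolute`, `iterlinks`) are part of lxml.html.

```python
import lxml.html

# Hypothetical HTML fragment, invented for this example.
html = (
    '<div id="main">'
    '<p class="intro">Hello <b>world</b></p>'
    '<a href="/docs">Docs</a>'
    '</div>'
)

# Parse the fragment; a single top-level element is returned as-is.
root = lxml.html.fromstring(html)

# HTML-specific helpers on the element: look up by CSS class,
# then collect the text of the element and all its descendants.
intro = root.find_class("intro")[0]
intro_text = intro.text_content()

# Link utilities: resolve relative URLs against an assumed base URL,
# then iterate over every link found in the tree.
root.make_links_absolute("https://example.com/")
links = [link for (_element, _attribute, link, _pos) in root.iterlinks()]

print(intro_text)
print(links)
```

The parsed elements remain ordinary lxml elements, so general tree operations such as `root.xpath(...)` should work alongside the HTML-specific methods shown here.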