What is a fast and simple way to validate HTML from Java? I'm looking for an open-source or public-domain class (or set of classes) that describes the properties of each of the 100-odd HTML tags, such as:
- Is the tag optional? Empty? Is it legal to omit its closing tag?
- Which other tags can this tag contain (if any)?
- Which attributes are legal for this tag, and what are their types? (not required, but nice to have)
Thanks!
EDIT
I'm looking to do a tag-by-tag analysis of an HTML document, so I'm less interested in whether the document as a whole is valid than in the specific requirements for each type of tag. I could encode the rules myself based on the W3C spec, but wanted to see which ready-made solutions are available first.
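To make the request more concrete, here is a rough sketch of the kind of lookup I have in mind. Every name in it (`HtmlTagRules`, `TagInfo`, `Presence`, `lookup`, `canContain`) is hypothetical and not taken from any existing library, and the three entries are hand-copied from the HTML 4.01 DTD purely as an illustration. A real solution would cover all tags and, ideally, their attributes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from an existing library.
class HtmlTagRules {

    /** Whether a start or end tag is required, may be omitted, or must not appear. */
    enum Presence { REQUIRED, OPTIONAL, FORBIDDEN }

    /** Per-tag properties of the kind listed in the question. */
    static final class TagInfo {
        final String name;
        final Presence startTag;
        final Presence endTag;
        final boolean empty;                // true if the element has no content
        final Set<String> allowedChildren;  // names of tags it may directly contain

        TagInfo(String name, Presence startTag, Presence endTag,
                boolean empty, String... children) {
            this.name = name;
            this.startTag = startTag;
            this.endTag = endTag;
            this.empty = empty;
            this.allowedChildren = Collections.unmodifiableSet(
                    new LinkedHashSet<>(Arrays.asList(children)));
        }
    }

    private static final Map<String, TagInfo> RULES = new HashMap<>();

    private static void register(TagInfo info) {
        RULES.put(info.name, info);
    }

    static {
        // Illustrative subset only, transcribed by hand from the HTML 4.01 DTD.
        register(new TagInfo("br", Presence.REQUIRED, Presence.FORBIDDEN, true));
        register(new TagInfo("ul", Presence.REQUIRED, Presence.REQUIRED, false, "li"));
        register(new TagInfo("tr", Presence.REQUIRED, Presence.OPTIONAL, false, "th", "td"));
    }

    /** Returns the rules for a tag (case-insensitive), or null if the tag is unknown. */
    static TagInfo lookup(String tagName) {
        return RULES.get(tagName.toLowerCase(Locale.ROOT));
    }

    /** True if the parent is known and may directly contain the child. */
    static boolean canContain(String parent, String child) {
        TagInfo info = lookup(parent);
        return info != null
                && info.allowedChildren.contains(child.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        TagInfo tr = lookup("TR");
        System.out.println(tr.name + " end tag: " + tr.endTag);
        System.out.println("ul may contain li: " + canContain("ul", "li"));
        System.out.println("ul may contain td: " + canContain("ul", "td"));
    }
}
```

With something like this, the tag-by-tag analysis would reduce to one `lookup` call per tag encountered, plus a `canContain` check against its parent.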