I receive a response in HTML format from an external service and pass it directly to my front end. However, the external system sometimes returns broken HTML, which can break the page on my site. I therefore want to check whether the HTML response is broken or valid. If it is valid, I will pass it on; otherwise it will be discarded and an error written to the log.

How can I perform this validation on the back end in Java?

Thank you.

Alexander Drobyshevsky

2 Answers

I believe there is no such "generic" validator available in Java. But you can build your own validation on top of any open-source HTML parser.
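To give an idea of what a "build your own" check could look like, here is a minimal sketch that only verifies that opening and closing tags are balanced. It is a hand-rolled regex scan, not an open-source parser, and the class name `TagBalanceChecker` and method `isBalanced` are made up for illustration. It assumes every non-void element is explicitly closed, so HTML that relies on optional end tags (an unclosed `<li>` or `<p>`, for example) will be reported as broken, and it does not handle `>` inside attribute values or markup inside `<script>` blocks. A real parser library would cover far more cases.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: checks tag nesting only, not full HTML validity.
final class TagBalanceChecker {

    // HTML void elements never take a closing tag, so they are skipped.
    private static final Set<String> VOID_ELEMENTS = Set.of(
            "area", "base", "br", "col", "embed", "hr", "img", "input",
            "link", "meta", "source", "track", "wbr");

    // Matches opening, closing, and self-closing tags.
    // Comments and the DOCTYPE start with "<!" and therefore do not match.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)\\b[^>]*?(/?)>");

    static boolean isBalanced(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            if (VOID_ELEMENTS.contains(name) || selfClosing) {
                continue; // nothing to push or pop
            }
            if (!closing) {
                open.push(name);
            } else if (open.isEmpty() || !open.pop().equals(name)) {
                return false; // closing tag without a matching opening tag
            }
        }
        return open.isEmpty(); // leftover entries are unclosed elements
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("<div><p>text<br></p></div>"));
        System.out.println(isBalanced("<div><p>text</div>"));
    }
}
```

A check like this is cheap and has no dependencies, but it is deliberately strict and simplistic, so treat it as a first filter rather than a validator.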

Techidiot

I found the solution:

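import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Validates the markup as XHTML against the DTD referenced in the DOCTYPE.
// The DOCTYPE and the opening <html> tag are prepended here, so the input
// must start after <html> and still end with </html>.
private static boolean isValidHtml(String htmlToValidate) throws ParserConfigurationException, 
        SAXException, IOException {
    // Public identifier corrected to XHTML 1.1 so that it matches the xhtml11-flat DTD.
    String docType = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" " +
            "\"https://www.w3.org/TR/xhtml11/DTD/xhtml11-flat.dtd\"> " +
            "<html xmlns=\"http://www.w3.org/1999/xhtml\" " + "xml:lang=\"en\">\n";

    try {
        InputSource inputSource = new InputSource(new StringReader(docType + htmlToValidate));

        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        // Enable DTD validation; the parser loads the DTD from the URL above.
        domFactory.setValidating(true);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        // Treat every reported problem, including warnings, as a validation failure.
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void error(SAXParseException exception) throws SAXException {
                throw new SAXException(exception);
            }

            @Override
            public void fatalError(SAXParseException exception) throws SAXException {
                throw new SAXException(exception);
            }

            @Override
            public void warning(SAXParseException exception) throws SAXException {
                throw new SAXException(exception);
            }
        });

        builder.parse(inputSource);
    } catch (SAXException ex) {
        //log.error(ex.getMessage(), ex); // validation message
        return false;
    }

    return true;
}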

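The method can be used like this. Note that the opening <html> tag is omitted from the input, because the method prepends it together with the DOCTYPE: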

  String htmlToValidate = "<head><title></title></head><body></body></html>";

  boolean isValidHtml = isValidHtml(htmlToValidate);
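If checking XML well-formedness is enough (every tag closed and properly nested), a lighter variant without DTD validation can be sketched as follows. This is an assumption about what "broken" means, not a replacement for the validation above: the class name `WellFormedCheck` and method `isWellFormed` are made up for illustration, the input is expected to be a complete document including the opening `<html>` tag, and the `disallow-doctype-decl` feature is assumed to be supported, as it is by the JDK's built-in parser. Because no DTD is loaded, named entities such as `&nbsp;` and unclosed tags such as `<br>` are rejected.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: checks XML well-formedness only, with no DTD and no network access.
final class WellFormedCheck {

    static boolean isWellFormed(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject any DOCTYPE so that no external DTD is ever fetched.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Replace the default handler, which prints fatal errors to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // warnings do not make the document ill-formed
                }

                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xhtml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("XML parser does not support the requested configuration", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected when reading from a String
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<html><body><p>ok</p></body></html>"));
        System.out.println(isWellFormed("<html><body><p>broken</body></html>"));
    }
}
```

Skipping the DTD avoids a network request on every call, at the cost of no longer checking which elements and attributes are allowed.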
Alexander Drobyshevsky