
I want to pass a complete HTML document as a string and then check whether the given string is valid HTML.

public boolean isValidHTML(String htmlData)

Description - Checks whether the given HTML data is valid.

htmlData - An HTML document, passed as a string, containing tags and data.

returns - true if the given htmlData contains only valid tags with their allowed attributes and permitted values, otherwise false. An example of valid HTML:

<html>
  <head>
    <title>Page Title</title>
  </head>
  <body>
    <table style="width:100%">
      <tr>
        <td>Jill</td>
        <td>Smith</td>
        <td>50</td>
      </tr>
      <tr>
        <td>Eve</td>
        <td>Jackson</td>
        <td>94</td>
      </tr>
    </table>
    <b>This text is bold</b>
  </body>
</html>


The Java code should look like this:

import java.util.Scanner;

class HtmlValidator {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        String html = "pass the html here";
        System.out.println(isValidHtml(html));
    }

    public static boolean isValidHtml(String html) {
        // Write code here.
        // The method should return true if the given HTML is valid.
        // Please help.
        return false; // placeholder so the skeleton compiles
    }
}

  • Use a DOM parser. – Mordechai Jan 18 '17 at 02:23
  • Possible duplicate: http://stackoverflow.com/questions/4392505/how-to-validate-html-from-java – thetraveller Jan 18 '17 at 02:24
  • If you want to define your own rules for what you accept as valid HTML, you not only need to be an expert there, you also need to read a book on writing parsers; this may be the greater of the challenges. You will learn a lot that could be useful at other times, though, so happy studying. – Ole V.V. Jan 18 '17 at 04:35
  • See also [Creating a parser for a simple pseudocode language?](http://stackoverflow.com/questions/9957873/creating-a-parser-for-a-simple-pseudocode-language) and [writing parser of a simple language](http://stackoverflow.com/questions/34432136/writing-parser-of-a-simple-language). An answer to the latter refers to ANTLR; it is not a tool I know, but it may be worth checking out. – Ole V.V. Jan 18 '17 at 04:39
  • My search came across a series of articles on parser writing in Java: [Writing a Parser in Java](http://cogitolearning.co.uk/?p=523). – Ole V.V. Jan 18 '17 at 04:42
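
The "use a DOM parser" suggestion above could be sketched roughly as follows, using only the XML DOM parser that ships with the JDK. This is a minimal sketch under a strong assumption: it only checks that the markup is well-formed XML (every tag closed and properly nested), which happens to hold for the sample document in the question. It does not check tag names, attributes, or attribute values, and ordinary HTML with unclosed elements such as `<br>` would be rejected. The class name `HtmlWellFormedCheck` and the method name `isWellFormed` are made up for illustration, and the parser is not hardened for untrusted input.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks XML well-formedness only, not HTML validity.
class HtmlWellFormedCheck {

    public static boolean isWellFormed(String html) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(html)));
            return true;
        } catch (SAXException e) {
            // Unclosed or badly nested tags end up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not a property of the input; treat as an environment problem.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<html><body><b>This text is bold</b></body></html>";
        String bad = "<html><body><b>This text is bold</body></html>";
        System.out.println(isWellFormed(good)); // prints true
        System.out.println(isWellFormed(bad));  // prints false
    }
}
```

Checking allowed tags, attributes, and values would need an HTML-aware parser or validator on top of this; the well-formedness check is only the first, cheapest filter.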

1 Answer


Rather than writing regex to parse and check (which is generally A Bad Idea), you're better off using something like jsoup to parse it and check for errors.

From https://jsoup.org/cookbook/input/parse-document-from-string:

String html = "<html><head><title>First parse</title></head>"
    + "<body><p>Parsed HTML into a doc.</p></body></html>";
Document doc = Jsoup.parse(html);