I want to validate HTML tags and their contents using Java. The validation should make sure that all HTML tags are closed properly and that there are no mistakes inside the tags themselves. For example:
<div id="divIdvalue'></div>
or
<span id\="spanIdval" ,></span>
I need to catch errors like these. While googling I found this regular expression:
<(\"[^\"]*\"|'[^']*'|[^'\">])*>
But it does not check whether all the HTML tags are closed. How can I add that check as well?
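To make the question concrete, here is a minimal sketch of what that pattern accepts when applied with `matches()` to one tag at a time. The class name `TagPatternDemo` and the helper `isSingleTag` are made up for illustration; only the pattern itself comes from the text above.

```java
import java.util.regex.Pattern;

public class TagPatternDemo {
    // The pattern quoted above: '<', then quoted strings or any
    // characters other than quotes and '>', then '>'.
    static final Pattern TAG = Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    // Hypothetical helper: true when the whole string is exactly one tag
    // according to the pattern.
    static boolean isSingleTag(String s) {
        return TAG.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Opening tag of the first example: the double quote is never closed.
        System.out.println(isSingleTag("<div id=\"divIdvalue'>"));      // false
        // Opening tag of the second example: stray backslash and comma.
        System.out.println(isSingleTag("<span id\\=\"spanIdval\" ,>")); // true
        // Two tags in one string are not a single tag.
        System.out.println(isSingleTag("<div></div>"));                 // false
    }
}
```

The second call returns true, so the pattern on its own only catches unbalanced quotes inside a tag. It does not reject the stray backslash and comma, and it says nothing about whether a tag is later closed.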
My sample code is attached below. Please help me.
package com.test;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class htmlValidator {

    // Matches one tag: '<', then quoted attribute values or any
    // characters other than quotes and '>', then '>'.
    private static final String HTML_TAG_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

    // Compiled once when the class is loaded, so validate() can be called statically.
    private static final Pattern pattern = Pattern.compile(HTML_TAG_PATTERN);

    // Returns true only if the whole input matches the single-tag pattern.
    public static boolean validate(final String tag) {
        Matcher matcher = pattern.matcher(tag);
        return matcher.matches();
    }

    /**
     * Runs the validator on a sample HTML fragment and prints the result.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        String htmlStr = "<div> <p id=/'bb'>This is first paragraph. This is first paragraph. </p> <span id='spanId'>Yes this is spab</span></div>";
        System.out.println("htmlStr :- " + htmlStr);
        System.out.println("valid :- " + validate(htmlStr));
    }
}
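For the part the pattern does not cover, checking that every opening tag is closed, one possible approach is to walk over the tags with `find()` and keep a stack of open tag names. The following is only a rough sketch under stated assumptions: the class `TagBalanceChecker`, the method `isBalanced`, the simplified tag-name pattern and the hard-coded list of void elements are all illustrative. It ignores comments, CDATA, `<script>` contents and optional end tags, all of which a real HTML parser would handle.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagBalanceChecker {
    // Group 1 is "/" for a closing tag, group 2 is the tag name.
    // The rest reuses the quoted-attribute idea from the pattern in the question.
    private static final Pattern TAG = Pattern.compile(
            "<(/?)([A-Za-z][A-Za-z0-9]*)(?:\"[^\"]*\"|'[^']*'|[^'\">])*>");

    // Elements that never take a closing tag (assumed list for this sketch).
    private static final Set<String> VOID_ELEMENTS = new HashSet<>(Arrays.asList(
            "area", "base", "br", "col", "embed", "hr", "img",
            "input", "link", "meta", "source", "track", "wbr"));

    // Hypothetical helper: true when every opening tag found by the
    // pattern is closed in the right order.
    public static boolean isBalanced(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = m.group().endsWith("/>");
            if (closing) {
                // A closing tag must match the most recently opened tag.
                if (open.isEmpty() || !open.pop().equals(name)) {
                    return false;
                }
            } else if (!selfClosing && !VOID_ELEMENTS.contains(name)) {
                open.push(name);
            }
        }
        // Anything left on the stack was opened but never closed.
        return open.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("<div><p id='a'>text</p><br></div>")); // true
        System.out.println(isBalanced("<div><span>text</div>"));             // false
    }
}
```

A tag that the pattern cannot match at all, such as one with an unterminated quote, is simply skipped by `find()`, so this check complements the per-tag syntax check rather than replacing it. For anything beyond simple fragments, a dedicated HTML parser is likely to be more reliable than regular expressions.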