I am trying to write a regular expression that matches HTML tags.
The regex I have so far is <(/?)(\w+?)(\s(.*?))*?((/>)|>)
When I tested it online it worked perfectly, but when I run it with Java's regex engine it sometimes throws a StackOverflowError and sometimes does not.
I am using this code for testing:
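import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void parseHtml(final String urlString) {
    // Run the download and the matching on a separate thread.
    new Thread(new Runnable() {
        @Override
        public void run() {
            // downloadWebPage (not shown) returns the page content as a String.
            String htmlScript = downloadWebPage(urlString);
            Matcher matcher = Pattern.compile("<(/?)(\\w+?)(\\s(.*?))*?((/>)|>)",
                    Pattern.DOTALL).matcher(htmlScript);
            // Print every tag the pattern finds.
            while (matcher.find()) {
                System.out.println(matcher.group());
            }
        }
    }).start();
}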
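For reference, here is a minimal self-contained sketch of how the pattern behaves on a short input. The class name TagRegexDemo, the helper findTags, and the sample HTML string are made up for illustration; only the pattern and the DOTALL flag are taken from the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagRegexDemo {
    // Same pattern and flag as in parseHtml.
    private static final Pattern TAG = Pattern.compile(
            "<(/?)(\\w+?)(\\s(.*?))*?((/>)|>)", Pattern.DOTALL);

    // Hypothetical helper: collects every match of the pattern in the given text.
    public static List<String> findTags(String html) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = TAG.matcher(html);
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        // Made-up sample input, not taken from the real page.
        String sample = "<div class=\"a\"><br/><p>hi</p></div>";
        for (String tag : findTags(sample)) {
            System.out.println(tag);
        }
    }
}
```

On this small sample the pattern finds all five tags; in my case the error only appears when matching a full downloaded page.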
So my question is: why does Java's regex engine throw a StackOverflowError on some runs but not on others?
Note: I used the same test input (the same URL) each time. It threw the error on one run, and when I tested it again later it worked fine.