I'm trying to extract the text within the title elements and ignore everything else.
I've looked at these questions, but they didn't help:
Regular expression to extract text between square brackets
String Pattern Matching In Java
Java Regex to get the text from HTML anchor (<a>...</a>) tags
The main problem is that I can't follow what the answers are saying when I try to adapt them to my own code.
Here is the pattern I came up with after reading the Java API documentation for Pattern:
<title>(.*?)</title>
Here's my code to return the title.
String title = null;
Matcher match = Pattern.compile("[<title>](.*?)[</title>]").matcher(this.webPage);
try {
    title = match.group();
} catch (IllegalStateException e) {
    e.printStackTrace();
}
I am getting the IllegalStateException, which says this:
java.lang.IllegalStateException: No match found
    at java.util.regex.Matcher.group(Matcher.java:485)
    at java.util.regex.Matcher.group(Matcher.java:445)
    at BrowserModal.getWebPageTitle(BrowserModal.java:21)
    at BrowserTest.main(BrowserTest.java:7)
Line 21 of BrowserModal.java is "title = match.group();".
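
For reference, here is a minimal sketch of what seems to be going on. Two things stand out. First, Matcher.group() throws IllegalStateException unless a match operation such as find() or matches() has already succeeded, and the code above never calls one. Second, the square brackets in "[<title>]" form a character class that matches a single character (one of <, t, i, l, e, >), not the literal tag. The class name TitleExtractor and the method extractTitle are hypothetical names used only for illustration, and the sketch assumes the title tag carries no attributes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleExtractor {
    // Literal tags, no square brackets. CASE_INSENSITIVE also accepts <TITLE>;
    // DOTALL lets '.' span line breaks inside the title.
    // Assumption: the opening tag has no attributes.
    private static final Pattern TITLE =
            Pattern.compile("<title>(.*?)</title>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: returns the title text, or null if none is found.
    static String extractTitle(String html) {
        Matcher match = TITLE.matcher(html);
        // find() must succeed before group() may be called.
        // group(1) is the captured text; group() would include the tags.
        return match.find() ? match.group(1) : null;
    }

    public static void main(String[] args) {
        String page = "<html><head><title>My Page</title></head><body></body></html>";
        System.out.println(extractTitle(page)); // prints "My Page"
    }
}
```

Checking the result of find() instead of catching IllegalStateException keeps the "no title" case an ordinary return value. For anything beyond simple, well-formed pages, an HTML parser is likely to be more robust than a regular expression.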