I'm trying to read a value from inside some HTML tags, and I'm hopeless when it comes to regular expressions (I've thought of a few patterns and none seem to work).
I'm reading a web page and looking for this line: <td title='Visit Page for Demilict'><a href='personal.php?name=Demilict&c=s' class='idk' rel='Demilict' style='color: teal;'>Demilict</a></td>
I need to extract 'Demilict' from it, and as you can see there are three places where it appears.
Which would be the best position to extract it from and how would I achieve that?
I'm using the code below to find the name(s). There are around 60 different names I need to extract, and they all use the same format; the names can only contain letters, numbers, and underscores.
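import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.regex.Matcher;

// namePattern is a java.util.regex.Pattern field declared elsewhere in the class.
public void parse(String list) {
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader bufferedReader = new BufferedReader(
            new InputStreamReader(new URL(list).openStream()))) {
        String line;
        StringBuilder stringBuilder = new StringBuilder();
        while ((line = bufferedReader.readLine()) != null) {
            stringBuilder.append(line).append("\n");
        }
        String page = stringBuilder.toString();
        System.out.println(page);
        Matcher matcher = namePattern.matcher(page);
        // while, not if: the page holds many names and find() advances to the next one
        while (matcher.find()) {
            System.out.println("matched: " + matcher.group());
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}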
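
To make the question concrete, here is a minimal sketch of the kind of pattern that might work, assuming the rel attribute is used as the extraction point (the title text or the link text are the other two candidates) and assuming every row uses single-quoted attributes exactly like the sample line. The names NameExtractor, NAME_PATTERN and extractNames are hypothetical and only there for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameExtractor {
    // Hypothetical pattern: captures the value of rel='...' when it consists
    // only of letters, digits, and underscores (\w+ with default flags).
    private static final Pattern NAME_PATTERN = Pattern.compile("rel='(\\w+)'");

    // Returns every captured name in the order it appears in the HTML.
    static List<String> extractNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = NAME_PATTERN.matcher(html);
        while (matcher.find()) {
            // group(1) is just the name; group() would be the whole rel='...' text
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String html = "<td title='Visit Page for Demilict'>"
                + "<a href='personal.php?name=Demilict&c=s' class='idk' "
                + "rel='Demilict' style='color: teal;'>Demilict</a></td>";
        System.out.println(extractNames(html)); // prints [Demilict]
    }
}
```

Matching on rel keeps the pattern short because the attribute holds nothing but the name. A regex like this is brittle against changes in quoting or attribute order, so it only fits if the page format really is as uniform as described.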