I have the following code, which extracts the href URLs of <a> tags from an XML string and works correctly:
Pattern p = Pattern.compile("<a[^>]+href\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");
Matcher m = p.matcher(xmlString);
while (m.find()) {
    imagesURLs.add(m.group(1));
}
The XML contains anchors like this:
<a href="http://...">some text</a>
The code above gives me <a href="http://..."> in m.group(0) and http://... in m.group(1).
I also want to get the full element: <a href="http://...">some text</a>.
How can I achieve this by modifying the regex?
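To make the goal concrete, here is a minimal sketch of the kind of change that might work. It assumes anchors are never nested and that the markup is regular enough for a regex (a real XML parser would likely be more robust). The class name AnchorExtractor, the extract helper, and the example.com URLs are hypothetical and only for illustration. The pattern is the original one with a lazy (.*?)</a> tail appended, plus the DOTALL and CASE_INSENSITIVE flags, so that group(0) would cover the whole element and group(2) the inner text:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorExtractor {
    // group(0): the whole <a ...>...</a> element
    // group(1): the href value
    // group(2): the text between the opening and closing tags
    // The lazy (.*?) stops at the first </a>, so nested anchors are NOT handled.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a[^>]+href\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: returns {fullElement, href, innerText} for each match.
    public static List<String[]> extract(String xmlString) {
        List<String[]> results = new ArrayList<>();
        Matcher m = ANCHOR.matcher(xmlString);
        while (m.find()) {
            results.add(new String[] { m.group(0), m.group(1), m.group(2) });
        }
        return results;
    }

    public static void main(String[] args) {
        // Made-up sample input, not taken from the real XML.
        String xml = "<p><a href=\"http://example.com/a\">some text</a> and "
                + "<a class='x' href='http://example.com/b'>more</a></p>";
        for (String[] r : extract(xml)) {
            System.out.println(r[0] + " | " + r[1] + " | " + r[2]);
        }
    }
}
```

Without DOTALL, the dot would not match line breaks, so an anchor whose text spans several lines would be skipped.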