
I have this regex:

<a href(.*foo.bar.*)a>

For this string, it gives me only 1 match, but I need it to give 3 matches.

<a href="https://foo.bar/1">First</a> RANDOM TEXT COULD BE HERE <a href="https://foo.bar/2">Second</a> RANDOM TEXT COULD BE HERE <a href="https://foo.bar/3">Third</a>

Each <a href ...> element should be matched individually.

How could I accomplish this?

EDIT:

This code searches for matches:

Pattern pattern = Pattern.compile("<a href(.*foo.bar.*)a>");
Matcher matcher = pattern.matcher(body);
List<String> matches = new ArrayList<String>();
while (matcher.find()) {
    matches.add(matcher.group());
}
Jaanus
  • Can you show us the code that searches for matches please? – JREN Jun 27 '13 at 06:58
  • @JREN: Added the searcher code – Jaanus Jun 27 '13 at 07:01
  • [*Parsing HTML is a solved problem. You do not need to solve it. You just need to be lazy.*](http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html) – Duncan Jones Jun 27 '13 at 07:02
  • If you're working with HTML, you should use an HTML parser... [stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – jlordo Jun 27 '13 at 07:03

3 Answers


Change to:

<a href(.*?foo\.bar.*?)a>

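This makes both quantifiers reluctant (non-greedy), so each match stops at the first a> that follows foo.bar instead of running on to the last one in the string. Literal dots should also be escaped as \. (written as \\. inside a Java string literal).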
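As a minimal sketch of how this pattern could be dropped into the search loop from the question: the class name ReluctantLinks and the method name findLinks are made up for illustration, and the sample input is the string from the question with the filler text shortened.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
public class ReluctantLinks {
    // Reluctant quantifiers (.*?) and an escaped dot; the backslash is
    // doubled because this is a Java string literal.
    static final Pattern LINK = Pattern.compile("<a href(.*?foo\\.bar.*?)a>");

    // Collects every separate match of LINK in the given text.
    static List<String> findLinks(String body) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = LINK.matcher(body);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String body = "<a href=\"https://foo.bar/1\">First</a> TEXT "
                + "<a href=\"https://foo.bar/2\">Second</a> TEXT "
                + "<a href=\"https://foo.bar/3\">Third</a>";
        // Prints one anchor element per line.
        for (String match : findLinks(body)) {
            System.out.println(match);
        }
    }
}
```

Note that . does not match line terminators by default, so if an anchor can span several lines, Pattern.DOTALL would probably be needed as well.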

dda

Use .*? instead of .*. The greedy quantifier matches as many characters as possible, while the reluctant quantifier matches as few characters as possible in a single find operation.

Also, use foo\.bar if you intend to match the literal text "foo.bar"; an unescaped dot matches any character.
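A small sketch to compare the two behaviours side by side. The class name QuantifierDemo and the helper countMatches are invented for this illustration; the patterns are the ones discussed in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not taken from the original answers.
public class QuantifierDemo {
    // Counts how many times find() succeeds for the regex on the input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String body = "<a href=\"https://foo.bar/1\">First</a> TEXT "
                + "<a href=\"https://foo.bar/2\">Second</a> TEXT "
                + "<a href=\"https://foo.bar/3\">Third</a>";

        // Greedy: one match that runs from the first anchor to the last a>.
        System.out.println(countMatches("<a href(.*foo.bar.*)a>", body));
        // Reluctant: each anchor element is matched on its own.
        System.out.println(countMatches("<a href(.*?foo\\.bar.*?)a>", body));

        // An unescaped dot matches any character, so "fooXbar" is accepted.
        System.out.println(countMatches("foo.bar", "fooXbar"));
        // An escaped dot only matches a literal ".".
        System.out.println(countMatches("foo\\.bar", "fooXbar"));
    }
}
```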

dexjq23

I hope the code below helps. It counts the links by matching only the start of each one:

// Assumes body holds the HTML text and that java.util.* and java.util.regex.* are imported.
int noOfTimefoundString = 0;
// The dot is escaped so that it matches only a literal "." in "foo.bar".
Pattern pattern = Pattern.compile("<a href=\"https://foo\\.bar");
Matcher matcher = pattern.matcher(body);
List<String> matches = new ArrayList<String>();
while (matcher.find()) {
  matches.add(matcher.group());
  noOfTimefoundString++;
}
// Print every matched prefix, then the total count.
for (String match : matches) {
  System.out.println(match);
}
System.out.println("No. of times search string found = " + noOfTimefoundString);
D-Nesh