3

I need a regex to find all matches for my pattern.

The text is something like this:

"someother text !style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9] end of line text"

And I would like to find all matches for the pattern:

!style_delete [.*]

I have tried like this:

Pattern pattern = Pattern.compile("!style_delete\\s*\\[.*\\]");

With this, the matched text comes out like this:

!style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9]

But I expect the following:

match 1 : !style_delete [company code : 43ev4] 
match 2 : !style_delete [organiztion : 0asj9]

Please help me: what regex in Java would produce the output above?

Geoff
Ranjith
  • Could you please provide an example of text that you are trying to match? – David Sep 28 '15 at 19:09
  • You could replace the `.*` in the brackets with `[^]]*`. This matches everything inside the square brackets except the right square bracket, so you don't accidentally consume too much. – FriedSaucePots Sep 28 '15 at 19:11
  • To try your regexp patterns, you can always use some online regexp testers (google "online regexp" to have a few). I often use https://regex101.com – Benoît Sep 28 '15 at 19:15
  • A regular expression has to be between forward slashes like /regexp*/ – Arif Burhan Feb 27 '16 at 09:48

3 Answers

13
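// imports assume JUnit 4
import static org.junit.Assert.assertEquals;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.junit.Test;

@Test
public void test() {
    final String input = "someother text !style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9] end of line text";
    // my original regexp (only allows letters, digits, whitespace and ':' inside the brackets):
    // final String regex = "(!style_delete\\s\\[[a-zA-Z0-9\\s:]*\\])";
    // regexp from Trinimon: [^\\]]* matches everything up to the first closing bracket
    final String regex = "(!style_delete\\s*\\[[^\\]]*\\])";

    final Matcher m = Pattern.compile(regex).matcher(input);

    // collect every non-overlapping match
    final List<String> matches = new ArrayList<>();
    while (m.find()) {
        matches.add(m.group(0));
    }

    assertEquals(2, matches.size());
    // JUnit 4 argument order: message, expected, actual
    assertEquals("match 1: ", "!style_delete [company code : 43ev4]", matches.get(0));
    assertEquals("match 2: ", "!style_delete [organiztion : 0asj9]", matches.get(1));
}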

edit

Perhaps the pattern from Trinimon's answer is a little more elegant, so I updated my regex to use his.

StefanHeimberg
5

You need to use non-greedy matching:

start.*?end

In your case, the pattern is:

!style_delete\\s\\[(.*?)\\] (even simpler to understand than the first version :))

Proof (Java 7):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String string = "someother text !style_delete [company code : 43ev4] between text !style_delete [organiztion : 0asj9] end of line text";
Pattern pattern = Pattern.compile("!style_delete\\s\\[(.*?)\\]");
Matcher matcher = pattern.matcher(string);
// each find() advances to the next non-overlapping match
while (matcher.find()) {
    System.out.println(matcher.group());
}

Link to proof : http://ideone.com/Qtymb3
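Since the pattern above wraps `.*?` in parentheses, the text between the brackets should also be available as capture group 1. Here is a minimal sketch of that; the class name `BracketContentDemo` and the helper `extractContents` are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketContentDemo {
    // Same non-greedy pattern as above; group 1 is the text between '[' and the first ']'.
    private static final Pattern TAG = Pattern.compile("!style_delete\\s\\[(.*?)\\]");

    // Hypothetical helper: returns only the bracket contents of each match.
    static List<String> extractContents(String text) {
        List<String> contents = new ArrayList<>();
        Matcher matcher = TAG.matcher(text);
        while (matcher.find()) {
            contents.add(matcher.group(1));
        }
        return contents;
    }

    public static void main(String[] args) {
        String text = "someother text !style_delete [company code : 43ev4] between text "
                + "!style_delete [organiztion : 0asj9] end of line text";
        for (String content : extractContents(text)) {
            System.out.println(content);
        }
    }
}
```

Use `matcher.group()` (or `group(0)`) when you want the whole `!style_delete [...]` token, and `group(1)` when you only need what is inside the brackets.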

Guillaume
3

It's because .* is greedy. Use this instead:

"!style_delete\\s*\\[[^\\]]*\\]"

It means: match everything inside the brackets except a closing ].

Or make the content between [] non-greedy:

"!style_delete\\s*\\[.*?\\]"
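To see the difference side by side, here is a small sketch that runs the greedy pattern from the question and the two alternatives above against the question's sample text. The class name `GreedyVsLazyDemo` and the helper `findAll` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazyDemo {
    // Hypothetical helper: collects every non-overlapping match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "someother text !style_delete [company code : 43ev4] between text "
                + "!style_delete [organiztion : 0asj9] end of line text";
        String[] regexes = {
            "!style_delete\\s*\\[.*\\]",      // greedy: runs on to the last ']'
            "!style_delete\\s*\\[[^\\]]*\\]", // negated class: cannot cross a ']'
            "!style_delete\\s*\\[.*?\\]"      // non-greedy: stops at the first ']'
        };
        for (String regex : regexes) {
            List<String> matches = findAll(regex, input);
            System.out.println(regex + " -> " + matches.size() + " match(es): " + matches);
        }
    }
}
```

One practical difference between the two fixes: by default `.` does not match line terminators, so the `.*?` version will not match a bracket whose content spans several lines unless you compile with `Pattern.DOTALL`, while `[^\\]]*` matches newlines too.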
Trinimon