
I'm trying to capture a group of lines that follow a specific term in a larger block of text (up to 100 to 130 lines).

Here is my code:

String inp = "Welcome!\n"
                +" Welcome to the Apache ActiveMQ Console of localhost (ID:InternetXXX022-45298-5447895412354475-2:9) \n"
                +"  You can find more information about Apache ActiveMQ on the Apache ActiveMQ Site \n"
                +" Broker\n"
                +" Name localhost\n"
                +" Version  5.13.3\n"
                +" ID   ID:InternetXXX022-45298-5447895412354475-2:9\n"
                +" Uptime   14 days 14 hours\n"
                +" Store percent used   19\n"
                +" Memory percent used  0\n"
                +" Temp percent used    0\n"
                + "Queue Views\n"
                + "Graph\n"
                + "Topic Views\n"
                + "  \n"
                + "Subscribers Views\n";
Pattern rgx = Pattern.compile("(?<=Broker)\n((?:.*\n){1,7})", Pattern.DOTALL);
Matcher mtch = rgx.matcher(inp);
if (mtch.find()) {
    String result = mtch.group();
    System.out.println(result);
}

From all the lines in inp above, I want to capture only the following lines:

Name    localhost\n
Version 5.13.3\n
ID  ID:InternetXXX022-45298-5447895412354475-2:9\n
Uptime  14 days 14 hours\n
Store percent used  19\n
Memory percent used 0\n
Temp percent used   0\n

But my code gives me all the lines after "Broker". What am I doing wrong?

Secondly, ?: marks a non-capturing group, so why is my regex ((?:.*\n)) still able to capture the lines after Broker?

Elena

1 Answer


Remove Pattern.DOTALL: it makes . match line breaks too, so the first .* grabs the whole rest of the text (backtracking only as far as the last \n), and the limiting quantifier {1,7} no longer has any effect.
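To make the effect of the flag concrete, here is a minimal sketch that runs the same pattern with and without Pattern.DOTALL. The class name, the helper firstBlock, the shortened input, and the {1,2} limit are assumptions made for illustration; they are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllDemo {
    // Shortened, hypothetical stand-in for the console text in the question.
    static final String INPUT =
            "Welcome!\n Broker\n Name localhost\n Version  5.13.3\nQueue Views\nGraph\n";

    // Returns group 1 of the first match, or null when nothing matches.
    static String firstBlock(String regex, int flags) {
        Matcher m = Pattern.compile(regex, flags).matcher(INPUT);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Same shape as the pattern in the question, limited to 2 lines here.
        String regex = "(?<=Broker)\n((?:.*\n){1,2})";

        // With DOTALL, '.' also matches '\n', so a single '.*' runs to the
        // last line break and everything after "Broker" is returned.
        System.out.println("With DOTALL:\n" + firstBlock(regex, Pattern.DOTALL));

        // Without DOTALL, each '.*\n' consumes exactly one line, so the
        // {1,2} limit is respected.
        System.out.println("Without DOTALL:\n" + firstBlock(regex, 0));
    }
}
```

With the flag set, the result runs through the final "Graph" line; without it, only the two lines after Broker are returned.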

Besides, your real data seems to contain CRLF line endings, so it is more convenient to match line breaks with \R rather than \n. Alternatively, you may use the Pattern.UNIX_LINES modifier (or its embedded flag equivalent, (?d), inside the pattern) and keep \n: then only \n (LF) is treated as a line terminator and . also matches carriage returns (CR). If a CR directly follows Broker, the \n right after the lookbehind must allow for it, e.g. \r?\n.
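The difference between the line-break options can be checked with a small sketch like the one below. The CRLF sample string, the class name, and the helper firstBlock are hypothetical and only serve the illustration; the {1,2} limit keeps the output short.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineBreakDemo {
    // Hypothetical sample with Windows-style CRLF line endings.
    static final String CRLF_INPUT =
            " Broker\r\n Name localhost\r\n Version  5.13.3\r\nQueue Views\r\n";

    // Returns group 1 of the first match, or null when nothing matches.
    static String firstBlock(String regex) {
        Matcher m = Pattern.compile(regex).matcher(CRLF_INPUT);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Plain \n: "Broker" is followed by \r, so nothing matches.
        System.out.println(firstBlock("(?<=Broker)\n((?:.*\n){1,2})"));

        // \R matches \r\n as a single line break.
        System.out.println(firstBlock("(?<=Broker)\\R((?:.*\\R){1,2})"));

        // (?d) = UNIX_LINES: '.' now matches \r, and \r?\n skips the CR
        // that directly follows "Broker".
        System.out.println(firstBlock("(?d)(?<=Broker)\r?\n((?:.*\n){1,2})"));
    }
}
```

On this input the first call yields null, while the other two return the same two lines, including their CRLF terminators.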

Also, I suggest trimming the result.

Use

Pattern rgx = Pattern.compile("(?<=Broker)\\R((?:.*\\R){1,7})");
// Or, with UNIX_LINES (\r? allows for a CR right after "Broker"):
// Pattern rgx = Pattern.compile("(?d)(?<=Broker)\r?\n((?:.*\n){1,7})");
Matcher mtch = rgx.matcher(inp);
if (mtch.find()) {
    String result = mtch.group();
    System.out.println(result.trim());
}
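On the second point in the question: group() without arguments returns the whole match (group 0), whether or not any part of the pattern is a capturing group. The outer parentheses around (?:.*\R){1,7} form capturing group 1, and the inner (?:...) adds no group of its own. A minimal sketch, assuming a shortened input and a hypothetical class name GroupDemo:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Shortened, hypothetical stand-in for the console text in the question.
    static final String INPUT =
            " Broker\n Name localhost\n Version  5.13.3\nQueue Views\n";

    // Returns a matcher positioned on the first match.
    static Matcher match() {
        Matcher m = Pattern.compile("(?<=Broker)\\R((?:.*\\R){1,2})").matcher(INPUT);
        if (!m.find()) {
            throw new IllegalStateException("no match");
        }
        return m;
    }

    public static void main(String[] args) {
        Matcher m = match();
        // Only the outer parentheses count; (?:...) does not create a group.
        System.out.println("groups: " + m.groupCount());
        // group() is group(0): the entire match, including the leading line break.
        System.out.println("[" + m.group() + "]");
        // group(1) holds just the lines, without the leading line break.
        System.out.println("[" + m.group(1) + "]");
    }
}
```

Using group(1) instead of group() drops the leading line break, so trim() is then only needed to remove the final one.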

See the Java demo online.

Wiktor Stribiżew