I am trying to extract all the log statements from a few Java files using Notepad++ (grep all log lines to a new file). I created the following regex, and it appears to work in Notepad++: when I run a Find on the file, it finds and highlights the full multi-line example as well as the one-line examples.

(?s)^\s*LOGGER.(warn|info|error|debug|exception|trace)\(.*?\);$
or
(?ms)^\s*LOGGER.(warn|info|error|debug|exception|trace)\(.*?\);$

Multi-Line example:

                logger.info(
                        "some info [{}], ID1[{}], ID2[{}], ID3[{}] some info",
                        ID.getValue(),
                        ID.getKey(), instance.getId3(), instance.getId4());

Single-line example:

        logger.info("Some Info [{}] is now some info", Id);

The issue is that when I do a "Find All" in Notepad++, or run the regex in bash, the output does not include the full multi-line statement.

Notepad++ finds and displays the following results:

 logger.info(
 logger.info("Some Info [{}] is now some info", Id);

In bash, grep does not capture the multi-line statement at all, only the single-line one.

grep -i -nr -P "(?s)^\s*LOGGER.(warn|info|error|debug|exception|trace)\(.*?\);"

The grep command above prints only one line; it does not print even the first line of the multi-line statement:

logger.info("Some Info [{}] is now some info", Id);
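
A likely explanation, sketched here in Python's `re` module rather than with grep itself: grep tests the pattern against one line at a time, so the newline never reaches the regex engine and `(?s)` has nothing to act on. The sample text and the helper names below are made up for illustration, and the pattern is the one from the question with the dot escaped.

```python
import re

# Pattern from the question, with the dot escaped and the flags inlined.
LOG_CALL = re.compile(
    r"(?si)logger\.(?:warn|info|error|debug|exception|trace)\(.*?\);"
)

# Hypothetical sample: one multi-line call followed by one single-line call.
SAMPLE = (
    "        logger.info(\n"
    '                "some info [{}], ID1[{}]",\n'
    "                ID.getValue(),\n"
    "                ID.getKey(), instance.getId3());\n"
    '        logger.info("Some Info [{}] is now some info", Id);\n'
)


def match_per_line(text):
    """Apply the pattern to each line separately, as a line-oriented tool would."""
    hits = []
    for line in text.splitlines():
        found = LOG_CALL.search(line)
        if found:
            hits.append(found.group(0))
    return hits


def match_whole_text(text):
    """Apply the pattern to the whole text, so that '.' can cross newlines."""
    return [m.group(0) for m in LOG_CALL.finditer(text)]


if __name__ == "__main__":
    print(len(match_per_line(SAMPLE)), "match(es) when matching line by line")
    print(len(match_whole_text(SAMPLE)), "match(es) when matching the whole text")
```

Matching per line finds only the single-line call, because the first line of the multi-line call contains no `);`. Matching the whole text finds both.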

Update 1:

From the comments, it seems that using pcregrep works for multi-line statements.

pcregrep -n -i -M 'LOGGER.(warn|info|error|debug|exception|trace)\(.*(\n|.)*?\);' Text.java

However, when a log statement spans more than two lines (three or four, for example), it is not captured.

A fix for that seems to be adding another \n as an alternative in the group, but that introduces an issue with single-line log statements.

pcregrep -n -i -M 'LOGGER.(warn|info|error|debug|exception|trace)\(.*(\n|\n.*\n|.)*?\);' Text.java

When a log statement is on a single line, the match also captures the following lines, up to the next line that ends with );

90:        logger.info("starting sync [{}], correlation-id [{}]", syncId, correlationId);
    // some comments
    value.test(() -> Async(value, test4, test2, test1));
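
One possible reading of this over-capture, illustrated with Python's `re` module (an assumption: pcregrep uses PCRE, which resolves these constructs in the same backtracking order). The greedy `.*` after `\(` first swallows the rest of the line, including its own `);`. The lazy group that follows is then extended across later lines before `.*` ever gives a character back, so the match ends at the next `);` further down. The constant names are hypothetical.

```python
import re

# The single-line statement from the output above, with the two lines after it.
SAMPLE = (
    '    logger.info("starting sync [{}], correlation-id [{}]", syncId, correlationId);\n'
    "    // some comments\n"
    "    value.test(() -> Async(value, test4, test2, test1));\n"
)

LEVELS = r"(?:warn|info|error|debug|exception|trace)"

# Shape of the Update 1 pattern: a greedy '.*' followed by a lazy group.
GREEDY_FIRST = re.compile(r"logger\." + LEVELS + r"\(.*(\n|.)*?\);", re.IGNORECASE)

# Same idea without the greedy '.*': only the lazy group remains.
LAZY_ONLY = re.compile(r"logger\." + LEVELS + r"\((?:\n|.)*?\);", re.IGNORECASE)

greedy_match = GREEDY_FIRST.search(SAMPLE).group(0)
lazy_match = LAZY_ONLY.search(SAMPLE).group(0)

if __name__ == "__main__":
    print("greedy-first pattern spans", greedy_match.count("\n") + 1, "line(s)")
    print("lazy-only pattern spans", lazy_match.count("\n") + 1, "line(s)")
```

Dropping the greedy `.*` is enough to make the match stop at the first `);`.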

Update 2: I now have it working. Thanks, everyone.

pcregrep -n -i -M 'logger.(warn|info|error|debug|exception|trace)\((\n*.*?\n*)*?\);' Text.txt
rcmpayne
  • I think the find function in notepad only displays the first line found for every result. You might test it when using the replace function that all lines are replaced. For grep see https://stackoverflow.com/questions/2686147/how-to-find-patterns-across-multiple-lines-using-grep – The fourth bird Jan 06 '22 at 13:00
  • Did you try `(?si)^\h*LOGGER\.(?:warn|info|error|debug|exception|trace)\(.*?\);\h*$`? – Wiktor Stribiżew Jan 06 '22 at 15:50
  • Testing pcregrep works for multi-line statements, but if I have a single line like `logger.info("Some Info [{}] is now some info", Id);` it will always grab everything up to the next line that ends with `);`. Example syntax: `pcregrep -n -i -M 'LOGGER.(warn|info|error|debug|exception|trace)\(.*(\n|.)*?\);' Text.java` – rcmpayne Jan 06 '22 at 16:05
  • It appears I have it working now: `pcregrep -n -i -M 'logger.(warn|info|error|debug|exception|trace)\((\n*.*?\n*)*?\);' DirectorySyncService.java` – rcmpayne Jan 06 '22 at 16:58
  • `(\n*.*?\n*)*?` is too convoluted. Use `(?s:.)*?`. And escape the dot; an unescaped `.` matches any char. – Wiktor Stribiżew Jan 06 '22 at 17:32

1 Answer

You can use

pcregrep -n -i -M '(?s)logger\.(warn|info|error|debug|exception|trace)\(.*?\);' DirectorySyncService.java

Details:

  • -n - precedes each output line by its line number in the file, followed by a colon for matching lines or a hyphen for context lines. If the filename is also being output, it precedes the line number.
  • -i - ignores upper/lower case distinctions during comparisons
  • -M - allows patterns to match more than one line
  • (?s)logger\.(warn|info|error|debug|exception|trace)\(.*?\); matches
    • (?s) - a singleline modifier that allows . to match line break chars
    • logger\. - logger. string
    • (warn|info|error|debug|exception|trace) - Group 1: one of the words inside the group
    • \( - a ( char
    • .*? - zero or more chars as few as possible
    • \); - a ); string.
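
As a rough cross-check of the pattern described above, here is a minimal Python sketch that applies it to a whole file's text and reports the starting line number of each match, loosely mirroring what `-n -M` prints. The function name `find_log_statements` and the sample source are assumptions made for illustration; they are not part of pcregrep.

```python
import re

# Same pattern as in the answer, with the -i flag folded into the inline flags.
LOG_CALL = re.compile(
    r"(?si)logger\.(warn|info|error|debug|exception|trace)\(.*?\);"
)


def find_log_statements(source):
    """Return (start_line, level, statement) for every log call in source."""
    results = []
    for match in LOG_CALL.finditer(source):
        # 1-based line number of the first line of the match.
        start_line = source.count("\n", 0, match.start()) + 1
        results.append((start_line, match.group(1).lower(), match.group(0)))
    return results


# Hypothetical Java-like input with a single-line and a multi-line call.
SAMPLE = (
    "class Demo {\n"
    "    void run() {\n"
    '        logger.info("start");\n'
    "        LOGGER.warn(\n"
    '                "id [{}]",\n'
    "                id.getValue());\n"
    "    }\n"
    "}\n"
)

if __name__ == "__main__":
    for start_line, level, statement in find_log_statements(SAMPLE):
        print(f"{start_line}: [{level}] {statement}")
```

Because `.*?` is lazy, each match stops at the first `);` it meets, so a `);` inside a string literal or a nested lambda would cut a statement short.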

In Notepad++, you could use

(?si)logger\.(warn|info|error|debug|exception|trace)\(.*?\);
Wiktor Stribiżew