
I'm writing regexes to detect events in log files. I want to detect whether the phrase "restart-required" appears in the logs, but the tricky part is that I want to ignore all the debug messages. Unfortunately, the logs aren't delimited in any form, and just run together.

The good thing is, all my debug messages begin with 'Debug:' and end with 'endmsg'.

What I've been able to put together so far is a regex to capture all my debug phrases.

/Debug:\s(.+?(?=endmsg))/gm

What I can't figure out from here is how to go about extending this to search for the phrase 'restart-required' but ignore it if it's in one of these captured debug messages.

A regex101 of what I'm working with - https://regex101.com/r/zI1kM2/3

I'm not looking to capture phrases or anything around it, but just a boolean True/False to answer the question "Does the phrase 'restart-required' occur somewhere in the logs outside of debug messages?"

Thanks!

Anmol Singh Jaggi
Rybon
  • Possible duplicate of [Regex Pattern to Match, Excluding when... / Except between](http://stackoverflow.com/questions/23589174/regex-pattern-to-match-excluding-when-except-between) – 4castle May 10 '16 at 20:52

3 Answers


One regex you could use is:

Debug.*?endmsg|(restart-required)

At each position, the regex first tries the left alternative, which consumes a whole debug message without capturing anything. Only if that fails does it try the right alternative, which captures restart-required in group 1. When you process the matches, check whether any of them has a first capture group. If one does, you can return true.

Regex101 Example - matches are highlighted in green

For more information on this, read The Best Regex Trick, from Rexegg.
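
The capture-group check described above could be sketched in Python roughly as follows. This is only an illustration, not part of the original answer: the function name `restart_required_outside_debug` and the two sample log strings are made up, and the `re.DOTALL` flag is an added assumption so that a debug message may span line breaks.

```python
import re

# Left alternative consumes whole debug messages without capturing;
# right alternative captures the phrase only when it appears outside them.
PATTERN = re.compile(r"Debug.*?endmsg|(restart-required)", re.DOTALL)


def restart_required_outside_debug(log: str) -> bool:
    """Return True if 'restart-required' occurs outside any Debug...endmsg span."""
    # group(1) is None for every match that was a debug message.
    return any(m.group(1) is not None for m in PATTERN.finditer(log))


# Hypothetical run-together log samples (not taken from the question).
inside_only = "startup ok Debug: restart-required was checked endmsg shutdown ok"
outside_too = "Debug: nothing here endmsg service restart-required Debug: x endmsg"

print(restart_required_outside_debug(inside_only))
print(restart_required_outside_debug(outside_too))
```

Python's built-in `re` module supports this alternation trick, so no PCRE-specific features are needed for this variant.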


EDIT: Looking at this answer, I also came across a way in which PCRE has something like this already built in. It is with (*SKIP) and (*F) (more information here). The modified regex would be:

Debug.*?endmsg(*SKIP)(*F)|restart-required

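Here (*SKIP)(*F) forces each matched debug message to fail and stops the engine from retrying inside the skipped text, so only an occurrence of restart-required outside a debug message can match. This version does not require inspecting any capture groups: if the regex matches anywhere, return true.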

Regex101 Example

Community
4castle
  • @AnmolSinghJaggi Yes, I highly recommend reading that article. This trick is also the accepted answer for [this famous question](http://stackoverflow.com/questions/23589174/regex-pattern-to-match-excluding-when-except-between). – 4castle May 10 '16 at 20:55
  • I've made an update to my answer that makes the regex even easier to use. It uses a feature specific to PHP and PCRE. – 4castle May 10 '16 at 22:04
  • The (*SKIP)(*F) is brilliant! I've never seen those before but those do the trick perfectly. Thank you! – Rybon May 11 '16 at 12:40

Instead of writing a single regex for the whole task, you can remove all the debug messages from the log and then search (with or without regex) for the string 'restart-required' in what remains.

For removing the debug messages, replace the matches of the regex Debug:.*?endmsg with an empty string ''.
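
A minimal Python sketch of this two-step approach, assuming the whole log is available as one string. The function name `has_restart_required` and the sample strings are hypothetical, and `re.DOTALL` is an added assumption so that debug messages containing line breaks are also removed.

```python
import re

# Non-greedy match from 'Debug:' up to the nearest following 'endmsg'.
DEBUG_MESSAGE = re.compile(r"Debug:.*?endmsg", re.DOTALL)


def has_restart_required(log: str) -> bool:
    """Strip every debug message, then do a plain substring test."""
    stripped = DEBUG_MESSAGE.sub("", log)
    return "restart-required" in stripped


# Hypothetical run-together log samples (not taken from the question).
only_in_debug = "boot Debug: restart-required seen endmsg idle"
also_outside = "boot Debug: noise endmsg restart-required idle"

print(has_restart_required(only_in_debug))
print(has_restart_required(also_outside))
```

One design point to consider: replacing with '' joins the text on either side of a removed message, which could in principle create a new occurrence of the phrase. Replacing with a single space instead avoids that, if it matters for your logs.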

Anmol Singh Jaggi

Give this a try:

/(?=(^Debug:\s(.+?(?=endmsg))$))|(^.*restart-required.*$)/gm

As you stated already, the first group uses a positive lookahead to match debug messages, so it is not included in the result; the second group selects the remaining lines that contain restart-required.

I usually use the BRE and ERE flavors available with shell commands, so the PCRE regex above should still be cleaned up and tested.

There are some online PCRE consoles on the web to play with, e.g. Online Regex Tester; on the page, use the select box to switch to PCRE. This is very useful for testing a PCRE regex against a log file sample.

The tester above was used with these lines:

test line 1
Debug: blablabla with endmsg
test line 2
two words restart-required
Debug: one two three with endmsg
Jay jargot
  • I can't get this to work at all. Can you give a working regex? – 4castle May 10 '16 at 20:43
  • @4castle I tested it with the PCRE regex tester website mentioned in the answer. Did you test the current version? I modified it 3 times during the first few minutes. I need more time to use a PCRE library and test something closer to a real situation. The answer has been updated with the log file excerpt I used. – Jay jargot May 10 '16 at 21:04
  • I think you might have missed the part of the question that said "the logs aren't delimited in any form, and just run together". It's a good answer, but there aren't any line breaks in the log. Try using it with the regex101 that they supplied in the question. – 4castle May 10 '16 at 21:33
  • true, I missed that. I should say that starting with PCRE is difficult. I need more time then. – Jay jargot May 11 '16 at 06:26