I'm trying to parse log entries in a C# app using this regex: (^[0-9]{4}(-[0-9]{2}){2}([^|]+\|){3})(?!\1)
for logs in a format like [date (in some format)] | [level] | [appname] | [message].
Where (I think):
^ matches the beginning of a line (with /gm enabled on regex101)
[0-9]{4}(-[0-9]{2}){2} followed by the beginning of the date, like 2015-03-03
([^|]+\|){3}) followed by the rest of the date, the log level and the app name
(?!\1) followed by not the start of a new log entry (this should be the message)
For example, I have the following 4 log entries (separated by a newline for clarification):
2015-03-03 19:30:47.2725|INFO|MyApp|This is a single line log message.

2015-03-03 19:31:29.1209|INFO|MyApp|This log message has multiple lines
with 2015-03-03 a date in it.

2015-03-03 19:32:50.1106|INFO|MyApp|This log message has multiple lines
but just text only.

2015-03-03 19:33:20.2683|ERROR|MyApp|This log message has multiple lines
but also some confusing text like
2015-03-03 19:33:20.2683|ERROR| which should still be a valid log message.
But the regex does not capture the message when I test it on regex101, probably because I don't understand how to capture the negative lookahead.
If I include .* in the regex:
(^[0-9]{4}(-[0-9]{2}){2}([^|]+\|){3}).*(?!\1)
it matches the message, but only a single line (because . does not match a newline).
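Here is a minimal C# sketch that reproduces both attempts. It assumes RegexOptions.Multiline is the equivalent of the /gm flags I used on regex101, and it uses a shortened sample with only the first two entries. The names LogRegexRepro, Sample, FirstAttempt, SecondAttempt and MatchValues are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LogRegexRepro
{
    // Shortened sample (assumption): one single-line entry and one two-line entry.
    public const string Sample =
        "2015-03-03 19:30:47.2725|INFO|MyApp|This is a single line log message.\n" +
        "2015-03-03 19:31:29.1209|INFO|MyApp|This log message has multiple lines\n" +
        "with 2015-03-03 a date in it.\n";

    // The two patterns from the question, unchanged.
    public const string FirstAttempt = @"(^[0-9]{4}(-[0-9]{2}){2}([^|]+\|){3})(?!\1)";
    public const string SecondAttempt = @"(^[0-9]{4}(-[0-9]{2}){2}([^|]+\|){3}).*(?!\1)";

    // Returns the full text of every match.
    // RegexOptions.Multiline makes ^ match at the start of each line.
    public static List<string> MatchValues(string pattern, string input)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.Multiline))
        {
            values.Add(m.Value);
        }
        return values;
    }

    static void Main()
    {
        foreach (string v in MatchValues(FirstAttempt, Sample))
            Console.WriteLine("first:  " + v);

        foreach (string v in MatchValues(SecondAttempt, Sample))
            Console.WriteLine("second: " + v);
    }
}
```

On this sample the first pattern returns only the two headers (date, level and app name), and the second returns the first line of each entry without the continuation line.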
So how can I capture the (multiline) message?