
This question differs from the usual "parse new lines in a string" questions because it uses a pattern built from three regular expression parts to parse a single log entry, which may or may not span multiple lines.

I have the following log entries and would like a single regular expression that parses both of them. Currently it only parses the first entry.

Here are the log entries

NOTE: the second log entry has a hard return ("\r\n") in it

[2018-05-25 08:23:54.6040][Manager.Calls.Manager][GetManagerID]
[2018-05-25 08:23:54.6040][Manager.Calls.Manager][Status as of 5/25/2018 8:23:54 AM
Expires 1/1/0001 12:00:00 AM]

Here is the regular expression that I am currently using:

(\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{4}\])\[(.*?)\]\[(.*?)\]

I think I need something that matches the "\r\n" line ending inside the last bracket.

Here is how I am calling Regex

match = Regex.Match(logEntryText, pattern, RegexOptions.Compiled | RegexOptions.Multiline);

More info:

I tried this for the last part of the regular expression:

\[(.*?)(\n|\r|\r\n)\]

but that fails too.

MrLister
  • What programming language or tool are you using? – melpomene May 25 '18 at 18:31
  • .net 4.6 and C# – MrLister May 25 '18 at 18:48
  • `RegexOptions.Multiline` does nothing there (it only changes the meaning of `^` and `$`, which you're not using). Try `RegexOptions.Singleline` instead (which changes the meaning of `.`). [Reference](https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regexoptions(v=vs.110).aspx). – melpomene May 25 '18 at 18:54
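
As a rough illustration of the comment above, here is a minimal sketch that runs the pattern from the question with `RegexOptions.Singleline`, so that `.` also matches `\n` and the last bracketed group can span the line break. The class and method names (`SinglelineDemo`, `Parse`) are made up for this example; the pattern and the sample entry are copied from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Pattern copied unchanged from the question.
    public const string Pattern =
        @"(\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{4}\])\[(.*?)\]\[(.*?)\]";

    // Hypothetical helper, not part of the original post.
    public static Match Parse(string logEntryText)
    {
        // Singleline lets '.' match '\n' as well, so the lazy (.*?) in the
        // last group can cross the "\r\n" inside the message.
        return Regex.Match(logEntryText, Pattern, RegexOptions.Singleline);
    }

    public static void Main()
    {
        // Second log entry from the question, with its embedded "\r\n".
        string entry =
            "[2018-05-25 08:23:54.6040][Manager.Calls.Manager]" +
            "[Status as of 5/25/2018 8:23:54 AM\r\nExpires 1/1/0001 12:00:00 AM]";

        Match match = Parse(entry);
        Console.WriteLine(match.Success);          // whether the entry parsed
        Console.WriteLine(match.Groups[2].Value);  // source, e.g. Manager.Calls.Manager
        Console.WriteLine(match.Groups[3].Value);  // message, still containing "\r\n"
    }
}
```

Because the groups are lazy (`.*?`), each one should still stop at the first closing `]`, so the captured message keeps its original line break and can be shown to the user unchanged.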

1 Answer

I think that you could benefit from some preprocessing for your data before performing the parsing operation. For example, check if the log line ends with ], and if it doesn't, append the next line to it; then process the resulting complete log entry.

Edit: I was assuming that you have just a list of lines (e.g. from a text file). If what you have instead is a list of log entries which sometimes include the hard return, then just remove it before processing: logEntry.Replace("\r\n","").
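
The preprocessing idea could look roughly like the sketch below. It assumes the input is a list of physical lines, that every complete entry ends with `]`, and that no line in the middle of a multi-line message happens to end with `]`. The names `LogEntryJoiner` and `JoinLines`, and the `separator` parameter, are hypothetical and only for illustration; pass an empty separator to get plain concatenation.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LogEntryJoiner
{
    // Hypothetical helper: merges physical lines into logical log entries.
    // Assumption: a line that does not end with ']' continues on the next line.
    public static List<string> JoinLines(IEnumerable<string> lines, string separator = " ")
    {
        var entries = new List<string>();
        var current = new StringBuilder();

        foreach (string line in lines)
        {
            // Ignore blank lines that appear between entries.
            if (line.Length == 0 && current.Length == 0) continue;

            current.Append(line);
            if (line.EndsWith("]", StringComparison.Ordinal))
            {
                // Entry is complete: store it and start a new one.
                entries.Add(current.ToString());
                current.Clear();
            }
            else
            {
                // Entry continues: replace the line break with the separator.
                current.Append(separator);
            }
        }

        // Keep a trailing entry even if it never closed with ']'.
        if (current.Length > 0) entries.Add(current.ToString());
        return entries;
    }

    public static void Main()
    {
        // The three physical lines from the question.
        var lines = new[]
        {
            "[2018-05-25 08:23:54.6040][Manager.Calls.Manager][GetManagerID]",
            "[2018-05-25 08:23:54.6040][Manager.Calls.Manager][Status as of 5/25/2018 8:23:54 AM",
            "Expires 1/1/0001 12:00:00 AM]"
        };

        foreach (string entry in JoinLines(lines))
        {
            Console.WriteLine(entry); // each joined entry is now a single line
        }
    }
}
```

Each joined entry no longer contains a line break, so the original pattern should match it without any extra options. The original lines are left untouched and can still be used for display.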

Konamiman
  • I don't believe that will solve the problem... the line feeds will still be in the line ... won't regex fail then? – MrLister May 25 '18 at 19:06
  • Well, the idea is to concatenate the two lines so that the extra line feed is no more. e.g. `[foo][bar][fizz\r\nbuzz]` becomes `[foo][bar][fizzbuzz]` – Konamiman May 25 '18 at 19:09
  • Then wouldn't I have to remember which entry had the line break (and where it was in the entry) in order to display it correctly back to the user after the regex is run? – MrLister May 25 '18 at 19:26
  • Nothing prevents you from keeping the original log entry. Use the concatenated one just to parse the data and then discard it. Are you parsing a text file/collection of lines, or a collection of log entries, to start with? – Konamiman May 25 '18 at 19:37