
I want to print the complete line from my log file to the user for every line containing WARN or ERROR (case insensitive).

Given this:

[01-17|18:53:38.179] INFO server/server.go:381 this would be skipped
[01-17|18:53:38.280] INFO server/server.go:620 this also
[01-17|18:53:41.180] WARN server/server.go:388 Something is warned, so show this
[01-17|18:53:41.394] WARN server/server.go:188 Something reported an ->error<-
[01-17|18:53:41.395] ERROR server/server.go:191 Blabla
[01-17|18:53:41.395] DEBUG server/server.go:196 Obviously skipped
[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this
[01-17|18:53:41.395] WARN server/server.go:198 You get the idea

I want:

[01-17|18:53:41.180] WARN server/server.go:388 Something is warned, so show this
[01-17|18:53:41.394] WARN server/server.go:188 Something reported an ->error<-
[01-17|18:53:41.395] ERROR server/server.go:191 Blabla
[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this
[01-17|18:53:41.395] WARN server/server.go:198 You get the idea

I naively started with

errorRegEx := regexp.MustCompile(`(?is)error|warn`)

Which would just print (from a different run, might not exactly match the above example)

WARN
error

Then I thought I'd change this to match a bit more:

errorRegEx := regexp.MustCompile(`(?is).*error.*|.*warn.*`)

But this didn't print anything at all

How can I get the complete line, and all lines, where either WARN or ERROR (case insensitive) would match?

PS: This is NOT the same question as the suggested "Regex match line containing string", because this one is specifically about Go, which does not appear to use the same standard regex engine.

Zach Young
transient_loop

2 Answers

1

This answer takes into account that the question has since been marked a duplicate, and OP's comment below.

The linked duplicate has a number of answers that can be pieced together into an answer to OP's question, but not completely, because those answers seem tied to PCRE and Go uses RE2.

var logs = `
[01-17|18:53:38.179] INFO server/server.go:381 this would be skipped
[01-17|18:53:38.280] INFO server/server.go:620 this also
[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this
[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-
[01-17|18:53:41.395] Error server/server.go:191 Blabla
[01-17|18:53:41.395] DEBUG server/server.go:196 Obviously skipped
[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this
[01-17|18:53:41.395] WARN server/server.go:198 You get the idea
`

func init() {
    logs = strings.TrimSpace(logs)
}

First off, I don't understand why this didn't print anything for OP:

Then I thought I'd change this to match a bit more:

errorRegEx := regexp.MustCompile(`(?is).*error.*|.*warn.*`)

But this didn't print anything at all

because that should have printed everything:

fmt.Println("Original regexp:")
reOriginal := regexp.MustCompile(`(?is).*error.*|.*warn.*`)
lines := reOriginal.FindAllString(logs, -1)

fmt.Println("match\t\tentry")
fmt.Println("=====\t\t=====")
for i, line := range lines {
    fmt.Printf("%d\t\t%q\n", i+1, line)
}

Original regexp:
match           entry
=====           =====
1               "[01-17|18:53:38.179] INFO server/server.go:381 this would be skipped\n[01-17|18:53:38.280] INFO server/server.go:620 this also\n[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this\n[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-\n[01-17|18:53:41.395] Error server/server.go:191 Blabla\n[01-17|18:53:41.395] DEBUG server/server.go:196 Obviously skipped\n[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this\n[01-17|18:53:41.395] WARN server/server.go:198 You get the idea"

The s flag in (?is)... lets the dot (.) match a newline^1, and because your stars (*) are greedy^2, a single match expands to cover the entire string as soon as either "error" or "warn" is found anywhere in it.
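To isolate the effect of the s flag, here is a minimal sketch using a made-up two-line input (the sample text and the findAll helper are assumptions for illustration, not part of OP's program). It runs the same pattern with and without s:

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll is a hypothetical helper: it compiles pattern and returns every
// non-overlapping match in text.
func findAll(pattern, text string) []string {
	return regexp.MustCompile(pattern).FindAllString(text, -1)
}

func main() {
	// Made-up sample: only the second line contains "warn".
	text := "INFO all good\nWARN disk almost full"

	// With s, the dot also matches "\n", so the greedy stars swallow both lines.
	fmt.Printf("%q\n", findAll(`(?is).*warn.*`, text))

	// Without s, the dot stops at "\n", so the match stays within one line.
	fmt.Printf("%q\n", findAll(`(?i).*warn.*`, text))
}
```

The first call should return one match holding the whole input, the second one match holding only the WARN line.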

The real solution is simply to stop the dot from matching "\n": get rid of the s flag and you get what you were aiming for:

fmt.Println("Whole text:")
reWholeText := regexp.MustCompile(`(?i).*error.*|.*warn.*`)
lines = reWholeText.FindAllString(logs, -1)

fmt.Println("match\t\tentry")
fmt.Println("=====\t\t=====")
for i, line := range lines {
    fmt.Printf("%d\t\t%q\n", i+1, line)
}

Whole text:
match           entry
=====           =====
1               "[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this"
2               "[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-"
3               "[01-17|18:53:41.395] Error server/server.go:191 Blabla"
4               "[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this"
5               "[01-17|18:53:41.395] WARN server/server.go:198 You get the idea"

Now we're matching between instances of "\n" (effectively lines), and because we're using the All form, which only finds non-overlapping matches:

If 'All' is present, the routine matches successive non-overlapping matches of the entire expression.^3

we get complete and distinct lines.

You could tighten that regexp up a bit:

`(?i).*(?:error|warn).*` // anything before either "error" or "warn", and anything after (within a line)

(?:...) is a non-capturing group^1, which fits here because you don't appear to care about the individual instances of "error" or "warn" in each match.
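As a rough, self-contained sketch of the tightened pattern (the matchingLines helper and the shortened log entries are made up for illustration), this also checks that the non-capturing group adds no submatches:

```go
package main

import (
	"fmt"
	"regexp"
)

// reLine matches a whole line that contains "error" or "warn" in any case.
var reLine = regexp.MustCompile(`(?i).*(?:error|warn).*`)

// matchingLines is a hypothetical helper returning each matching line of text.
func matchingLines(text string) []string {
	return reLine.FindAllString(text, -1)
}

func main() {
	// Shortened, made-up log entries in the style of the question.
	text := "INFO this would be skipped\n" +
		"WARN show this\n" +
		"DEBUG contains an ->error<- so match this\n" +
		"DEBUG obviously skipped"

	for _, line := range matchingLines(text) {
		fmt.Printf("%q\n", line)
	}

	// NumSubexp counts capturing groups; (?:...) does not add one.
	fmt.Println("capturing groups:", reLine.NumSubexp())
}
```

If you later need to know which keyword matched, swapping (?:...) for a capturing (...) and using FindAllStringSubmatch would be one way to get it.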

And, I still want to show that splitting by line before trying to match gives you more control/precision, and makes the regexp very easy to reason about:

r := strings.NewReader(logs)
scanner := bufio.NewScanner(r)

fmt.Println("Line-by-line:")
reLine := regexp.MustCompile(`(?i)error|warn`)

fmt.Println("match\tline\tentry")
fmt.Println("=====\t====\t=====")

var matchNo, lineNo, match = 1, 1, ""
for scanner.Scan() {
    line := scanner.Text()
    match = reLine.FindString(line)
    if match != "" {
        fmt.Printf("%d\t%d\t%q\n", matchNo, lineNo, line)
        matchNo++
    }
    lineNo++
}

Line-by-line:
match   line    entry
=====   ====    =====
1       3       "[01-17|18:53:41.180] Warn server/server.go:388 Something is warned, so show this"
2       4       "[01-17|18:53:41.394] warn server/server.go:188 Something reported an ->error<-"
3       5       "[01-17|18:53:41.395] Error server/server.go:191 Blabla"
4       7       "[01-17|18:53:41.395] DEBUG server/server.go:196 This debug contains an ->error<- so match this"
5       8       "[01-17|18:53:41.395] WARN server/server.go:198 You get the idea"

All three examples are in this Playground.
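If the logs come from a file or a pipe rather than a string, the line-by-line approach can be wrapped in a small function. This is only a sketch: filterLines and filterText are hypothetical names, and it uses MatchString instead of FindString because only a yes/no answer is needed per line.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"regexp"
	"strings"
)

// reLevel reports whether a line mentions "error" or "warn" in any case.
var reLevel = regexp.MustCompile(`(?i)error|warn`)

// filterLines is a hypothetical helper: it reads r line by line and returns
// the lines that match reLevel. bufio.Scanner rejects lines longer than
// 64 KiB unless its buffer is enlarged with scanner.Buffer.
func filterLines(r io.Reader) ([]string, error) {
	var out []string
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		if line := scanner.Text(); reLevel.MatchString(line) {
			out = append(out, line)
		}
	}
	return out, scanner.Err()
}

// filterText is a convenience wrapper for in-memory text.
func filterText(text string) ([]string, error) {
	return filterLines(strings.NewReader(text))
}

func main() {
	// Made-up entries; with a real log you would pass an *os.File instead.
	lines, err := filterText("INFO skipped\nWARN shown\nDEBUG an ->error<- shown")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```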

Zach Young
  • Actually your solution seems to largely work! I edited it and have it finally like this: ``` regexp.MustCompile(`(?im)(^.*error.*|.*warn.*)`) ```. This seems to print the whole line, at least for me. If you'd like to change your answer to have it the way it just worked for me, I will accept that answer (if you care). Unless I am missing something obvious/bad with my solution? – transient_loop Jan 18 '23 at 15:58
  • @transient_loop, thank you for pointing that out. I've heavily modified my answer, which takes your discovery into account. Cheers! – Zach Young Jan 18 '23 at 20:57
  • Thanks for your elaborate modification! I don't know why in the first example "it should have printed everything", I got nothing. Maybe some contextual issue (run log was different? No idea). – transient_loop Jan 19 '23 at 13:57
-2

Look for the ERROR and WARN tokens after the first space on the line:

 errorRegEx := regexp.MustCompile(`^[^ ]* (?:ERROR|WARN) .*`)
  • That doesn't work for a block of lines like OP is asking for. It also doesn't work for individual lines, for two reasons: 1) it's case sensitive; 2) it requires a space before and after the tokens, and OP's example logs have tokens not surrounded by spaces. This log line exemplifies both problems: `DEBUG server/server.go:196 This debug contains an ->error<- so match this`. Please update your answer, or consider removing it. – Zach Young Jan 18 '23 at 21:12