
I'm looking to parse the following string into a map[string]string using a regular expression:

time="2017-05-30T19:02:08-05:00" level=info msg="some log message" app=sample size=10

I'm trying to create a map that would have

m["time"] = "2017-05-30T19:02:08-05:00"
m["level"] = "info"

and so on for msg, app, and size.

I have tried using regexp.FindAllStringIndex but can't quite come up with an appropriate regex. Is this the correct way to go?

Xeaz
  • Use a real parser instead. – Jan May 31 '17 at 06:28
  • Possible duplicate of [RegEx match open tags except XHTML self-contained tags](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Jan May 31 '17 at 06:31

3 Answers


This does not use a regex; it is an example of how to achieve the same result with strings.FieldsFunc, splitting on spaces while leaving quoted sections intact.

https://play.golang.org/p/rr6U8xTJZT

package main

import (
    "fmt"
    "strings"
    "unicode"
)

const foo = `time="2017-05-30T19:02:08-05:00" level=info msg="some log message" app=sample size=10`

func main() {
    lastQuote := rune(0)
    f := func(c rune) bool {
        switch {
        case c == lastQuote:
            lastQuote = rune(0)
            return false
        case lastQuote != rune(0):
            return false
        case unicode.In(c, unicode.Quotation_Mark):
            lastQuote = c
            return false
        default:
            return unicode.IsSpace(c)

        }
    }

    // splitting string by space but considering quoted section
    items := strings.FieldsFunc(foo, f)

    // create and fill the map
    m := make(map[string]string)
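    for _, item := range items {
        // split on the first "=" only, so values may themselves contain "="
        x := strings.SplitN(item, "=", 2)
        if len(x) != 2 {
            continue // skip tokens that are not in key=value form
        }
        // drop the surrounding double quotes from quoted values
        m[x[0]] = strings.Trim(x[1], `"`)
    }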

    // print the map
    for k, v := range m {
        fmt.Printf("%s: %s\n", k, v)
    }
}
nbari
  • This is what I ended up using but @kennytm 's solution is just as good provided you're fine with taking an additional dependency. – Xeaz Jun 01 '17 at 01:12

Instead of writing your own regex, you could use the github.com/kr/logfmt package. From its documentation:

Package implements the decoding of logfmt key-value pairs.

Example logfmt message:

foo=bar a=14 baz="hello kitty" cool%story=bro f %^asdf

Example result in JSON:

{ 
    "foo": "bar", 
    "a": 14, 
    "baz": "hello kitty", 
    "cool%story": "bro", 
    "f": true, 
    "%^asdf": true 
}
kennytm

Use named capturing groups in your regular expression and the FindStringSubmatch and SubexpNames functions. E.g.:

s := `time="2017-05-30T19:02:08-05:00" level=info msg="some log message" app=sample size=10`
re := regexp.MustCompile(`time="(?P<time>.*?)"\slevel=(?P<level>.*?)\s`)
values := re.FindStringSubmatch(s)
keys := re.SubexpNames()

// create map
d := make(map[string]string)
for i := 1; i < len(keys); i++ {
    d[keys[i]] = values[i]
}
fmt.Println(d)
// OUTPUT: map[level:info time:2017-05-30T19:02:08-05:00]
// (fmt prints map keys in sorted order since Go 1.12)

values is a slice containing all submatches. Its first element is the text matched by the whole expression, followed by one element per capturing group. keys has the same layout: keys[0] is always the empty string, and keys[i] is the name of the i-th group.
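To illustrate how the indexes line up, here is a minimal sketch that pulls out a single named group. It assumes Go 1.15 or later for SubexpIndex; levelRe and levelOf are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// levelRe captures the unquoted value of the "level" key.
var levelRe = regexp.MustCompile(`level=(?P<level>\S+)`)

// levelOf is a hypothetical helper: it returns the "level" group,
// or "" when the input does not match at all.
func levelOf(s string) string {
	values := levelRe.FindStringSubmatch(s)
	if values == nil {
		return "" // no match, so there are no submatches to index
	}
	// values[0] is the whole match ("level=info");
	// SubexpIndex maps the group name to its position in values.
	return values[levelRe.SubexpIndex("level")]
}

func main() {
	fmt.Println(levelOf(`time="x" level=info app=sample`)) // info
	fmt.Println(levelOf(`app=sample`) == "")               // true
}
```

The nil check matters: FindStringSubmatch returns nil when nothing matches, and indexing a nil slice panics.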

You can wrap the map-building code in a function if you need it more often (i.e. if you want something like Python's match.groupdict):

package main

import (
    "fmt"
    "regexp"
)

func groupmap(s string, r *regexp.Regexp) map[string]string {
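    values := r.FindStringSubmatch(s)
    if values == nil {
        return nil // no match, nothing to map
    }
    keys := r.SubexpNames()

    // create map, skipping index 0 (the whole match) and unnamed groups
    d := make(map[string]string)
    for i := 1; i < len(keys); i++ {
        if keys[i] != "" {
            d[keys[i]] = values[i]
        }
    }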

    return d
}

func main() {
    s := `time="2017-05-30T19:02:08-05:00" level=info msg="some log message" app=sample size=10`
    re := regexp.MustCompile(`time="(?P<time>.*?)"\slevel=(?P<level>.*?)\s`)

    fmt.Println(groupmap(s, re))
    // OUTPUT: map[level:info time:2017-05-30T19:02:08-05:00]
    // (fmt prints map keys in sorted order since Go 1.12)
}
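The pattern above names each key explicitly, so it only captures time and level. If the keys are not known in advance, one option is a generic key=value pattern combined with FindAllStringSubmatch. The following is only a sketch under stated assumptions: keys consist of word characters, and values are either double-quoted with no escaped quotes inside, or contain no whitespace. kvPattern and parseKV are hypothetical names, not part of any library.

```go
package main

import (
	"fmt"
	"regexp"
)

// kvPattern matches key=value where the value is either a double-quoted
// string (group 2) or a run of non-space characters (group 3).
// Assumption: quoted values contain no escaped quotes.
var kvPattern = regexp.MustCompile(`(\w+)=(?:"([^"]*)"|(\S*))`)

// parseKV is a hypothetical helper that collects every key=value pair
// found in s into a map.
func parseKV(s string) map[string]string {
	m := make(map[string]string)
	for _, match := range kvPattern.FindAllStringSubmatch(s, -1) {
		// Only one of the two value groups takes part in a match;
		// the other is "", so concatenating them yields the value.
		m[match[1]] = match[2] + match[3]
	}
	return m
}

func main() {
	s := `time="2017-05-30T19:02:08-05:00" level=info msg="some log message" app=sample size=10`
	m := parseKV(s)
	fmt.Println(m["time"]) // 2017-05-30T19:02:08-05:00
	fmt.Println(m["msg"])  // some log message
	fmt.Println(len(m))    // 5
}
```

Input that breaks these assumptions, such as escaped quotes inside a value, is better handled by a dedicated logfmt parser.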
tivio