0

I've noticed that when I time JSON unmarshalling in the context of a Go HTTP server, it takes 30,000+ nanoseconds even for small objects. This seemed large to me so I ran some isolated benchmarks, which surprisingly showed an average time of ~500 nanos per unmarshal. To dig into this more, I wrote a program that just does a series of unmarshals on the same small object, which showed that the first unmarshal is slow, and subsequent ones are much faster:

package main

import (
    "time"
    "fmt"
    "encoding/json"
)

var b []byte

type Dog struct {
    Age int `json:"Age"`
}

func unmarshalDog() {
    dogCopy := Dog{}

    start := time.Now().UnixNano()
    json.Unmarshal(b, &dogCopy)
    end := time.Now().UnixNano()
    fmt.Printf("Time to marshal/unmarshal: %d\n", end-start)
}

func main() {
    // Marshal an object into a byte array which we will repeatedly unmarshal from
    d := Dog {
        Age: 5,
    }

    var err error
    b, err = json.Marshal(d)
    if err != nil {
        panic(err)
    }

    for i := 0; i < 20; i++ {
        unmarshalDog()
    }

    // Allow the goroutines to finish before terminating execution.
    time.Sleep(3 * time.Second)
}

Output:

$ go run testJSONPerformance.go
Time to marshal/unmarshal: 34127
Time to marshal/unmarshal: 1465
Time to marshal/unmarshal: 979
Time to marshal/unmarshal: 892
Time to marshal/unmarshal: 849
Time to marshal/unmarshal: 814
Time to marshal/unmarshal: 822
Time to marshal/unmarshal: 822
Time to marshal/unmarshal: 815
Time to marshal/unmarshal: 829
Time to marshal/unmarshal: 822
Time to marshal/unmarshal: 819

More interestingly, when I ran the same program but ran each call to unmarshalDog() in a separate goroutine via go unmarshalDog(), the warming up phenomenon disappeared and the average unmarshal time was much higher:

Time to marshal/unmarshal: 36540
Time to marshal/unmarshal: 4652
Time to marshal/unmarshal: 56959
Time to marshal/unmarshal: 3887
Time to marshal/unmarshal: 57068
Time to marshal/unmarshal: 3519
Time to marshal/unmarshal: 37160

I also tried a version where instead of unmarshaling the same byte array, I marshaled and unmarshaled a different object each time (in case there was some kind of runtime caching going on). The results were the same in this case.

I'm very curious to understand what's going on with this apparent "warming up." In the context of an HTTP server, each request gets a different goroutine so each unmarshal is on average quite slow. This seems suboptimal given that apparently Go has the potential to do the unmarshal in 1/50th of the time in a certain context. All intuition appreciated!

rampatowl
  • 1,722
  • 1
  • 17
  • 38
  • 2
    See [Order of the code and performance](https://stackoverflow.com/questions/41608578/order-of-the-code-and-performance/41608707#41608707). – icza Aug 23 '18 at 16:46
  • Thanks for the reply. Do you know why this effect disappears when the code is reran in separate goroutines? This is the context in an actual HTTP server, so the benchmark times are not representative of real performance. – rampatowl Aug 23 '18 at 16:50
  • Ah, interesting! Is there a way to get Go to share this cache across goroutines (eg in an HTTP server)? I can't find any reference material about the JSON unmarshal cache on Google. – rampatowl Aug 23 '18 at 19:03
  • The code is one click away from the documentation. Here's the [code related to the cache](https://github.com/golang/go/blob/6e76aeba0bda33f6bd45ac9c8e5c026c1688e846/src/encoding/json/encode.go#L1266-L1272). – Charlie Tumahai Aug 23 '18 at 23:29
  • @ThunderCat, consider converting your set of comments into an answer, as they indeed answer the question. – kostix Aug 24 '18 at 09:32
  • A nitpick: do not use sleeping to wait for goroutines: not only is it flat out lame but the Go memory model does not in any way consider waiting-by-wall-clock to play any role in the synchronization between goroutines. There are different ways to do that, but the `sync.WaitGroup` is supposedly the simplest one (in a case like yours). – kostix Aug 24 '18 at 09:35

1 Answers1

2

The JSON unmarshaller caches information on first unmarshal of any type. Subsequent unmarshal for the type use this cached information.

The cache is shared across goroutines, but there's no code to ensure that only one goroutine attempts to create the first cached value. Once one goroutine stores a value to the cache, all new comers will use that cached value.

The code for the cache is here.

Charlie Tumahai
  • 113,709
  • 12
  • 249
  • 242