
I implemented an HTTP server in Go.

For each request, I need to create hundreds of objects of a particular struct, and I have ~10 structs like that. As per the Go implementation, after the request finishes they will be garbage collected.

So for each request this amount of memory is allocated and then deallocated.
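
For illustration, my handlers currently look roughly like this (the struct name and counts are simplified; there are ~10 such structs):

package main

import "net/http"

// Record stands in for one of the structs; the real ones are larger.
type Record struct {
    ID   int
    Data []byte
}

func handler(w http.ResponseWriter, r *http.Request) {
    // Hundreds of objects are allocated for this request only.
    records := make([]*Record, 0, 500)
    for i := 0; i < 500; i++ {
        records = append(records, &Record{ID: i})
    }
    _ = records // ... use them; all of it becomes garbage once the handler returns
}

func main() {
    http.HandleFunc("/", handler)
    http.ListenAndServe(":8080", nil)
}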

Instead, I want to implement memory pooling to improve performance on both the allocation side and the GC side.

At the beginning of a request, I will take objects from the pool and put them back after the request is served.

From the pool implementation side:

  1. How do I allocate and deallocate memory for a particular struct type?
  2. How do I keep track of which memory has been assigned and which has not?

Are there any other suggestions to improve performance with regard to memory allocation and deallocation?

Tej
    There's [`sync.Pool`](https://golang.org/pkg/sync/#Pool) in the stdlib. Other than that, implementing a "free list" is a technique that's not specific to Go. – JimB Jul 21 '16 at 13:45
  • Check this out https://github.com/larytet/mcachego/tree/master/unsafepool for some ideas. – Larytet Aug 08 '19 at 18:36

2 Answers


Note beforehand:

Many suggest using `sync.Pool`, which is a fast, good implementation for temporary objects. But note that `sync.Pool` does not guarantee that pooled objects are retained. Quoting from its doc:

Any item stored in the Pool may be removed automatically at any time without notification. If the Pool holds the only reference when this happens, the item might be deallocated.

So if you don't want your objects in the Pool to get garbage collected (which, depending on your case, might result in more allocations), the solution presented below is better, as values in the channel's buffer are not garbage collected. If your objects are really so big that a memory pool is justified, the overhead of the pool-channel will be amortized.

Moreover, `sync.Pool` does not allow you to limit the number of pooled objects, while the solution presented below naturally does.


The simplest memory pool "implementation" is a buffered channel.

Let's say you want a memory pool of some big objects. Create a buffered channel holding pointers to values of such expensive objects, and whenever you need one, receive one from the pool (channel). When you're done using it, put it back into the pool (send on the channel). To avoid accidentally losing an object (e.g. in case of a panic), use a defer statement to put it back.

Let's use this as the type of our big objects:

type BigObject struct {
    Id        int
    Something string
}

Creating a pool is as simple as:

pool := make(chan *BigObject, 10)

The size of the pool is simply the size of the channel's buffer.

Filling the pool with pointers to expensive objects (this is optional, see notes at the end):

for i := 0; i < cap(pool); i++ {
    bo := &BigObject{Id: i}
    pool <- bo
}

Using the pool from many goroutines:

wg := sync.WaitGroup{}
for i := 0; i < 100; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        bo := <-pool
        defer func() { pool <- bo }()
        fmt.Println("Using", bo.Id)
        fmt.Println("Releasing", bo.Id)
    }()
}

wg.Wait()

Try it on the Go Playground.

Note that this implementation blocks if all the "pooled" objects are in use. If you don't want that, you may use select to create a new, temporary object when none is available:

var bo *BigObject
select {
case bo = <-pool: // Try to get one from the pool
default: // All in use, create a new, temporary:
    bo = &BigObject{Id: -1}
}

And in this case you don't need to put it back into the pool. Or you may choose to try to put every object back into the pool if there's room, without blocking, again with select:

select {
case pool <- bo: // Try to put back into the pool
default: // Pool is full, will be garbage collected
}
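
If you go the non-blocking route, the two selects can be wrapped in a pair of tiny helpers (a sketch; the names get and put are just examples) so a handler simply does `bo := get(pool)` and `defer put(pool, bo)`:

// get returns a pooled object if one is available, otherwise a new, temporary one.
func get(pool chan *BigObject) *BigObject {
    select {
    case bo := <-pool:
        return bo
    default:
        return &BigObject{Id: -1}
    }
}

// put returns an object to the pool if there's room, otherwise leaves it to the GC.
func put(pool chan *BigObject, bo *BigObject) {
    select {
    case pool <- bo:
    default:
    }
}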

Notes:

Filling the pool beforehand is optional. If you use select to get values from / put values back into the pool, the pool may start out empty.

You have to make sure you're not leaking information between requests, e.g. make sure you don't use fields and values in your shared objects that were set by and belong to other requests.
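
One simple way to ensure this (a sketch, using the BigObject type above) is to give the pooled type a reset method and call it when returning the object to the pool:

// Reset clears all request-scoped state so the object can be reused safely.
func (bo *BigObject) Reset() {
    bo.Id = 0
    bo.Something = ""
}

In the handler that becomes `bo := <-pool; defer func() { bo.Reset(); pool <- bo }()`.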

icza
  • Note this method is considerably slower than `sync.Pool`. – OneOfOne Jul 21 '16 at 14:15
  • @OneOfOne Yes, but `sync.Pool` does not guarantee retention of pooled values. See edited answer. – icza Jul 21 '16 at 14:31
  • How frequently will the GC run in Go? What factors does it depend on? – Tej Jul 22 '16 at 03:27
  • @Raghu GC is a huge topic. You may check out the package doc of [`runtime`](https://golang.org/pkg/runtime/) which details env variables which control the GC. Also read the blog post: [Go GC: Prioritizing low latency and simplicity](https://blog.golang.org/go15gc) – icza Jul 22 '16 at 07:16
  • `You have to make sure you're not leaking information between requests, e.g. make sure you don't use fields and values in your shared objects that were set and belong to other requests.` No way to wait for the object's GC eviction to re-add it to the pool? Can we hook that into the GC? (does it make sense?) –  Jul 16 '17 at 11:52
  • @mh-cbon First, gc runs concurrently with no synchronization to your code, you can't rely on it in this regard. Second, it doesn't make sense. You're using pool because you don't want the values to get gc'ed, because you want to avoid having to reallocate each time. You use a value and you put it back to the pool to be reused. On reuse, you have to make sure you don't leave values in it that belong to a different user / request. – icza Jul 18 '17 at 08:58
  • Got it, I was thinking it was the best moment in time where I was certain the object was not consumed somewhere else in the code, thus good for reuse. But yeah, if it entered the GC eviction phase, it might be a bit late; anyway, I found my way around. –  Jul 18 '17 at 09:03
  • Also, sync.Pool doesn't provide any form of bound, say I want to pool just 5 resources and use them in different goroutines. Calling .Get() in all instances would mean it would keep creating new instances of the same object. – Subomi Jan 24 '18 at 13:46
  • From the docs `If Get would otherwise return nil and p.New is non-nil, Get returns the result of calling p.New` – Subomi Jan 24 '18 at 13:47
  • Are you sure using a buffered chan is much slower than sync.Pool? My test shows the reverse. – pete lin Jun 14 '18 at 07:46
  • @petelin Your test is flawed, which was corrected in this answer: [sync.Pool is much slower than using channel, so why should we use sync.Pool?](https://stackoverflow.com/questions/50851421/sync-pool-is-much-slower-than-using-channel-so-why-should-we-use-sync-pool/50851673#50851673) And the correct benchmark shows `sync.Pool` is faster. – icza Jun 14 '18 at 07:48

This is the `sync.Pool` implementation mentioned by @JimB. Mind the usage of `defer` to return the object to the pool.

package main

import "sync"

type Something struct {
    Name string
}

var pool = sync.Pool{
    New: func() interface{} {
        return &Something{}
    },
}

func main() {
    s := pool.Get().(*Something)
    defer pool.Put(s)
    s.Name = "hello"
    // use the object
}
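
Applied to an HTTP handler like the one in the question, this could look roughly as follows (the handler and the way the field is used are only illustrative). Note the reset of the fields before Put, so no data leaks into the next request:

package main

import (
    "net/http"
    "sync"
)

type Something struct {
    Name string
}

var pool = sync.Pool{
    New: func() interface{} {
        return &Something{}
    },
}

func handler(w http.ResponseWriter, r *http.Request) {
    s := pool.Get().(*Something)
    defer func() {
        s.Name = "" // reset request-scoped state before putting it back
        pool.Put(s)
    }()
    s.Name = r.URL.Path
    // use the object to serve the request
    w.Write([]byte(s.Name))
}

func main() {
    http.HandleFunc("/", handler)
    http.ListenAndServe(":8080", nil)
}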
Grzegorz Żur