Check if key exists in map storing large values

Question

To know a key k exist in a map M1[k]v is very straightforward in Go.

if v, ok := M1[k]; ok {
    // key exist
}

'v': a value of a non-pointer type.

If v is large, To just check if a particular key exists using the above method is not efficient as it will load the value v in the memory(even if I use a blank identifier _ in the place of v as per my understanding and please correct me if my understanding is wrong here).

Is there an efficient way in which we can check if a key is present in a Map(without reading/or the value is allocated in the memory)?

I am thinking to create a new map M2[k]bool to store the information and make an entry in M2 each time I insert something in M1.

Use `if _, ok := M1[k]; ok { }`. If you use the blank identifier, the value will not be "loaded". — icza, Jan 02 '22 at 17:03
*if a particular key exists using the above method is not efficient as it will load the value v in the memory [...]* What makes you believe that? — jub0bs, Jan 02 '22 at 17:04
See [mapaccess2_fat](https://github.com/golang/go/blob/master/src/runtime/map.go#L569) and friends. The runtime returns a pointer to the map value. The pointed to value is only copied if used by the application. — , Jan 02 '22 at 17:08

icza · Accepted Answer · 2022-01-02T18:04:50.690

7

Use if _, ok := M1[k]; ok { }. If you use the blank identifier, the value will not be "loaded".

Let's write benchmarks to test it:

var m = map[int][1_000_000]int64{
    1: {},
}

func BenchmarkNonBlank(b *testing.B) {
    for i := 0; i < b.N; i++ {
        if v, ok := m[1]; ok {
            if false {
                _ = v
            }
        }
    }
}

func BenchmarkBlank(b *testing.B) {
    for i := 0; i < b.N; i++ {
        if _, ok := m[1]; ok {
            if false {
                _ = ok
            }
        }
    }
}

Running go test -bench ., the output is:

BenchmarkNonBlank-8         1497            763278 ns/op
BenchmarkBlank-8        97802791                12.09 ns/op

As you can see, using the blank identifier, the operation takes about 10 ns. When we assign the value to a non-blank identifier, it's almost 1 ms (almost a hundred thousand times slower) when the value type has a size of around 8 MB.

edited Jan 02 '22 at 18:04

answered Jan 02 '22 at 17:07

icza

389,944
63
907
827

The behaviour is same for `sync.Map.Load()` as well?. I went through the official docs https://pkg.go.dev/sync#Map.Load and it says nothing about allocation in case we use a blank identifier. – Prakash P Jan 02 '22 at 17:11
@PrakashPandey That's a completely different story, as `sync.Map` is not generic (like the builtin map type), and hence stored values must be wrapped in `interface{}`. Also `Map.Load()` is a function call, it must return the values regardless if you use them or not. My answer shows how to benchmark it, modify my benchmark to test it yourself. – icza Jan 02 '22 at 17:16
I think I got my answer. as `map` is built in and `sync. Map` is just a struct + `Map.Load()` is a normal `func` call so it will always allocative the value in the memory. And thanks for sharing the above benchmark code. – Prakash P Jan 02 '22 at 17:32
1

@PrakashPandey The call `sync.Map.Load()` copies an `interface{}`. An `interface{}` value is two machine words. – Jan 02 '22 at 17:35
@PenélopeStevens Thanks. If someone wants to know more about `interface{} value two machine words`, here is a nice explanation in another StackOverflow post: https://stackoverflow.com/a/23148998/4408364 – Prakash P Jan 02 '22 at 18:06

Check if key exists in map storing large values

1 Answers1