As @MartinGallagher mentioned, your loop that fills the slice with nil is recognized by the compiler and replaced with an optimized internal memory-clearing operation, while the copy() version does more work per element and does not get this optimization.
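For context, the pattern the compiler can optimize is a plain range loop assigning the zero value, roughly like this (a minimal sketch; the slice size is just illustrative):

    package main

    import "fmt"

    func main() {
        nums := make([]*int, 12800)
        // A range loop assigning the zero value (nil here) can be detected
        // by the compiler and compiled down to a fast memory clear.
        for i := range nums {
            nums[i] = nil
        }
        fmt.Println(nums[0] == nil) // true
    }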
If you change your examples to fill with a non-nil pointer value, you'll see the loop version fall behind. Also, don't allocate (with make()) inside the benchmark loop; do that outside and call b.ResetTimer() to exclude that setup time. Your slice is also very small; if you increase its size, the difference becomes more noticeable:
    package main

    import "testing"

    // x is a non-nil pointer used as the fill value, so the compiler
    // cannot replace the loop with a memory-clearing optimization.
    var x = new(int)

    func BenchmarkSetNilOneByOne(b *testing.B) {
        nums := make([]*int, 12800)
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            for i := range nums {
                nums[i] = x
            }
        }
    }

    func BenchmarkSetNilInBulk(b *testing.B) {
        nils := make([]*int, 128)
        for i := range nils {
            nils[i] = x
        }
        orig := make([]*int, 12800)
        var nums []*int
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            nums = orig
            // Repeatedly copy the 128-element source into the remaining
            // (unfilled) part of the slice until it is completely filled.
            for len(nums) > 0 {
                nums = nums[copy(nums, nils):]
            }
        }
    }
Benchmark results:
    BenchmarkSetNilOneByOne-4     96571     10626 ns/op
    BenchmarkSetNilInBulk-4      266690      4023 ns/op
Also note that your "bulk" version reassigns the slice header to nums several times. There is an even faster way to fill the slice: you do not need an additional nils slice. Start filling your slice, then repeatedly copy the already-filled part onto the still-unfilled part, doubling the filled region with each copy. This also does not require changing / reassigning the nums slice header. See Is there analog of memset in go?
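A minimal sketch of that doubling-copy technique (the helper name fillPointers is just illustrative, not from the linked answer):

    package main

    import "fmt"

    var v = new(int)

    // fillPointers fills nums with v by seeding the first element and then
    // repeatedly copying the already-filled prefix onto the still-unfilled
    // part, doubling the filled region on each pass. The nums slice header
    // is never reassigned.
    func fillPointers(nums []*int, v *int) {
        if len(nums) == 0 {
            return
        }
        nums[0] = v
        for filled := 1; filled < len(nums); filled *= 2 {
            copy(nums[filled:], nums[:filled])
        }
    }

    func main() {
        nums := make([]*int, 12800)
        fillPointers(nums, v)
        fmt.Println(nums[0] == v, nums[len(nums)-1] == v) // true true
    }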