Order of the code and performance

Question

I wanted to find out which is faster: a struct or an array. So I wrote a Go program that writes four int values (1, 2, 3 and 4) to the members of a struct and then to an array of length 4, and I measured how long each set of writes takes.

Case 1: First, I write the values to the struct and then to the array. Here I found the array to be faster than the struct.

package main

import (
    "fmt"
    "time"
)

type abc struct {
    a, b, c, d int
}

func main() {

    var obj abc

    t1 := time.Now()
    obj.a = 1
    obj.b = 2
    obj.c = 3
    obj.d = 4
    t2 := time.Since(t1)

    fmt.Println("Struct access time:", t2)

    a := make([]int, 4)
    t3 := time.Now()
    a[0] = 1
    a[1] = 2
    a[2] = 3
    a[3] = 4
    t4 := time.Since(t3)

    fmt.Println("Array access time:", t4)

}

Case 2: I write the values to the array first and then to the struct. Here I found the struct to be faster than the array.

package main

import (
    "fmt"
    "time"
)

type abc struct {
    a, b, c, d int
}

func main() {

    var obj abc

    a := make([]int, 4)
    t3 := time.Now()
    a[0] = 1
    a[1] = 2
    a[2] = 3
    a[3] = 4
    t4 := time.Since(t3)

    fmt.Println("Array access time:", t4)

    t1 := time.Now()
    obj.a = 1
    obj.b = 2
    obj.c = 3
    obj.d = 4
    t2 := time.Since(t1)

    fmt.Println("Struct access time:", t2)

}

Why does the performance depend on what I write to first? Whichever one I write to first appears to be slower. Why is that?

icza · Answer 1 · 2018-11-19T15:02:19.810

13

Running any code for the first time can carry significant overhead: related code may need to be loaded, and many things are deferred until they are first needed (e.g. internal buffers). Running the same thing again may take significantly less time; the difference can even be several orders of magnitude.

Whenever you want to measure execution time, run the code many times, measure the total time of those runs, and calculate the average. It's also a good idea to exclude the first few runs from the calculation for the reasons mentioned above.
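As a rough illustration of that advice, here is a minimal sketch of manual timing with a warm-up phase. The helper name averageTime, the package-level variable sink, and the warm-up and run counts are assumptions made for this example, not part of the original code.

```go
package main

import (
	"fmt"
	"time"
)

type abc struct {
	a, b, c, d int
}

// sink is package-level (an assumption of this sketch) so that the
// compiler is less likely to discard the writes as dead stores.
var sink abc

// writeStruct is the operation being timed.
func writeStruct() {
	sink.a = 1
	sink.b = 2
	sink.c = 3
	sink.d = 4
}

// averageTime calls fn warmup times without timing it, then times
// runs further calls as one batch and returns the mean time per call
// in nanoseconds. runs must be greater than zero.
func averageTime(fn func(), warmup, runs int) float64 {
	for i := 0; i < warmup; i++ {
		fn() // excluded from the measurement
	}
	start := time.Now()
	for i := 0; i < runs; i++ {
		fn()
	}
	return float64(time.Since(start).Nanoseconds()) / float64(runs)
}

func main() {
	avg := averageTime(writeStruct, 1000, 1000000)
	fmt.Printf("Average struct write time: %.2f ns\n", avg)
}
```

Timing the whole batch with one pair of clock reads keeps the cost of time.Now out of the individual measurements, though the figure still includes the overhead of calling fn through a function value.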

In Go, the best and easiest way is to use test files and benchmark functions. Read the package doc of testing for more details and examples.

Your case can be benchmarked like this:

package main

import "testing"

type abc struct {
    a, b, c, d int
}

func BenchmarkSlice(b *testing.B) {
    a := make([]int, 4)
    for i := 0; i < b.N; i++ {
        a[0] = 1
        a[1] = 2
        a[2] = 3
        a[3] = 4
    }
}

func BenchmarkStruct(b *testing.B) {
    a := abc{}
    for i := 0; i < b.N; i++ {
        a.a = 1
        a.b = 2
        a.c = 3
        a.d = 4
    }
}

Save it to a file like something_test.go and run it with go test -bench . (the trailing dot is the pattern that matches all benchmarks). Output:

BenchmarkSlice-4        2000000000           1.24 ns/op
BenchmarkStruct-4       2000000000           0.31 ns/op

You can see that using a struct is roughly 4 times faster. You will get similar (very close) results if you reorder the benchmark functions.

edited Nov 19 '18 at 15:02

answered Jan 12 '17 at 08:52


1

in your bench test, the `struct` is faster than a `slice`. In the conclusion, you wrote the opposite. – Motakjuq Jan 12 '17 at 09:04
@icza I understand what you said. I expected both code to run the same. Won't all the instructions (both array and struct) be like STR. So both must have equal execution time – Jsmith Jan 12 '17 at 09:12
1

@Jsmith The slice version also includes dealing with index values, and potentially boundary check, while the struct version does not. – icza Jan 12 '17 at 09:13
2

I just read the generated code. Bounds checks are elided like they should be since the compiler knows exactly how big the slice is. What's actually happening is that the writes to the struct are completely eliminated, by, I assume, dead store elimination while the writes to the slice aren't. Making the struct global disables dead store elimination and both perform exactly the same. – Art Jan 12 '17 at 12:39
@Art I guess your answer is correct. Can you please add it as answer? – Jsmith Jan 24 '17 at 13:16

score 1 · Accepted Answer · answered Jan 25 '17 at 09:01

The other answer explained the timing difference; let's get into struct vs. slice.

If the compiler can figure out at compile time that the slice is big enough, accessing the elements of the slice and the fields of a struct will generate identical code. In reality, of course, the compiler often won't know how big the slice is, and completely different optimizations will be applied depending on whether you're working with a struct or a slice. So to measure performance you have to look at a whole program and its behavior, not just one particular operation.
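The check described in the comments (making the struct global so its writes are not eliminated) can be sketched roughly as follows. The names globalStruct and BenchmarkGlobalStruct are assumptions for this example, and testing.Benchmark is used only so the sketch runs as an ordinary program. The numbers depend on the Go version and the machine, so none are shown here.

```go
package main

import (
	"fmt"
	"testing"
)

type abc struct {
	a, b, c, d int
}

// globalStruct is package-level (hypothetical name) so that the
// writes in the benchmark loop are not removed as dead stores.
var globalStruct abc

// BenchmarkSlice writes to a local slice of known length, as in the
// benchmark from the other answer.
func BenchmarkSlice(b *testing.B) {
	a := make([]int, 4)
	for i := 0; i < b.N; i++ {
		a[0] = 1
		a[1] = 2
		a[2] = 3
		a[3] = 4
	}
}

// BenchmarkGlobalStruct performs the same four writes on the
// package-level struct instead of a local one.
func BenchmarkGlobalStruct(b *testing.B) {
	for i := 0; i < b.N; i++ {
		globalStruct.a = 1
		globalStruct.b = 2
		globalStruct.c = 3
		globalStruct.d = 4
	}
}

func main() {
	// testing.Benchmark picks b.N itself and returns a printable result.
	fmt.Println("slice: ", testing.Benchmark(BenchmarkSlice))
	fmt.Println("struct:", testing.Benchmark(BenchmarkGlobalStruct))
}
```

The two benchmark functions can also be placed in a _test.go file without main and run with go test -bench . as in the other answer.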
