Why a uint32_t vs uint64_t speed difference?

Question

I'm trying to understand how g++ and the CPU process integers at runtime.

I'm measuring how long the following function takes to run:

template<class T>
void speedTest() {
    for(T d=0;d<4294967295u;d++)int number;
}

This simple function runs an empty loop 4294967295 times, which is the maximum value of uint32_t.

When I call:

speedTest<uint32_t>();

the software takes an average of 8.15 seconds, but when I call:

speedTest<uint64_t>();

the software takes an average of 10.35 seconds.

Why is this happening?

Incrementing a 64-bit variable in 32-bit code takes more work, especially when you profile code that hasn't been optimized, which is a fairly pointless exercise. – Hans Passant Nov 17 '13 at 14:19
It's not pointless if I learn something. Something I didn't take into account was that the software is compiled as 32-bit, which I recognize right away is an issue. I'm also very new to optimizations with g++; zch, can you elaborate or point me in the right direction? – Parad0x13 Nov 17 '13 at 14:24

Sam · Answer 1 · 2013-11-17T14:30:46.677

3

Some possible reasons:

- Larger data types generally require more memory bandwidth.
- Even if the loop counter is kept in a register, the CPU probably takes more time to do calculations with larger values, especially if it needs multiple registers (e.g. if your CPU has only 32-bit registers).
- The compiler needs to emit extra machine instructions to emulate any type the CPU does not support directly.
- It also depends on optimization. A loop like this, without side effects, could be optimized out completely, regardless of int number; (it could just as well be for(T d=0;d<4294967295u;d++);).

You could continue your investigation/exercise by providing some assembly.
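
A minimal sketch of how the optimization point above could be tested, assuming a C++11 compiler. The names timeLoop and sink are hypothetical and not from the question. The volatile store is one way to give the loop a side effect so the optimizer cannot delete it, and the iteration count is a parameter so that short runs are possible.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not from the question): runs the counting loop with a
// counter of type T and returns the elapsed wall-clock time in seconds.
template <class T>
double timeLoop(std::uint32_t iterations) {
    // Assumption: writing to a volatile object counts as a side effect,
    // so the loop survives even when optimizations are enabled.
    volatile T sink = 0;
    const auto start = std::chrono::steady_clock::now();
    for (T d = 0; d < iterations; ++d) {
        sink = d;
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling timeLoop<std::uint32_t>(4294967295u) and timeLoop<std::uint64_t>(4294967295u) at the same optimization level, once built with -m32 and once with -m64, should indicate whether the gap comes from the 32-bit build. The absolute numbers will vary by machine and compiler.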

edited Nov 17 '13 at 14:30

answered Nov 17 '13 at 14:22


I failed to recognize that the g++ compilation will be 32bit regardless of the platform I made it on. I don't know how to specify a 64bit build yet (still learning gcc/g++) – Parad0x13 Nov 17 '13 at 14:25
But I do have a question about your first bullet. Correct me if I'm wrong but if this were compiled to operate as a 64bit application it wouldn't matter if the variable were uint32_t or uint64_t since either way the bandwidth passed around will always be 64bit – Parad0x13 Nov 17 '13 at 14:27
On a 64-bit platform, it should default to 64 bits. However, you can pass the following compiler args: `-m32`, `-m64`. The bit width of a platform primarily designates the available address space, not the size of values. However, to handle addresses efficiently, there are also registers of the corresponding width. An `int` can be 32 or 64 bits, whereas the C99 types with explicit width don't vary. – Sam Nov 17 '13 at 14:27
@Parad0x13 The x86-64 64-bit instruction set allows manipulating 32-bit registers directly, and with shorter opcodes than for 64-bit registers. – Pascal Cuoq Nov 17 '13 at 14:33
I see, more research on my part is needed then since I clearly did not understand how things worked. I assumed (for instance) that on a 64 bit processor all registers were 64bit – Parad0x13 Nov 17 '13 at 14:44
Another, related question: http://stackoverflow.com/q/8948918/1175253 | AMD64 Register layout: http://en.wikipedia.org/wiki/X86-64#Architectural_features – Sam Nov 17 '13 at 14:46
Thanks Sam, research will be accomplished and your insight is greatly appreciated – Parad0x13 Nov 17 '13 at 14:55
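
To check the point made in the comments about -m32 and -m64, a small sketch like the following could report what a given build actually uses. The helper names bitsOf and pointerBits are hypothetical. It assumes that the width of a pointer is a reasonable indicator of whether the binary was built as 32-bit or 64-bit.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: storage width of a type in bits.
template <class T>
constexpr std::size_t bitsOf() {
    return sizeof(T) * CHAR_BIT;
}

// Hypothetical helper: pointer width of the current build in bits.
// Assumption: this is typically 32 when compiled with -m32 and 64 with -m64.
constexpr std::size_t pointerBits() {
    return bitsOf<void*>();
}
```

The fixed-width types report the same size in either build, while pointerBits() and bitsOf<long>() may change with the target. This fits the comment that the explicit-width types don't vary.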
