Why is int32_t faster than int64_t?

Question

Example code:

#include <cstdint>
#include <iostream>

template<typename T>
inline void Solution() {
    T n = 0;
    T k = 0;
    T s = 1;
    while (k < 500) {
        n += s++;
        k = 2;
        T lim = n;
        for (T i = 2; i < lim; i++) {
            if (n % i == 0) {
                lim = n / i;
                k += 2;
            }
        }
    }

    std::cout << "index: " << s << std::endl;
    std::cout << "value: " << n << std::endl;
}

There is a difference in calculation time of more than 2x between int32_t and int64_t. So the simple question is: why?

Solution<int32_t> -> 0.35s on x32 build
Solution<int64_t> -> 0.75s on x32 build

Have you looked at the generated assembly code for the two versions? — Barmar, May 08 '16 at 05:59
In 64-bit mode, the only real difference is GCC's choice of registers. `rax` et al. for `int64_t` and `eax` et al. for `int32_t`, but the instructions produced are identical. In 32-bit mode, since there are no 64-bit registers available significantly more code is produced for `int64_t`. — uh oh somebody needs a pupper, May 08 '16 at 06:12

score 2 · Answer 1 · answered May 08 '16 at 05:46

If "x32 build" means that your platform is 32-bit, then these results are expected, because the machine word is 32 bits wide. What is sizeof(void*) on your platform?
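A minimal sketch of how to check this, assuming a hosted C++11 compiler. The helper names `pointer_bits` and `describe_build` are made up for illustration. Note that pointer width is only a hint about the target: some ABIs use 32-bit pointers on a 64-bit CPU.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper: width of a data pointer in bits for this build.
constexpr std::size_t pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}

// Hypothetical helper: human-readable summary of the pointer width.
std::string describe_build() {
    return std::to_string(pointer_bits()) + "-bit pointers";
}
```

Printing `describe_build()` at the start of the benchmark makes it obvious which kind of build produced the timings.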

If it is 64-bit, then it could mean that your while loop doesn't fit into a single instruction cache line of your CPU.

In any case, profiling tools (gprof, cachegrind, callgrind, etc.) will give a more reliable answer than guessing here.
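Before reaching for a profiler, it can help to get repeatable wall-clock numbers. The following is only a rough sketch using std::chrono; `time_seconds` is a hypothetical helper, not something from the question, and it measures elapsed time rather than where the time goes.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs a callable once and returns the elapsed
// wall-clock time in seconds, measured with a monotonic clock.
template <typename F>
double time_seconds(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

It could be used as `time_seconds([] { Solution<int32_t>(); })` and again with `int64_t`, ideally several times each so that one-off noise is visible.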

score 1 · Answer 2 · answered May 08 '16 at 06:29

Look at the generated assembly, especially for the division.

On a 32-bit platform, 64-bit arithmetic may need to be performed in double-word style. For example, for addition the lower 32 bits are added first, then the upper 32 bits are added along with the carry from the first addition. The assembly will show this.
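The double-word addition described above can be sketched in C++ as follows. This is an illustration of the idea, not what any particular compiler emits; `add64_by_halves` and `self_check` are hypothetical names, and unsigned types are used so that wraparound is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: adds two 64-bit values using only 32-bit additions,
// mirroring the add / add-with-carry pair a 32-bit target would use.
std::uint64_t add64_by_halves(std::uint64_t a, std::uint64_t b) {
    const std::uint32_t a_lo = static_cast<std::uint32_t>(a);
    const std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t b_lo = static_cast<std::uint32_t>(b);
    const std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);

    // Step 1: add the low halves; the result wraps modulo 2^32.
    const std::uint32_t lo = a_lo + b_lo;
    // A wrapped sum is smaller than either operand, which signals a carry.
    const std::uint32_t carry = (lo < a_lo) ? 1u : 0u;
    // Step 2: add the high halves plus the carry from step 1.
    const std::uint32_t hi = a_hi + b_hi + carry;

    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Small sanity check that the result matches the native 64-bit addition.
void self_check() {
    assert(add64_by_halves(0xFFFFFFFFull, 1ull) == 0x100000000ull);
    assert(add64_by_halves(~0ull, 1ull) == 0ull);
}
```

Every 64-bit addition in the loop turns into at least two dependent 32-bit operations of this kind, which is part of why the int64_t version produces more code on a 32-bit target.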

Also, unless your platform has a native division instruction for 64-bit operands, the 64-bit division will be performed in software. That takes more operations than a 32-bit division and therefore takes longer.

As others have said, check the instruction and data alignments for your machine. The assembly language should show this.
