Is there a performance penalty for accessing an array of 32-bit integers in x86-64?

Question

Sorry if the question sounds stupid. I'm only vaguely cognizant of the issue of data alignment and have never done any 64-bit programming. I'm working on some 32-bit x86 code right now. It frequently accesses an array of int. Sometimes one 32-bit integer is read. Sometimes two or more are read. At some point I'd like to make the code 64-bit. What I'm not sure about is whether I should declare this array as int or long int. I would rather keep the width of the integer the same, so I don't have to worry about differences. I'm somewhat worried, though, that reading from or writing to an address that isn't aligned to the natural word size might be slow.

`int` is the natural type for the architecture. Unless you have a good reason to use a different type, don't. — Pete Becker, Sep 16 '12 at 20:54
@PeteBecker No it isn't. `int` is still only 32-bits on most systems today. — Mysticial, Sep 16 '12 at 20:54
AFAIK there is no penalty to accessing a dword that is aligned by 4 but not by 8. — harold, Sep 16 '12 at 20:56
@Mysticial - from the C++ standard: "Plain ints have the natural size suggested by the architecture of the execution environment". If "most systems today" don't do that, there's a serious problem with system design. (but note that the "system" doesn't define integer sizes; the compiler does). — Pete Becker, Sep 16 '12 at 20:57
@PeteBecker I think the reason why `int` never got promoted to 64-bit on today's 64-bit machines is partially because of backwards compatibility with code that relied on them being 32-bit. — Mysticial, Sep 16 '12 at 21:01
Actually I think that the reason is that on intel processors, even for 64 bits code the default operand size is 32 bits. So in terms of arithmetic, on intel the natural size is 32 bits even for 64 bits code. In terms of addresses, of course, 64 bits code will use 64 bits addresses. — Analog File, Sep 16 '12 at 21:55
@Mystical - An `int` is still 32 bits on today's 64 bit processors because using `int` is faster, a whole lot faster, than is using `long`. Try it. — David Hammen, Sep 16 '12 at 22:15
@Analog File So arithmetics using long long in 64-bit mode is slower than with long? — cleong, Sep 16 '12 at 22:40

Mysticial · Accepted Answer · 2012-09-16T21:11:07.237

Misalignment penalties only occur when the load or store crosses an alignment boundary. The boundary is usually the smaller of:

- The natural word size of the hardware (32-bit or 64-bit*).
- The size of the data type.

If you're loading a 4-byte word on a 64-bit (8-byte) architecture, it does not need to be 8-byte aligned. It only needs to be 4-byte aligned.

Likewise, if you're loading a 1-byte char on any machine, it doesn't need to be aligned at all.
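The boundary rule above can be checked with plain address arithmetic. The following is a minimal sketch, not a benchmark: the helper names `crosses_boundary` and `int32_array_never_straddles` are made up for illustration, and the 8-byte boundary stands in for the 64-bit word size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if an access of `size` bytes starting at
   address `addr` straddles a multiple of `boundary`. */
static bool crosses_boundary(uintptr_t addr, size_t size, size_t boundary)
{
    assert(size > 0 && boundary > 0);
    return (addr / boundary) != ((addr + size - 1) / boundary);
}

/* Hypothetical helper: true if no element of the array straddles an
   8-byte boundary. For a 4-byte-aligned int32_t array this always
   holds, because each element starts at offset 0 or 4 within a word. */
static bool int32_array_never_straddles(const int32_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (crosses_boundary((uintptr_t)&a[i], sizeof a[i], 8)) {
            return false;
        }
    }
    return true;
}
```

For example, a 4-byte access at address 4 stays inside one 8-byte word, while a 4-byte access at address 6 would straddle two. The compiler never places an `int32_t` array element at an address like 6 unless you force it to with casts or packing.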

*Note that SIMD vectors can imply a larger natural word size. For example, 16-byte SSE still requires 16-byte alignment on both x86 and x64 (barring explicit misaligned loads/stores).

So in short, no, you don't have to worry about data alignment. The language and the compiler try pretty hard to prevent you from having to worry about it.

So just stick with whatever datatype makes the most sense for you.
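If the goal is to keep the element width identical in 32-bit and 64-bit builds, one option (not mentioned in the answer itself) is the fixed-width types from `<stdint.h>`. This matters mostly for `long`, whose width differs between common 64-bit platforms. A minimal sketch, with the function name `sum_pair` and the parameter `table` invented for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* On x86 and x86-64 (8-bit bytes), int32_t occupies exactly 4 bytes,
   so the array layout is the same in both builds. */
_Static_assert(sizeof(int32_t) == 4, "int32_t must occupy 4 bytes");

/* Hypothetical example: reads two adjacent 32-bit elements, as the
   question describes, and widens them before adding so the sum
   cannot overflow. */
static int64_t sum_pair(const int32_t *table, size_t i)
{
    assert(table != NULL);
    return (int64_t)table[i] + (int64_t)table[i + 1];
}
```

Declaring the array as `int32_t table[...]` keeps the code's assumptions about element width explicit, whatever `int` and `long` happen to be on the target.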

score 3 · Answer 2 · answered Sep 16 '12 at 20:59

64-bit x86 CPUs are still heavily optimized for efficient manipulation of 32-bit values. Even on 64-bit operating systems, accessing 32-bit values is at least as fast as accessing 64-bit values. In practice, it will actually be faster because less cache space and memory bandwidth is consumed.
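The cache and bandwidth point can be made concrete with a little arithmetic. The sketch below assumes a 64-byte cache line, which is typical for x86-64 CPUs but is an assumption here rather than something the answer states. `CACHE_LINE_BYTES`, `elems_per_line`, and `lines_needed` are illustrative names, and `lines_needed` assumes the array starts on a cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64u /* assumed line size, not queried from the CPU */

/* How many elements of the given size fit in one cache line. */
static size_t elems_per_line(size_t elem_size)
{
    assert(elem_size > 0 && elem_size <= CACHE_LINE_BYTES);
    return CACHE_LINE_BYTES / elem_size;
}

/* How many cache lines an array of n elements touches, assuming it
   starts on a line boundary (rounds up to whole lines). */
static size_t lines_needed(size_t n, size_t elem_size)
{
    size_t bytes = n * elem_size;
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under these assumptions, one line holds 16 `int32_t` values but only 8 `int64_t` values, so a 1000-element array touches 63 lines as 32-bit integers and 125 lines as 64-bit integers.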

score 1 · Answer 3 · edited May 23 '17 at 11:54

There is a lot of good information available here: Performance 32 bit vs. 64 bit arithmetic

There is even more information at https://superuser.com/questions/56540/32-bit-vs-64-bit-systems, where the answer reports a worst-case slowdown of about 5% (measured for whole applications, not individual operations).

The short answer is no, you won't take a performance hit.

answered Sep 16 '12 at 20:56

Adam Cadien

score 1 · Answer 4 · answered Sep 16 '12 at 21:02

Whenever you access any memory location an entire cache line is read into L1 cache, and any subsequent access to anything in that line is as fast as possible. Unless your 32-bit access crosses a cache line (which it won't if it's on a 32-bit alignment) it will be as fast as a 64-bit access.
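The claim that a 32-bit-aligned 32-bit access never crosses a cache line can be checked by enumerating start offsets within one line. This is a minimal sketch assuming a 64-byte line; `LINE_BYTES` and `crossing_offsets` are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES 64u /* assumed cache-line size */

/* Counts the start offsets within one line, visited in increments of
   `step`, at which an access of `size` bytes spills into the next line. */
static unsigned crossing_offsets(size_t size, size_t step)
{
    assert(size > 0 && step > 0);
    unsigned count = 0;
    for (size_t off = 0; off < LINE_BYTES; off += step) {
        if (off + size > LINE_BYTES) {
            count++;
        }
    }
    return count;
}
```

With a step of 4, no 4-byte access spills over; with a step of 1 (arbitrary misalignment), the three offsets 61, 62, and 63 do. An 8-byte access that is only 4-byte aligned spills at one offset (60), which is why the alignment needs to match the size of the access rather than the size of the machine word.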

Mark Ransom

Not quite. Accessing 1 value will be the same. If you access *another* 32-bit value that is in the same cache line it will already be there - you even state this with "subsequent...". Using smaller data sizes is generally more cache friendly due to loading more data elements per cache line. – phkahler Sep 17 '12 at 01:06
