Sorry if the question sounds stupid. I'm only vaguely cognizant of the issue of data alignment and have never done any 64-bit programming. I'm working on some 32-bit x86 code right now. It frequently accesses an array of `int`. Sometimes one 32-bit integer is read; sometimes two or more are read. At some point I'd like to make the code 64-bit. What I'm not sure about is whether I should declare this array as `int` or `long int`. I would rather keep the width of the integer the same, so I don't have to worry about differences. I'm somewhat worried, though, that reading from or writing to an address that isn't aligned to the natural word size might be slow.

-
If you want fixed-width integers try: `int32_t` in `<stdint.h>` – Mysticial Sep 16 '12 at 20:52
-
`int` is the natural type for the architecture. Unless you have a good reason to use a different type, don't. – Pete Becker Sep 16 '12 at 20:54
-
@PeteBecker No it isn't. `int` is still only 32-bits on most systems today. – Mysticial Sep 16 '12 at 20:54
-
AFAIK there is no penalty to accessing a dword that is aligned by 4 but not by 8. – harold Sep 16 '12 at 20:56
-
@Mysticial - from the C++ standard: "Plain ints have the natural size suggested by the architecture of the execution environment". If "most systems today" don't do that, there's a serious problem with system design. (but note that the "system" doesn't define integer sizes; the compiler does). – Pete Becker Sep 16 '12 at 20:57
-
@PeteBecker I think the reason why `int` never got promoted to 64-bit on today's 64-bit machines is partially because of backwards compatibility with code that relied on them being 32-bit. – Mysticial Sep 16 '12 at 21:01
-
Actually I think that the reason is that on intel processors, even for 64 bits code the default operand size is 32 bits. So in terms of arithmetic, on intel the natural size is 32 bits even for 64 bits code. In terms of addresses, of course, 64 bits code will use 64 bits addresses. – Analog File Sep 16 '12 at 21:55
-
@Mysticial - An `int` is still 32 bits on today's 64-bit processors because using `int` is faster, a whole lot faster, than using `long`. Try it. – David Hammen Sep 16 '12 at 22:15
-
@Analog File So arithmetics using long long in 64-bit mode is slower than with long? – cleong Sep 16 '12 at 22:40
-
@PeteBecker -- the C++ standard is fiction. – Russell Borogove Sep 16 '12 at 22:57
-
@cleong it depends on what long long is and what long is. – Analog File Sep 16 '12 at 23:49
4 Answers
Misalignment penalties only occur when the load or store crosses an alignment boundary. The boundary is usually the smaller of:
- The natural word size of the hardware (32-bit or 64-bit*).
- The size of the data-type.
If you're loading a 4-byte word on a 64-bit (8-byte) architecture, it does not need to be 8-byte aligned. It only needs to be 4-byte aligned.
Likewise, if you're loading a 1-byte char on any machine, it doesn't need to be aligned at all.
*Note that SIMD vectors can imply a larger natural word size. For example, 16-byte SSE loads and stores still require 16-byte alignment on both x86 and x64 (barring explicit misaligned loads/stores).
So in short, no, you don't have to worry about data alignment. The language and the compiler try pretty hard to prevent you from having to worry about it.
So just stick with whatever datatype makes the most sense for you.
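To make the "4-byte aligned is enough" point concrete, here is a minimal sketch that checks where the elements of an `int32_t` array actually land. The helper names `is_aligned` and `count_aligned` are made up for illustration, and the pointer-to-integer conversion is assumed to give the numeric address, which is what mainstream x86/x64 compilers do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if p is a multiple of `alignment` bytes.
// Assumption: converting a pointer to uintptr_t yields its numeric address,
// which holds on mainstream x86/x64 compilers but is implementation-defined.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: counts how many of the n elements are aligned
// to `alignment` bytes.
inline std::size_t count_aligned(const std::int32_t* data, std::size_t n,
                                 std::size_t alignment) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (is_aligned(&data[i], alignment)) ++count;
    }
    return count;
}
```

For a `std::int32_t a[8]`, `count_aligned(a, 8, alignof(std::int32_t))` should be 8, while `count_aligned(a, 8, 8)` should be 4 on x86/x64. Every other element sits 4 bytes past an 8-byte boundary, and that is fine for a 4-byte load.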

64-bit x86 CPUs are still heavily optimized for efficient manipulation of 32-bit values. Even on 64-bit operating systems, accessing 32-bit values is at least as fast as accessing 64-bit values. In practice, it will often be faster because less cache space and memory bandwidth are consumed.
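The cache-space argument can be put into numbers with a rough sketch like the one below. It assumes 64-byte cache lines (common on current x86 parts, but not guaranteed), and the names `kAssumedCacheLineBytes`, `elements_per_cache_line`, and `cache_lines_for` are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Check your target CPU before relying on it.
constexpr std::size_t kAssumedCacheLineBytes = 64;

// How many elements of type T fit in one (assumed) cache line.
template <typename T>
constexpr std::size_t elements_per_cache_line() {
    return kAssumedCacheLineBytes / sizeof(T);
}

// How many (assumed) cache lines an array of `count` elements of T spans,
// rounded up, if the array starts on a cache-line boundary.
template <typename T>
constexpr std::size_t cache_lines_for(std::size_t count) {
    return (count * sizeof(T) + kAssumedCacheLineBytes - 1) /
           kAssumedCacheLineBytes;
}

// A 32-bit element packs twice as many values per line as a 64-bit one.
static_assert(elements_per_cache_line<std::int32_t>() == 16, "16 per line");
static_assert(elements_per_cache_line<std::int64_t>() == 8, "8 per line");
```

Under that assumption, 1000 elements need 63 lines as `int32_t` and 125 lines as `int64_t`, so the narrower type roughly halves the memory traffic for the same element count.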

There is a lot of good information available here: Performance 32 bit vs. 64 bit arithmetic
More information is available at https://superuser.com/questions/56540/32-bit-vs-64-bit-systems, where the answer reports a worst-case slowdown of 5% (from an application perspective, not for individual operations).
The short answer is no, you won't take a performance hit.

Whenever you access any memory location, an entire cache line is read into L1 cache, and any subsequent access to anything in that line is as fast as possible. Unless your 32-bit access crosses a cache line (which it won't if it's 4-byte aligned), it will be as fast as a 64-bit access.
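As a small sketch of the "crosses a cache line" condition, the function below tests whether an access of `size` bytes starting at address `addr` touches more than one line. The 64-byte line size and the names `kLineBytes` and `crosses_cache_line` are assumptions made for the example, not anything mandated by the hardware or the language.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, typical for current x86 but not universal.
constexpr std::size_t kLineBytes = 64;

// True if the access [addr, addr + size) touches more than one cache line.
constexpr bool crosses_cache_line(std::uintptr_t addr, std::size_t size) {
    return size != 0 &&
           (addr / kLineBytes) != ((addr + size - 1) / kLineBytes);
}

// A 4-byte access at a 4-byte-aligned address stays inside one line...
static_assert(!crosses_cache_line(60, 4), "bytes 60..63 share a line");
// ...while a misaligned one near the end of a line spills into the next.
static_assert(crosses_cache_line(62, 4), "bytes 62..65 span two lines");
```

Because 64 is a multiple of 4, no 4-byte-aligned 32-bit access can straddle a line boundary, which is why alignment to the element size is all that matters here.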

-
Not quite. Accessing 1 value will be the same. If you access *another* 32-bit value that is in the same cache line it will already be there - you even state this with "subsequent...". Using smaller data sizes is generally more cache friendly due to loading more data elements per cache line. – phkahler Sep 17 '12 at 01:06