
Take the following code for example:

#include <stdint.h>  /* for uint32_t */

uint32_t fg;
uint32_t bg;
uint32_t mask;
uint32_t dest;
...
/* Select bits of fg where mask is set, bits of bg elsewhere. */
dest = (fg & mask) | (bg & (~mask));

Now this fragment has all of its operands typed as 32-bit unsigned ints. Using a C compiler with a 32-bit int size, no integer promotions will happen, so the entire operation is performed in 32 bits.
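
For contrast, here is a minimal sketch (the blend16 name is hypothetical) of how the same expression behaves with narrower operands, where the integer promotions do kick in on a platform with a 32-bit int:

#include <stdint.h>

uint16_t blend16(uint16_t fg, uint16_t bg, uint16_t mask)
{
    /* Each uint16_t operand is promoted to int (32 bits here) before
       the bitwise operators are applied; the result is truncated back
       to 16 bits on return. With uint32_t operands and a 32-bit int,
       no such promotion takes place. */
    return (fg & mask) | (bg & ~mask);
}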

My problem is that, as Wikipedia shows for example, even 64-bit machines usually get compilers that use a 32-bit int size. Conforming to the C standard, such compilers won't promote the operands to 64-bit ints, so the code could potentially compile into something with inferior performance and probably even larger code size (judging from how 16-bit operations are more expensive in cycles and instruction size on a 32-bit x86).

The primary question is: do I have to be concerned? (I believe I may not, since with optimizations enabled a sane compiler should be able to omit the excess gunk that would result from strictly following the C standard. Please look past the example code and think in general terms about where my belief may have less ground.)

If I actually do have to be concerned, could you recommend some resource (book, site, whatever) that covers this area? (I know this is a bit out-of-bounds for SO, but I would find it much less useful to only get a three-word "Yes, you do!" as an answer to accept.)

Jubatian
    "just assuming from how 16 bit operations are more expensive cycle and instruction size-wise on a 32 bit x86" Is this really so? – glglgl Nov 19 '14 at 08:40
  • @glglgl: Yes, extra size due to an operand size prefix on the instruction, and at least on older Pentiums, extra cycles to process that (I didn't check newer processors in this regard). – Jubatian Nov 19 '14 at 08:42
  • @Jubatian Are you talking specifically about x86-64 architecture, or about any 64-bit architecture? – anatolyg Nov 19 '14 at 08:59
  • @anatolyg: Probably mostly x86-64 right now and here, but I'd rather stay general for the sake of being cross-platform. If even one rather common architecture (such as 64-bit ARM) has something of concern, it is worth mentioning. – Jubatian Nov 19 '14 at 09:03
    `int` in C is supposed to be the "natural" operand size of the architecture. On an architecture where 32-bit ops take longer than 64-bit ops, `sizeof(int)*CHAR_BIT` *should* be `64`. That said, on x86-64, default operand size is still 32 bits, not 64 bits. Thus, 32 bit ops need no operand size prefix, and are not slower than 64 bit ops. – EOF Nov 19 '14 at 10:24
  • Did some experiments with my microcomputer emulator. Making its *architecture integer* type explicitly 64 bits retained its functionality, while the binary size increased by about 10 percent (there are no memory reservations of that particular type, so I am quite puzzled where this drastic increase came from). The assembly output shows wildly mixed operand size usage, most likely relating to this question: http://stackoverflow.com/questions/11177137/why-do-most-x64-instructions-zero-the-upper-part-of-a-32-bit-register. Seems like 64-bit code is worse on 64 bit. Weird... – Jubatian Nov 19 '14 at 10:46
  • @Jubatian: Surely the *size* is down to choices your compiler is making? – Oliver Charlesworth Nov 19 '14 at 11:05
  • @OliverCharlesworth: Yes. Originally that *architecture integer* type was `unsigned int` (I need it unsigned in the project). I replaced it with `uint_fast32_t` from `stdint.h`, which I confirmed is 64 bits (see the sketch after these comments). The compiler is GCC running on 64-bit Debian. I am tempted to say that the best may be to use whatever `int` your compiler gives you (surely the compiler's engineers know best), and maybe it is even 64 bits on architectures whose instruction set is designed in a more 64-bit-centric manner. – Jubatian Nov 19 '14 at 11:36
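
To make the comment above concrete, a small sketch (assuming a hosted C implementation; the printed sizes are implementation-defined) showing how uint_fast32_t can differ in width from uint32_t:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* uint_fast32_t is "the fastest type of at least 32 bits" for the
       target; on 64-bit Debian with GCC it is typically 8 bytes,
       matching the observation above, while uint32_t is exactly 4. */
    printf("sizeof(uint32_t)      = %zu\n", sizeof(uint32_t));
    printf("sizeof(uint_fast32_t) = %zu\n", sizeof(uint_fast32_t));
    return 0;
}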

1 Answer


Do I have to be concerned?

No, not really. The reduced cost of reading main memory or disk usually outweighs the added cost of performing 32-bit operations in 64-bit registers. A 64-bit program that uses 32-bit integer arrays will often be faster than one using 64-bit integer arrays.

On the same note, when compiling it is often better to optimize for size than for speed, because cache misses often cost more than the CPU cycles saved.
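
As an illustration of the cache argument (a minimal sketch; the array names and size are made up), the 64-bit array below occupies twice the memory of the 32-bit one, so scanning it touches twice as many cache lines:

#include <stdint.h>
#include <stdio.h>

#define N (1u << 20)     /* about a million elements */

static uint32_t a32[N];  /* 4 MiB: 16 elements per 64-byte cache line */
static uint64_t a64[N];  /* 8 MiB: only 8 elements per cache line */

int main(void)
{
    printf("32-bit array: %zu bytes\n", sizeof a32);
    printf("64-bit array: %zu bytes\n", sizeof a64);
    return 0;
}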

Klas Lindbäck
  • But the question still stands; are 32-bit ALU operations on x86-64 (as a concrete example) fundamentally slower than their 64-bit counterparts? – Oliver Charlesworth Nov 19 '14 at 08:55
  • Well, I left those out of the post. For those types of accesses (primarily memory; on disk you usually have a file format to conform to) I always plan the structure and use fixed-size types. I am rather curious about operations taking place entirely within the ALU. – Jubatian Nov 19 '14 at 08:56