30

Does typecasting in C/C++ result in extra CPU cycles?

My understanding is that it should consume extra CPU cycles, at least in certain cases, such as casting from float to int, where the CPU has to convert the floating-point representation to an integer.

float a=2.0;
int b= (float)a;

I would like to understand the cases where it would/would not consume extra CPU cycles.

Vishal
  • 9
    It depends on the type of cast. – JoeG May 14 '13 at 09:23
  • 5
    Since you've tagged this C++, I'd suggest the use of C++-style cast, namely `static_cast`, `dynamic_cast`, `reinterpret_cast` and `const_cast`. – JBL May 14 '13 at 09:25
  • Even barring all language-related details and minutia, whether a cast is "free" or not is probably highly platform-dependent. For example, on x86, most integer casts from larger to smaller types are free, but this may not be the case on another platform with different characteristics. – yzt May 14 '13 at 09:29
  • +1 Good question, @JoeGauterin does `(size_t)int` (typically used for malloc while looping) consume extra cycles? – David Ranieri May 14 '13 at 09:45
  • @DavidRF: Depends on how `size_t` and `int` are defined on your local platform and how good the optimiser on your compiler is. If in doubt, check the assembly. – Jack Aidley May 14 '13 at 10:59

4 Answers

24

I would like to say that "converting between types" is what we should be looking at, not whether there is a cast or not. For example

 int a = 10;
 float b = a; 

will be the same as :

 int a = 10;
 float b = (float)a;

This also applies to changing the size of a type, e.g.

 char c = 'a';
 int b = c; 

This will extend c from a single byte to the size of an int (using "byte" in the C sense, not the 8-bit sense), which may add an extra instruction (or extra clock cycles to the instruction used) beyond the data movement itself.
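As a minimal sketch of the two points above (the function names are made up for illustration): the implicit and the explicit int-to-float conversions request the same operation, so a compiler is expected to generate the same code for both, and widening a char to an int is a conversion even though no cast is written.

```cpp
// Implicit conversion: the compiler inserts the int -> float conversion itself.
float to_float_implicit(int a) { return a; }

// Explicit C-style cast: requests exactly the same conversion.
float to_float_explicit(int a) { return (float)a; }

// Widening char -> int: the value is sign- or zero-extended, depending on
// whether plain char is signed on the platform.
int widen(char c) { return c; }
```

Comparing the generated assembly of the first two functions (for example with the compiler's `-S` option) is a quick way to confirm that the cast itself adds nothing.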

Note that sometimes these conversions aren't at all obvious. On x86-64, a typical example is using int instead of unsigned int for indices in arrays. Since pointers are 64-bit, the index needs to be converted to 64-bit. In the case of an unsigned, that's trivial - just use the 64-bit version of the register the value is already in, since a 32-bit load operation will zero-fill the top part of the register. But if you have an int, it could be negative. So the compiler will have to use the "sign extend this to 64 bits" instruction. This is typically not an issue where the index is calculated based on a fixed loop and all values are positive, but if you call a function where it is not clear if the parameter is positive or negative, the compiler will definitely have to extend the value. Likewise if a function returns a value that is used as an index.
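The index case can be sketched as follows. The function names are hypothetical, and the exact instructions depend on the compiler and optimization level; on x86-64 the signed version typically needs a sign-extending move (such as `movsxd`), while the other two can often use the register as it is.

```cpp
#include <cstddef>

// Signed 32-bit index: may be negative, so it has to be sign-extended
// to 64 bits before it can be added to the pointer.
long load_signed(const long* data, int i) { return data[i]; }

// Unsigned 32-bit index: the upper half of the 64-bit register is already
// zero after a 32-bit operation, so usually no extra instruction is needed.
long load_unsigned(const long* data, unsigned i) { return data[i]; }

// Index that already has pointer width: no conversion at all.
long load_size_t(const long* data, std::size_t i) { return data[i]; }
```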

However, any reasonably competent compiler will not mindlessly add instructions to convert something from its own type to itself (with optimization turned off it possibly may, but even minimal optimization should see that "we're converting from type X to type X, that doesn't mean anything, let's take it away").

So, in short, the above example does not add any extra penalty, but there are certainly cases where converting data from one type to another does add extra instructions and/or clock cycles to the code.

Mats Petersson
  • 1
    it doesn't always matter for changing the size, as in a general purpose register there are aliases for parts of the register... of course it could... but even on the stack, because of alignments you can typically just grab a bigger version of an unsigned value. – Grady Player May 14 '13 at 16:09
  • 1
    There are two problems with making a value larger: 1. What IS in the "upper part", 2. What SHOULD be in the upper part. Unsigned values on x86-64 work, by design, because the upper part of the register is filled with zeros. But if the register is supposed to be sign extended, it requires filling with whatever the sign is, which means an extra instruction is needed. Of course, reading 64 bits from the stack probably won't work, as there is absolutely no guarantee that the upper 32 bits are actually the value you want. – Mats Petersson May 14 '13 at 16:14
15

It'll consume cycles where it alters the underlying representation. So it will consume cycles if you convert a float to an int or vice versa. Depending on architecture, casts such as int to char or long long to int may or may not consume cycles (but more often than not they will). Casting between pointer types will only consume cycles if there is multiple inheritance involved.
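To illustrate what "alters the underlying representation" means, here is a small sketch. It assumes a 32-bit IEEE 754 `float`, and the helper names are invented for this example. The float 2.0f and the int 2 have different bit patterns, so the conversion needs a real instruction (for example `cvttss2si` on x86-64 with SSE), whereas an in-range narrowing only keeps the low bits.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a float (assumes float is 32 bits wide).
std::uint32_t bits_of(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Value conversion: the bit pattern changes, so code has to be generated.
int to_int(float f) { return static_cast<int>(f); }

// Narrowing: keeps the low bits; often just the use of a smaller register.
int narrow(long long v) { return static_cast<int>(v); }
```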

Marco Bonelli
Jack Aidley
  • 2
    Just a minor comment. long long to int should not consume cycles only in cases where sizeof(long long)==sizeof(int). – Vishal May 14 '13 at 09:35
  • @Vishal This isn't actually true, because some architectures allow you to re-interpret a register as 32-bit or 64-bit as you use them. ARM is one such example. The cast does not usually need to be done separately. – Jack Aidley May 25 '19 at 09:26
10

There are different types of casts. C++ has different types of cast operators for the different types of casts. If we look at it in those terms, ...

static_cast will usually have a cost if you're converting from one type to another, especially if the target type is a different size than the source type. static_casts are sometimes used to cast a pointer from a derived type to a base type. This may also have a cost, especially if the derived class has multiple bases.
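A minimal sketch of the multiple-base case, using made-up types `A`, `B` and `D`: converting a `D*` to its second base typically means adding the offset of the `B` subobject to the address (plus a null check, since a null pointer must stay null), which is a small but real cost.

```cpp
struct A { int a = 1; };
struct B { int b = 2; };
struct D : A, B { int d = 3; };

// Derived -> second base: the compiler adjusts the address so that the
// result points at the B subobject inside the D object.
B* as_b(D* p) { return static_cast<B*>(p); }

// Base -> derived: the reverse adjustment; only valid if p really points
// to the B subobject of a D.
D* back_to_d(B* p) { return static_cast<D*>(p); }
```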

reinterpret_cast will usually not have a direct cost. Loosely speaking, this type of cast doesn't change the value, it just changes how it's interpreted. Note, however, that this may have an indirect cost. If you reinterpret a pointer to an array of bytes as a pointer to an int, then you may pay a cost each time you dereference that pointer unless the pointer is aligned as the platform expects.
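One common way to sidestep the alignment issue, shown here as an alternative technique rather than something from the answer itself, is to copy the bytes with `std::memcpy` instead of dereferencing a reinterpreted pointer. The helper name `read_u32` is hypothetical; compilers usually turn the copy into a single load where the platform allows it.

```cpp
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from an arbitrary, possibly unaligned, byte address.
// Dereferencing reinterpret_cast<const std::uint32_t*>(bytes) instead would
// be undefined behaviour if the address were misaligned.
std::uint32_t read_u32(const unsigned char* bytes) {
    std::uint32_t v;
    std::memcpy(&v, bytes, sizeof v);
    return v;
}
```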

const_cast should not cost anything if you're adding or removing constness, as it's mostly an annotation to the compiler. If you're using it to add or remove a volatile qualifier, then I suppose there may be a performance difference because it would enable or disable certain optimizations.

dynamic_cast, which is used to cast from a pointer to a base class to a pointer to a derived class, most certainly has a cost, as it must--at a minimum--check if the conversion is appropriate.
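The run-time check can be seen in a small sketch with invented types `Shape`, `Circle` and `Square`: the cast has to inspect the dynamic type of the object and yields a null pointer when the object is not of the requested type.

```cpp
struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// Returns the Circle if s really points to one, otherwise nullptr.
// The type check happens at run time, which is where the cost comes from.
const Circle* as_circle(const Shape* s) {
    return dynamic_cast<const Circle*>(s);
}
```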

When you use a traditional C cast, you're essentially just asking the compiler to choose the more specific type of cast. So to figure out if your C cast has a cost, you need to figure out what type of cast it really is.
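As a rough illustration (all function names are hypothetical), here are three C casts next to the C++ cast each one resolves to. Only the first pair is expected to generate a conversion instruction; the other two merely change the type the compiler sees.

```cpp
// (1) Acts like static_cast: a real value conversion, int -> double.
double as_double_c(int n)   { return (double)n; }
double as_double_cpp(int n) { return static_cast<double>(n); }

// (2) Acts like const_cast: same address, only the qualifier changes.
char* drop_const_c(const char* p)   { return (char*)p; }
char* drop_const_cpp(const char* p) { return const_cast<char*>(p); }

// (3) Acts like reinterpret_cast: same address, different pointee type.
const unsigned char* bytes_c(const int* p) {
    return (const unsigned char*)p;
}
const unsigned char* bytes_cpp(const int* p) {
    return reinterpret_cast<const unsigned char*>(p);
}
```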

Adrian McCarthy
  • 1
    Yeah, it really needs to be two questions: one for C code, where it is typically a distinction at the assembly level, and one for C++ code, which could end up executing initializers and template code and could be 1000 operations for a cast that looks simple enough and is automatic from the programmer's perspective. – Grady Player May 14 '13 at 16:11
6

Download and enjoy Agner Fog's manuals:
http://www.agner.org/optimize/
1. Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms
It is a huge PDF, but for a start you can check out:

14.7 Don't mix float and double
14.8 Conversions between floating point numbers and integers

NoSenseEtAl