21

While I was reading tips in C, I have seen this tip here http://www.cprogramming.com/tips/tip/multiply-rather-than-divide but I am not sure. I was told both multiply and divide are slower and time consuming and requires many cycles.

and I have seen people often use i << 2 instead of i x 4 since shifting is faster.

Is it a good tip using x0.5 or /2 ? or however modern compilers do optimize it in a better way?

halfer
  • 19,824
  • 17
  • 99
  • 186
niko
  • 9,285
  • 27
  • 84
  • 131
  • 12
    `i * 0.5` is not always the same as `i / 2` (if `i` is an `int`, the division is integral and not floating point) – SheetJS Aug 10 '13 at 18:28
  • 7
    You should write your code clearly (`i * 4` and `i / 2`) first and turn on compiler optimizations. Compilers are very good at these optimizations. – Blastfurnace Aug 10 '13 at 18:34
  • 1
    If profiling shows a performance bottleneck and you've selected better algorithms and data structures, only then start trying and testing these micro-optimizations. It's a more effective use of your time. – Blastfurnace Aug 10 '13 at 18:39
  • 2
    One place where such optimizations aren't applied is when doing integer div/mod/mul by a non-constant. There is still some value in teaching these hand-optimizations. FP math will often not have these optimizations performed either, unless unsafe optimizations are turned on with the compiler. – Cory Nelson Aug 10 '13 at 18:39

5 Answers5

27

It's true that some (if not most) processors can multiply faster than performing a division operation, but, it's like the myth of ++i being faster than i++ in a for loop. Yes, it once was, but nowadays, compilers are smart enough to optimize all those things for you, so you should not care about this anymore.

And about bit-shifting, it once was faster to shift << 2 than to multiply by 4, but those days are over as most processors can multiply in one clock cycle, just like a shift operation.

A great example of this was the calculation of the pixel address in VGA 320x240 mode. They all did this:

address = x + (y << 8) + (y << 6)

to multiply y with 320. On modern processors, this can be slower than just doing:

address = x + y * 320;

So, just write what you think and the compiler will do the rest :)

Mark A. Donohoe
  • 28,442
  • 25
  • 137
  • 286
huysentruitw
  • 27,376
  • 9
  • 90
  • 133
  • 1
    No processor I know of multiplies in 1 cycle (source: http://instlatx64.atw.hu/ ). (however, multiplication by 4 will probably be optimized to a shift by the compiler anyway) – harold Aug 10 '13 at 18:36
  • @harold: The latency may be > 1 cycle, but the throughput should be 1 cycle/mult on most modern CPUs... – Oliver Charlesworth Aug 10 '13 at 18:39
  • @OliCharlesworth yes, and shifts (by a constant) usually have a throughput of two shifts / cycle – harold Aug 10 '13 at 18:42
  • 1
    The myth is true: ++i is faster than i++, at least when i is a complex object. Quoting cppreference,.com: "Because a temporary copy of the object is constructed during post-increment and post-decrement, pre-increment or pre-decrement operators are usually more efficient in contexts where the returned value is not used". – pixelgrease Aug 16 '17 at 22:12
  • 1
    I want to see a recent compiler that isn 't smart enough to optimize this after it detects the temporary value is never used. – huysentruitw Aug 16 '17 at 23:01
  • @Housy In C++, with custom `operator++`, that optimization is not correct. In C, sure. – Davislor Sep 04 '18 at 06:57
  • 1
    @pixelgrease Although this is a C question. – Davislor Sep 04 '18 at 07:05
  • Is this true? Unity discusses bit-shifting in C# as an optimization in their docs, so it seems odd that this would always be the case. – Mike Pandolfini Oct 13 '22 at 20:03
  • @MikePandolfini can you provide a link to those docs? – huysentruitw Oct 14 '22 at 18:03
16

I find that this service is invaluable for testing this sort of stuff:

http://gcc.godbolt.org/

Just look at the final assembly. 99% of the time, you will see that the compiler optimises it all to the same code anyway. Don't waste the brain power!

In some cases, it is better to write it explicitly. For example, 2^n (where n is a positive integer) could be written as (int) pow( 2.0, n ) but it is obviously better to use 1<<n (and the compiler won't make that optimisation for you). So it can be worth keeping these things in the back of your mind. As with anything though, don't optimise prematurely.

Dave
  • 44,275
  • 12
  • 65
  • 105
2
  1. "multiply by 0.5 rather than divide by 2" (2.0) is faster on fewer environments these days than before, primarily due to improved compilers that will optimize the code.

  2. "use i << 2 instead of i x 4" is faster in fewer environments for similar reasons.

  3. In select cases, the programmer still needs to attend to such issues, but it is increasingly rare. Code maintenance continues to grow as a dominate issue. So use what makes the most sense for that code snippet: x*0.5, x/2.0, half(x), etc.

  4. Compilers readily optimize code. Recommend you code with high level issues in mind. E. g. Is the algorithm O(n) or O(n*n)?

  5. The important thought to pass on is that best code design practices evolve and variations occur amongst environments. Be adaptable. What is best today may shift (or multiply) in the future.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
1

Many CPUs can perform multiplication in 1 or 2 clock cycles but division always takes longer (although FP division is sometimes faster than integer division).

If you look at this answer How can I compare the performance of log() and fp division in C++? you will see that division can exceed 24 cycles.

Why does division take so much longer than multiplication? If you remember back to grade school, you may recall that multiplication can essentially be performed with many simultaneous additions. Division requires iterative subtraction that cannot be performed simultaneously so it takes longer. In fact, some FP units speed up division by performing a reciprocal approximation and multiplying by that. It isn't quite as accurate but is somewhat faster.

Community
  • 1
  • 1
pukingminion
  • 117
  • 10
1

If you are working with integers, and you expect to get an integer as result, it's better to use / 2, this way avoids unnecesary conversions to/from float

Miguel Prz
  • 13,718
  • 29
  • 42
  • 1
    If you are working with integers and expect an integer as result, `>> 1` is superior to `/ 2`. The compiler cannot replace the latter with the former because they aren't equivalent for negative numbers (except when the result is an integer, knowledge that the programmer has but the compiler may not be able to infer) – Pascal Cuoq Aug 10 '13 at 18:38
  • @PascalCuoq that's true, but the difference disappears if you are working with `unsigned int`s, for obvious reasons. And if you're working with signed `int`s, you probably don't want incorrect behaviour for negative numbers. – Dave Aug 10 '13 at 18:41
  • It's not incorrect if you “expect to get an integer as a result”. – Pascal Cuoq Aug 10 '13 at 18:48
  • @PascalCuoq ah I see what you mean (I didn't follow at first because, of course, you always get an integer as a result!) – Dave Aug 10 '13 at 20:55