Replace right shift by multiplication

Question

I know that it is possible to use the left shift to implement multiplication by the power of two (x << 4 = x * 16). Also, it is trivial to replace the right shift by division by a power of two (x >> 5 = x / 32).

I am wondering is it possible to replace the right shift with multiplication? It seems to be not possible in the general case, but my question is limited to modulo 2^32 and 2^64 arithmetic (unsigned 32-bit and 64-bit values). Also, maybe it can be done if we can add other cheap instructions like + and - in addition to * to emulate the right bit shift?

I assume exotic architecture where the right shift is more expensive than other arithmetic (similar to division).

uint64_t foo(uint64_t x) {
    return x >> 3; // how to avoid using right shift here?
}

There is a similar question How to perform right shifting binary multiplication? that asks how to replace multiplication of two unsigned numbers by right shift. Basically, it uses a loop internally. However, maybe if the second number is a constant, this loop can be avoided (or at least unrolled to a shorter fragment)?

I'm not sure I understand what you are asking. You've already acknowledged that shifting right by `n` is equivalent to dividing by `2^n`, so it's equivalent to multiplying by `(1/2)^n`. What is unclear to you? — mkrieger1, Jul 29 '21 at 09:31
@mkrieger1 I added a code fragment that should explain it better. Basically, I operate on unsigned numbers and can't represent factors like (1/2) there. — Curious, Jul 29 '21 at 09:36
What's the problem with shifting right if that is what you want to do? I don't understand how this is an issue. — mkrieger1, Jul 29 '21 at 09:37
@mkrieger1 Yes, integer multiplication takes less. I understand your doubt because it is not true for intel x86, but what if we consider such exotic architecture where this is true? — Curious, Jul 29 '21 at 09:40
For example, which architecture? Even if it were the case, you should leave it to the compiler to make such optimizations and use the most appropriate CPU instructions. — mkrieger1, Jul 29 '21 at 09:40
I don't mean any particular widespread architecture. I mean that theoretically, you can design it for solving specific tasks (like ASICs and FPGAs). — Curious, Jul 29 '21 at 09:43
Shifting is a very light operation. I don't think you can find a cheaper one by mixing aritmetic. Also note that some people are suggesting multipling by `float`s that's just more CPU consuming and you'll mix your types. It is an interesting thought , but I'm pretty sure you won't find anything better than shifting or dividing by 2^n — Ivan, Jul 29 '21 at 09:44
Seconding Ivan. People (and optimizing compilers...) **replace** multiplication with left-shifts because shifting is cheaper (by an order of magnitude, and not only on x86). I can't picture any architecture where some "magic" multiplication "plus something else to get it right" can ever beat the right shift performance-wise. — DevSolar, Jul 29 '21 at 11:02
Multiplying is an operation that includes repeated shifting and addition. E.g. a*1101 is (a<<3 + a<<2 + a). There will be no architecture on which that is faster than a single shift without addition. — MSalters, Jul 29 '21 at 11:28

harold · Accepted Answer · 2021-07-29T11:38:23.000

"Multiply-high" aka high-mul, hmul, mulh, etc, can be used to emulate a shift-right with a constant count. Usually that's not a good trade. It's also hardly related to C++.

Normal multiplication (putting floating point stuff aside) cannot be used to implement a shift-right.

my question is limited to modulo 2^32 and 2^64 arithmetic

It doesn't help. You can use that property to "unmultiply" (sort of like divide, except not really) by odd numbers, for example if b = 5 * a then a = b * 0xCCCCCCCD, using the modular multiplicative inverse. The number being inverted must be relatively-prime relative to the modulus. Since the modulus is a power of two, the "divisor" here cannot be a power of two (except 1, but that does nothing), so a shift-right cannot be done this way.

Another way to look at it (probably simpler), is that what a multiplication does is conditionally add together a bunch of left-shifted versions of the multiplicand. Only left-shift versions, not right-shifted versions. Which of those shifted versions are selected by the multiplier doesn't matter, there are no right-shifted versions to select.

Nice! I didn't think about using multiply-high. It solves my problem for 32-bit unsigned numbers when 64-bits of multiplication are available. — Curious, Jul 29 '21 at 11:33

Replace right shift by multiplication

1 Answers1