Questions tagged [strength-reduction]

Replacing expensive operations with cheaper ones. Classic example: x = n * i in a loop becomes x += n.

Strength reduction is the practice of optimizing by replacing expensive operations with cheaper ones. (See https://en.wikipedia.org/wiki/Strength_reduction).

Classic example:

for (int i=0 ; i<1024 ; i++) {
    a[i] = b[i * 33];
}

can be transformed by a compiler, or by a human, to code that implements it as

int i33 = 0;
for (int i=0 ; i<1024 ; i++) {
    a[i] = b[i33];
    i33 += 33;
}

Multiply is still somewhat more expensive than add on modern computers, and compilers go to some effort to replace them with one or two shifts+add instructions. See for example How to multiply a register by 37 using only 2 consecutive leal instructions in x-86?

Another example would be optimizing n % 4 to n & 3 (for unsigned n, or to a more elaborate sequence for signed n), because hardware division instructions are still very expensive.

7 questions
17
votes
2 answers

How can I strength reduce division by 2^n + 1?

I need to perform some integer divisions in the hot path of my code. I've already determined via profiling and cycle counting that the integer divisions are costing me. I'm hoping there's something I can do to strength reduce the divisions into…
J.S.
  • 171
  • 4
9
votes
2 answers

Why is this program not optimized away?

Consider the following, simple program (adapted from this question): #include int main(int argc, char** argv) { int mul1[10] = { 4, 1, 8, 6, 3, 2, 5, 8, 6, 7 }; // sum = 50 int mul2[10] = { 4, 1, 8, 6, 7, 9, 5, 1, 2, 3 }; // sum =…
user703016
  • 37,307
  • 8
  • 87
  • 112
8
votes
1 answer

How to multiply a register by 37 using only 2 consecutive leal instructions in x86?

Say %edi contains x and I want to end up with 37*x using only 2 consecutive leal instructions, how would I go about this? For example to get 45x you would do leal (%edi, %edi, 8), %edi leal (%edi, %edi, 4), %eax (to be returned) I cannot for…
Newbie18
  • 171
  • 1
  • 3
  • 11
7
votes
5 answers

add vs mul (IA32-Assembly)

I know that add is faster as compared to mul function. I want to know how to go about using add instead of mul in the following code in order to make it more efficient. Sample code: mov eax, [ebp + 8] #eax = x1 …
Pavitar
  • 4,282
  • 10
  • 50
  • 82
4
votes
3 answers

Strength reduction on floating point division by hand

In one of our last assignments in Computer Science this term we have to apply strength reduction on some code fragments. Most of them were just straight forward, especially with looking into compiler output. But one of them I wont be able to solve,…
Zagatho
  • 523
  • 1
  • 6
  • 22
3
votes
1 answer

Are indexes easier to vectorize than pointers?

Is there any example (e.g. on https://godbolt.org/ ) where CLang generates worse code when an algorithm expressed by pointer iterations instead of array indexes? E.g. it can vectorize/unfold in one case but can't in the other one? In simple examples…
nponeccop
  • 13,527
  • 1
  • 44
  • 106
1
vote
2 answers

Reduce this loop to an equation

This function (written in C for convenience, but this is not important to the question) determines the size of an array. I'm sure it can be converted to an if-else chain or even to an equation, but I am not clever enough to see how. (I tried to…
zwol
  • 135,547
  • 38
  • 252
  • 361