21

At the GoingNative event, during the Interactive Panel on Day2 at the 9 minute mark, Chandler Carruth says:

Pointers create aliasing problems. They slow down your binaries not speed them up.

What does this mean? Can this be illustrated with a (simple) example?

StackedCrooked
  • 34,653
  • 44
  • 154
  • 278

4 Answers4

31

Aliasing affects performance by preventing the compiler from doing certain optimizations. For example:

void foo(int *array,int *size,int *value) {
    for(int i=0;i<*size;++i) {
        array[i] = 2 * *value;
    }
}

Looking at this code you might expect that the compiler could load *value once outside the loop and then set every element in the array to that value very quickly. But this isn't the case due to aliasing. Because *value could be an alias for an element of the array it could change on any given iteration. Therefore the code has to load the value every single iteration, resulting in a potentially large slowdown.

If the variables could not alias then the above code would be equivalent to the following:

void foo(int *array,int size,int value) {
    for(int i=0;i<size;++i) {
        array[i] = 2 * value;
    }
}

Using LLVM's online demo to get the generated code, here are the different results:

1) With aliasing

foo:                                    # @foo
    .cfi_startproc
# BB#0:
    cmpl    $0, (%rsi)
    jle .LBB0_3
# BB#1:
    xorl    %eax, %eax
    .align  16, 0x90
.LBB0_2:                                # %.lr.ph
                                        # =>This Inner Loop Header: Depth=1
    movl    (%rdx), %ecx
    addl    %ecx, %ecx
    movl    %ecx, (%rdi,%rax,4)
    incq    %rax
    cmpl    (%rsi), %eax
    jl  .LBB0_2
.LBB0_3:                                # %._crit_edge
    ret
    .size   foo, .Ltmp1-foo
    .cfi_endproc
.Leh_func_end0:

2) Without aliasing

foo:                                    # @foo
    .cfi_startproc
# BB#0:
    testl   %esi, %esi
    jle .LBB0_3
# BB#1:                                 # %.lr.ph
    addl    %edx, %edx
    .align  16, 0x90
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
    movl    %edx, (%rdi)
    addq    $4, %rdi
    decl    %esi
    jne .LBB0_2
.LBB0_3:                                # %._crit_edge
    ret
    .size   foo, .Ltmp1-foo
    .cfi_endproc
.Leh_func_end0:

You can see that the version with aliasing has to do more work in the loop body (between the labels LBB0_2 and LBB0_3).

bames53
  • 86,085
  • 15
  • 179
  • 244
  • 1
    would const keyword helps? for example, `foo(..., const int * value)` instead, which implies that the content of `*value` would not change. – bumfo Nov 01 '16 at 08:53
  • 5
    @bumfo No, I wouldn't expect that to improve anything in this case, because `const` doesn't actually imply that the value won't be changed. All it says is that the function won't change it through that pointer. (Although technically `const` doesn't even mean that much.) – bames53 Nov 02 '16 at 03:24
12

The type of problem Chandler was talking about can be easily illustrated with a simplified strcpy:

char *stpcpy (char * dest, const char * src);

When writing an implementation of this, you might assume that the memory pointed to by dest is completely separate from the memory pointed to by src. The compiler) might want to optimize it by reading a block of characters from the string pointed to by src, and writing all of them at once into dest. But if dest pointed to one byte ahead of src, the behaviour of this would differ from a simple character-by-character copy.

Here the aliasing problem is that src can alias dest, and the generated code must be made less efficient than it could be if src wasn't allowed to alias dest.

The real strcpy uses an extra keyword, Restrict (which is technically only part of C, not C++, that tells the compiler to assume that src and dest do not overlap, and this allows the compiler to generate much more efficient code.


Here's an even simpler example where we can see a big difference in the assembly:

void my_function_1(int* a, int* b, int* c) {
    if (*a) *b = *a;
    if (*a) *c = *a;
}

void my_function_2(int* __restrict a, int* __restrict b, int* __restrict c) {
    if (*a) *b = *a;
    if (*a) *c = *a;
}

Assume that this is a simplification of a function where it actually made sense to use two if-statements rather than just if (*a) { *b=*a; *c=*a; }, but the intent is the same.

We may assume when writing this that a != b because there's some reason why it would make no sense for my_function be used like that. But the compiler can't assume that, and does a store of b and a re-load of a from memory before executing the second line, to cover the case where b == a:

0000000000400550 <my_function_1>:
  400550:       8b 07                   mov    (%rdi),%eax
  400552:       85 c0                   test   %eax,%eax                 <= if (*a)
  400554:       74 0a                   je     400560 <my_function_1+0x10>
  400556:       89 06                   mov    %eax,(%rsi)
  400558:       8b 07                   mov    (%rdi),%eax
  40055a:       85 c0                   test   %eax,%eax                 <= if (*a)
  40055c:       74 02                   je     400560 <my_function_1+0x10>
  40055e:       89 02                   mov    %eax,(%rdx)
  400560:       f3 c3                   repz retq

If we remove potential for aliasing by adding __restrict, the compiler generates shorter and faster code:

0000000000400570 <my_function_2>:
  400570:       8b 07                   mov    (%rdi),%eax
  400572:       85 c0                   test   %eax,%eax
  400574:       74 04                   je     40057a <_Z9my_function_2PiS_S_+0xa>
  400576:       89 06                   mov    %eax,(%rsi)
  400578:       89 02                   mov    %eax,(%rdx)
  40057a:       f3 c3                   repz retq
Community
  • 1
  • 1
je4d
  • 7,628
  • 32
  • 46
4

Consider the following function:

void f(float* lhs, float* rhs, float* out, int size) {
    for(int i = 0; i < size; i++) {
        out[i] = *lhs + *rhs;
    }
}

What's the fastest version of this function? Probably, you hoist *lhs + *rhs out of the loop. The problem is what happens when the pointers alias. Imagine what that optimization does if I call it like this:

float arr[6] = { ... };
f(arr, arr + 1, arr, 6);

As you can see, the problem is that *lhs + *rhs cannot be hoisted out of the loop, because out[i] modifies their values. In fact, the compiler can't hoist any logic out of the loop. So the compiler cannot perform the "obvious" optimization, because if the parameters alias the logic is now incorrect. However, if the floats are taken by value, then the compiler knows they can't alias and can perform the hoist.

Of course, this function is pretty silly, but it demonstrates the point.

Puppy
  • 144,682
  • 38
  • 256
  • 465
1

a pointer is a value that represents a memory address sometimes 2 pointers can represent the same memory address thats what aliasing is

int * p;
*p = 5;

int * alias;
alias = p;

the variable alias is an alias of p and *alias is equal to 5 if you change *alias then *p changes along with it

  • 1
    how does this slows down the binary? – unexplored Mar 14 '12 at 20:05
  • @unexplored Indirection slows down binaries because you don't know what the actual data is until you fetch where the data is. To get the data you need to fetch the address of the data and then you can finally fetch the data which is memory requests (unless cached). – Jesus Ramos Mar 14 '12 at 20:07
  • @JesusRamos correct me if I am wrong, but wouldn't it be the same without indirection? I mean when dereferencing `p` in this case, the data is not known until is fetched either. – unexplored Mar 14 '12 at 20:15
  • I think it has something to do with the compiler optimizer. – Sam I am says Reinstate Monica Mar 14 '12 at 20:18
  • @unexplored Pointers don't allow for optimizations that could normally happen in the case. If you have two pointers to the same thing the compiler will not try to follow them because the results of modification are undefined. – Jesus Ramos Mar 14 '12 at 20:18
  • @JesusRamos Could you clarify what you mean? Under which conditions is it "undefined" to "have two pointers to the same thing"? Aren't explicit allowances made for this being legal, albeit pessimising, so long as the pointers are of compatible types, and illegal if they are not so? I think that's probably what you were saying, but it's not clear. – underscore_d May 30 '17 at 17:54