7

The first version does an optimisation by moving a value from memory to a local variable. The second version does not.

I was expecting the compiler might choose to do the localValue optimisation here anyway and not read and write the value from memory for each iteration of the loop. Why doesn't it?

class Example
{
    public:
        void processSamples(float * x, int num) 
        {
            float localValue = v1;

            for (int i = 0; i < num; ++i)
            {
                x[i] = x[i] + localValue;
                localValue = 0.5 * x[i];
            }

            v1 = localValue;
        }

        void processSamples2(float * x, int num)
        {

            for (int i = 0; i < num; ++i)
            {
                x[i] = x[i] + v1;
                v1 = 0.5 * x[i];
            }

        }

    float v1;
};

processSamples assembles to code like this:

.L4:
  addss xmm0, DWORD PTR [rax]
  movss DWORD PTR [rax], xmm0
  mulss xmm0, xmm1
  add rax, 4
  cmp rax, rcx
  jne .L4

processSamples2 to this:

.L5:
  movss xmm0, DWORD PTR [rax]
  addss xmm0, DWORD PTR example[rip]
  movss DWORD PTR [rax], xmm0
  mulss xmm0, xmm1
  movss DWORD PTR example[rip], xmm0
  add rax, 4
  cmp rax, rdx
  jne .L5

As the compiler doesn't have to worry about threads (v1 isn't atomic). Can't it just assume nothing else will be looking at this value and go ahead and keep it in a register while the loop is spinning?

See https://godbolt.org/g/RiF3B4 for the full assembly and a selection of compilers to choose from!

JCx
  • 2,689
  • 22
  • 32
  • The example you have linked has a further problem. `v1` is uninitialised when first used. This is UB and will cause gcc and clang to do wacky things at optimisation time. – Richard Hodges Oct 16 '17 at 13:18
  • Oh yeah - fair point - it's not real code though I just wanted to show the problem i was interested in and the starting point for v1 didn't really matter. The actual code is a tiny bit more refined. – JCx Oct 16 '17 at 14:18
  • interesting and cautionary reading: https://blog.regehr.org/archives/759 – Richard Hodges Oct 16 '17 at 15:17

1 Answers1

11

Because of aliasing: v1 is a member variable, and it could be that x points at it. Thus, one of the writes to the elements of x might change v1.

In C99, you can use the restrict keyword on a function argument of pointer type to inform the compiler that it doesn't alias anything else that is in the scope of the function. Some C++ compilers also support it, although it is not standard. (Copied from one of my comments.)

Aasmund Eldhuset
  • 37,289
  • 4
  • 68
  • 81
  • Wow - ok - so is there some way of telling it that isn't going to be the case? – JCx Oct 16 '17 at 12:48
  • 2
    @JCx: In C99, you can use the `restrict` keyword on a function argument of pointer type to inform the compiler that it doesn't alias anything else that is in the scope of the function. [Some C++ compilers also support it](https://stackoverflow.com/questions/776283/what-does-the-restrict-keyword-mean-in-c#1965502), although it is not standard. – Aasmund Eldhuset Oct 16 '17 at 12:51
  • 2
    Okay - tested with clang. Restrict sorted it out. Nice :) – JCx Oct 16 '17 at 12:53
  • 1
    @AasmundEldhuset afaik they are considered evil for different reasons and I dont see how a smart pointer would help here – 463035818_is_not_an_ai Oct 16 '17 at 12:55
  • 1
    I am ridiculously overexcited about this working ... it's going to make our filter code pretty tidy. – JCx Oct 16 '17 at 12:57
  • Yeah, we are all over smart pointers in our UI code, but not the DSP code where they are just dangerous :) And I think it'd just obfuscate the problem further here...as if things aren't subtle enough already. – JCx Oct 16 '17 at 12:57
  • 1
    it seems that creating a temporary v1 on the stack for re-assignment to v1 at the end of the loop also resolves the aliasing... but it's not clear to me why the compiler can assume that x isn't pointing to the stack? – mark Oct 16 '17 at 13:00
  • 2
    @mark - A parameter cannot point to a local variable, because the local didn't exist when the function was called. Compilers are really smart! – Bo Persson Oct 16 '17 at 13:11
  • That's a good point, although it could certainly technically happen (pointer running into newly-used stack space), it's unlikely to be a realistic scenario. That being said, the local variable copy might be the more portable approach than restrict... – mark Oct 16 '17 at 13:26
  • @mark: That would be undefined behavior, and the compiler would be allowed to do anything it chooses (in other words, the compiler is allowed to assume that it won't happen). I agree on portability, though. – Aasmund Eldhuset Oct 16 '17 at 14:28
  • @tobi303: I was thinking more along the lines of using `vector` (where I assume the library authors have pulled all available tricks to ensure its performance), but that might of course not be possible in the OP's situation. I agree about smart pointers not solving aliasing. – Aasmund Eldhuset Oct 16 '17 at 14:29
  • @AasmundEldhuset if you pass the vector as reference you still dont know if it is aliasing the member. Imho the evilness of pointers really is orthogonal to the issue of aliasing. – 463035818_is_not_an_ai Oct 16 '17 at 14:32
  • @tobi303: The vector itself cannot be aliasing a float, and I was guessing that the vector's internal float array has been marked as not an alias by whichever means the compiler offers its library authors, and that you might therefore with some probability get a portable aliasing protection. But you're right that this is not the primary evilness of pointers. Would you like me to delete my comment on that, or keep it so that your response to it keeps making sense? – Aasmund Eldhuset Oct 16 '17 at 14:36
  • @AasmundEldhuset if OP changes parameter to be a vector then they should do the same to the member. Of course if it is different types there is no aliasing. The internal array cannot be marked to be protected from aliasing, because it can be aliased in general e.g. `void foo(vector& a,vector& b); foo(a,a);`. Anyhow I think in fact we dont disagree. – 463035818_is_not_an_ai Oct 16 '17 at 14:44
  • @tobi303: Ah, you're right! I withdraw my suggestion to use vector. :-) – Aasmund Eldhuset Oct 16 '17 at 18:55
  • @AasmundEldhuset I am always pro suggesting vector, I am just a bit allergic to calling stuff "evil" (even for pointers) and here it was really misplaced – 463035818_is_not_an_ai Oct 16 '17 at 18:59
  • @tobi303: I deleted my comment. – Aasmund Eldhuset Oct 17 '17 at 11:49