While working with large arrays, I am doing unsafe pointer computations like the following:

*c++ = *a++ - *b++; 

It works as expected. But for in-place operations, I need the c pointer on the right side as well:

[STAThread]
unsafe static void Main(string[] args) {

    double[] arr = new double[] { 2, 4, 6, 8, 10 };
    double scalar = 1;
    fixed (double* arrP = arr) {
        double* end = arrP + arr.Length;
        double* p = arrP;
        double* p2 = arrP; 
        while (p < end) {
            // gives: 3, 5, 7, 9, 2.4827634676971E+209 (the last value is read from past the end of the array)
            *p++ = *p - scalar;

            // gives correct result: 1,3,5,7,9
            //*p = *p - scalar;
            //p++;
        }
    }
    Console.WriteLine(String.Join<double>(",", arr));
    Console.ReadKey(); 
}

The pointer gets incremented before the dereference happens. This is correct according to precedence rules (++ before *). But now the new value is written to the incremented address, not to the original. Why is this so?

I found this SO question: "dereference and advance pointer in one statement?". But it only handles the *c++ expression on the right side. Why would the write access be different from the read access?

Also, a link to the precedence rules for pointer types in the C# spec would be highly appreciated. I couldn't find them so far.

@EDIT: Please note that we are talking about C# here, not C or C++, even though I expect the difference not to be too big. Also, as in the example above, I know the problem can be avoided by incrementing the pointer on the next line of code. I want to know why the behaviour is as described anyway.

user492238
  • Why don't you just write it out in 2 lines rather than 1? – David Heffernan Apr 04 '11 at 12:11
  • Since it's not obvious or easy to read, why not break it down into multiple statements? `*p = *p - scalar; p++;` or `*p -= scalar; p++` look fine to me? – Kieren Johnstone Apr 04 '11 at 12:12
  • It gives a strange result because it is a strange statement. Seems right. – Kobi Apr 04 '11 at 12:13
  • Breaking it into several lines would only be a last resort. The code is the output of a larger code generating framework. Changes are costly (but of course can be done). – user492238 Apr 04 '11 at 12:16
  • @user492238 - "Changes are costly" is a pathetic excuse. IMO, the system is poorly designed. Besides, you're using a garbage-collected language and worrying about performance? – TheCloudlessSky Apr 04 '11 at 14:50
  • @TheCloudlessSky Yes, I pay attention to costs. And yes, I care about performance :) – user492238 Apr 04 '11 at 14:59
  • Why does writing it out correctly affect performance? If you think that writing code on one line rather than two means it runs faster then your mental model of how compilers work is profoundly broken. – David Heffernan Apr 04 '11 at 15:27
  • :) Funny how rumors arise. I care about performance. This is why I am using pointers at all. But the compiler of course generates the same number of instructions regardless of whether it was written separately or as a compound statement. – user492238 Apr 04 '11 at 15:40
  • What nobody has bothered to mention is that you can fix this instance of the problem with either `*p = *p++ - scalar;` or `*p++ = *p2++ - scalar;` -- the first works because of the C# order-of-side-effects rules, and the second just utilizes the `p2` pointer that your code generating framework seems to set up. (Unless of course the latter is used for something else that you elided). Does this help? – LHMathies Apr 04 '11 at 15:48
  • @LHMathies helps. Thanks. *p = *p++ - s is better, since the other variant would bring 2 increments - and therefore again diminish performance. – user492238 Apr 04 '11 at 15:57
  • @TheCloudlessSky Some people using garbage collected languages care about performance because they're writing performance oriented code. Like games. And before you ask, yes, it works. ;) – jasonh Apr 04 '11 at 21:35

1 Answer

The key is this sentence right here:

The pointer gets incremented before the dereference happens. This is correct according to precedence rules (++ before *). But now the new value is written to the incremented address, not to the original. Why is this so?

The implication is that you believe that precedence and order of side effects are related. They are not. Side effects happen in order left to right, period, end of story. If you have

A().x = B() + C() * D();

then the multiplication happens before the addition, because multiplication is higher precedence. And the addition happens before the assignment because addition is higher precedence. The side effects of A(), B(), C() and D() happen in left-to-right order irrespective of precedence of the operators. Order of execution is unrelated to precedence. (Side effects may be observed to happen in a different order if you are observing from another thread due to processor cache issues, but side effects in one thread are always observed in left-to-right order.)
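To make the left-to-right rule concrete, here is a minimal sketch of that statement in safe C#. The `Box` class, the logging list and the return values 1, 2 and 3 are hypothetical, chosen only so the call order can be recorded; they are not from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder type so that A().x is a valid assignment target.
class Box { public int x; }

static class EvaluationOrderDemo
{
    static readonly List<string> Log = new List<string>();
    static readonly Box TheBox = new Box();

    // Each method records its own call (the side effect) and returns a value.
    static Box A() { Log.Add("A"); return TheBox; }
    static int B() { Log.Add("B"); return 1; }
    static int C() { Log.Add("C"); return 2; }
    static int D() { Log.Add("D"); return 3; }

    public static string Run()
    {
        Log.Clear();
        // Precedence decides the grouping: B() + (C() * D()).
        // The calls themselves still run left to right: A, B, C, D.
        A().x = B() + C() * D();
        return string.Join(",", Log) + " -> x = " + TheBox.x;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "A,B,C,D -> x = 7"
    }
}
```

Precedence shows up only in the value (1 + 2 * 3 = 7); the log shows the operands being evaluated in textual order.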

In your example the p++ is to the left of the right-hand side *p, and therefore the side effect of the p++ happens before the observation of the side effect on the right. More specifically, the operation of the assignment operator to a variable is:

  • evaluate the address of the variable on the left
  • evaluate the right hand side, converting it to the type of the variable if necessary
  • store the value in the variable
  • the result is the value that was stored

The first step -- evaluate the address of the variable on the left -- is what does the ++.
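The steps above can be sketched with an array index standing in for the pointer, which keeps the example in safe code. This is an analogue rather than the asker's exact program: the class and method names are hypothetical, the scalar is hard-coded as 1, and a sixth sentinel element (0) is added so that the "read one element ahead" variant stays inside the array instead of reading garbage.

```csharp
using System;

static class AssignmentStepsDemo
{
    // Five real values plus a sentinel 0, so reading one past the data is in bounds.
    static double[] Fresh() { return new double[] { 2, 4, 6, 8, 10, 0 }; }

    // Analogue of "*p++ = *p - scalar": the ++ is part of the left operand.
    public static double[] IncrementOnLeft()
    {
        double[] data = Fresh();
        int i = 0;
        while (i < 5) { data[i++] = data[i] - 1; }
        return data;
    }

    // Analogue of "*p = *p++ - scalar": the ++ happens while evaluating the right side.
    public static double[] IncrementOnRight()
    {
        double[] data = Fresh();
        int i = 0;
        while (i < 5) { data[i] = data[i++] - 1; }
        return data;
    }

    // IncrementOnLeft written out as the individual assignment steps.
    public static double[] SpelledOut()
    {
        double[] data = Fresh();
        int i = 0;
        while (i < 5)
        {
            int target = i;             // step 1: determine the location on the left...
            i = i + 1;                  // ...which is also where the ++ side effect happens
            double value = data[i] - 1; // step 2: evaluate the right side; it sees the new i
            data[target] = value;       // step 3: store into the location found in step 1
        }
        return data;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", IncrementOnLeft()));  // 3,5,7,9,-1,0
        Console.WriteLine(string.Join(",", IncrementOnRight())); // 1,3,5,7,9,0
        Console.WriteLine(string.Join(",", SpelledOut()));       // 3,5,7,9,-1,0
    }
}
```

In the first variant the store still goes to the original location; it is the read on the right that sees the already-advanced position, which matches the 3, 5, 7, 9 pattern reported in the question.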

This is clearly defined in the C# spec; see the section on operator precedence and order of execution for details.

If this subject interests you, see my numerous articles on the details of the difference between precedence, associativity and order:

http://blogs.msdn.com/b/ericlippert/archive/tags/precedence/

If you don't understand how ++ works -- and almost no one does, unfortunately -- see this question:

What is the difference between i++ and ++i?

If you don't understand how assignment works -- and it has been surprising to me to learn that almost no one does understand how assignment works -- see:

http://blogs.msdn.com/b/ericlippert/archive/tags/simple+assignment/

http://blogs.msdn.com/b/ericlippert/archive/tags/compound+assignment/

The other answers are pointing out that in the C and C++ programming languages, the language specifications do not specify in what order side effects appear to happen if the side effect and its observation are within the same "sequence point", as they are here. In C it is permissible for the side effect of the ++ to happen at any time before the end of the statement. After the assignment, before the assignment, whenever, at the discretion of the compiler. C# does not permit that sort of latitude. In C#, a side effect to the left is observed to have happened by the time that code to the right executes.

Also, a link to the precedence rules for pointer types in the C# spec would be highly appreciated. Couldn't find them so far.

The spec section you want is 18.5, which states:

The precedence and associativity of the unsafe operators is implied by the grammar.

So read the grammar and work it out. Start by reading the grammar in Appendix B, section 3.

Eric Lippert
  • 'is implied by the grammar': I had only found that sentence before and was hoping for something more specific. Anyway, helpful post. Thanks! – user492238 Apr 04 '11 at 16:01