12

It used to be the case that pre-increment would be preferred because an overloaded post-increment on a class necessitated the return of a temporary copy that represented the state of the object before the increment.

It seems this is no longer a serious concern (so long as inlining is in place), since even my old C++ compiler (GCC 4.4.7) optimizes the following two functions down to identical code:

class Int {
    //...
public:
    Int (int x = 0);
    Int & operator ++ ();
    Int operator ++ (int) {
        Int x(*this);
        ++*this;
        return x;
    }
};

Int & test_pre (Int &a) {
    ++a;
    return a;
}

Int & test_post (Int &a) {
    a++;
    return a;
}

The resulting assembly for both functions is:

    .cfi_startproc
    .cfi_personality 0x3,__gxx_personality_v0
    pushq   %rbx
    .cfi_def_cfa_offset 16
    .cfi_offset 3, -16
    movq    %rdi, %rbx
    call    _ZN3IntppEv
    movq    %rbx, %rax
    popq    %rbx
    .cfi_def_cfa_offset 8
    ret
    .cfi_endproc

If nothing is inlined, however, there seems to still be a benefit to preferring pre-increment to post-increment, since test_post is forced to call out into operator++(int).

Let's assume operator++(int) is inlined as an idiomatic copy constructor, call to the pre-increment, and return of the copy, as illustrated above. If the copy constructor is inlined or the default copy constructor implementation, is that sufficient information for the compiler to optimize post-increment so that test_pre and test_post become identical functions? If not, what other information is required?

jxh
  • 5
    Your Int (with capital I) might be a complicated type with an expensive post-increment; using pre-increment is a matter of good habit and style. –  May 04 '15 at 18:02
  • 6
    Non-trivial types are usually faster on pre-increment, and it's best practice to be *consistent*, so pre-increment should also be used in preference for built-in types IMO. – Galik May 04 '15 at 18:21
  • In ye olden days of weak optimization, predecrement and postincrement were preferred. They would map directly to the autoincrement and autodecrement addressing modes of the PDP-11 and VAX. – user3344003 May 04 '15 at 21:21
  • @user3344003 I would think it had more to do with the number of registers that are required. Predecrement and preincrement would both require 2 registers where postdecrement and postincrement would only require one. (This of course assumes assignment.) Can you cite the source of your statement, and was this with or without assignment? – G. Allen Morris III May 04 '15 at 21:42
  • Look at a VAX or PDP-11 reference manual. "MOVL R0, -(R1)" decrements R1 by 4 and moves the contents of R0 to the memory referenced by it. "MOVL (R1)+, R0" moves the contents referenced by R1 to R0, then increments R1. There is no corresponding +(R1) or (R2)-. They operate as you would expect for a stack. – user3344003 May 04 '15 at 22:28
  • Why would you ever want to write `x++` instead of `++x`? Personally, I have never met (in real life) a programmer who could precisely explain the semantics of `x++`. Usually I get some wishy-washy "first, the value is used, and then the increment happens". But of course they have never heard of sequence points, let alone understand their implications. In Java, the semantics are almost trivial: increment, yield old value. I have talked to a lot of Java programmers, and none of them got this right. Apparently, postfix increment is too complicated for most programmers. Dare I say.. harmful? Evil? – fredoverflow May 05 '15 at 18:29
  • @fredoverflow: Well, it isn't ++C :-) – jxh May 05 '15 at 18:35
  • @fredoverflow: Anyway, the `x++` vs `++x` thing was just the hook. The real question was what is required for the compiler to generate the optimization I observed. – jxh May 05 '15 at 18:48

5 Answers

17

Yes. It shouldn't matter for built-in types. For such types, the compiler can easily analyze the semantics and optimize them, provided that doesn't change the behavior.

However, for class types, it may (and often does) matter, because the semantics can be more complex in this case.

class X { /* code */ };

X x;

++x;
x++; 

The last two calls could be entirely different and perform entirely different operations, just like these calls:

x.decrement(); //may be same as ++x (cheating is legal in C++ world!)
x.increment(); //may be same as x++

So don't let yourself be trapped by the syntactic sugar.

Nawaz
  • Thanks, can you consider my amended question? The intention was that the post increment was an idiomatic implementation based on the pre increment. – jxh May 04 '15 at 18:10
  • 2
    @jxh: Your edit doesn't add much. The point is, the compiler *is free to change* your `i++` to `++i` **if** it can prove that such a transformation won't change *the behavior* of the program. For built-in types, proofs are easy. For class types, it could be *really* difficult (or *practically* expensive)! – Nawaz May 04 '15 at 18:13
  • Okay, I already understood that post-increment may do much more than pre-increment. And, it is good that your answer addresses the general problem. But, given the scope of my question, can you say the compiler has the required information to optimize `i++` to `++i`, and that such a proof is as trivial as for built-in types? – jxh May 04 '15 at 18:29
  • @jxh: *But, given the scope of my question, can you say the compiler has the required information to optimize i++ to ++i, and that such a proof is as trivial as for built-in types?* **Doesn't my answer address that already?** – Nawaz May 04 '15 at 18:31
  • Your answer says it may or may not matter, and that the complexity of the object determines when it does. It does not answer if my particular case is considered too complex or not. – jxh May 04 '15 at 18:34
  • @jxh: Which compiler should I assume to answer your question? And how would I know if *that particular* compiler does that transformation or not? My answer cannot possibly answer in this manner, though it says what is trivial or what is not; what is *easy* and what is *difficult*. And as you've seen yourself that `++i` and `i++` emit *identical code*. Don't these pieces together answer your question? – Nawaz May 04 '15 at 18:40
  • I think I would have been satisfied with a statement that the information is sufficient for the compiler to make the determination. However, our discussion got me to think there is one more piece that is missing, which is the complexity of the destructor. I do believe that if the copy constructor and destructor are both trivial, and the post increment is idiomatic, the compiler can easily perform the analysis alluded to in your answer. – jxh May 04 '15 at 18:43
  • @jxh: Copy-constructor/Destructor and their complexities are the last things I'd discuss in this case. – Nawaz May 04 '15 at 18:47
  • 1
    Since the postfix implementation delegates the actual work to the prefix operator, the only things left are the copy and its destruction, right? – jxh May 04 '15 at 18:57
11

Typically the post-increment operator in user-defined types involves creating a copy, which is slower and more expensive than the typical pre-increment operator.

Therefore the pre-increment operator should be used in preference for user-defined types.

Also, it is good style to be consistent, and therefore pre-increment should also be preferred with built-in types.

Example:

struct test
{
    // faster pre-increment
    test& operator++() // pre-increment
    {
        // update internal state
        return *this; // return this
    }

    // slower post-increment
    test operator++(int)
    {
        test c = (*this); // make a copy
        ++(*this); // pre-increment this object
        return c; // return the un-incremented copy
    }
};

The compiler cannot be expected to optimise post-increment for user-defined types, as their implementation is a convention, not something the compiler can deduce.

Galik
  • 1
    Consistency is also important. – jxh May 04 '15 at 18:44
  • Though with the code in this answer, I believe a compiler would be smart enough to optimize it, if it can inline the post-increment operator call, the return value isn't used and its constructor and destructor are known to not have side effects. So I believe consistency is the main reason to prefer pre-increment. I'd say that a class where one is considerably slower than the other is a poorly designed class. – Zyx 2000 May 04 '15 at 21:23
4

Besides being potentially more efficient, the main reason why you should (usually) prefer pre-increment over post-increment is that the former is what you really meant in the first place.

When you write a loop header like

for ( std::size_t i = 0; i < numElements; i++ )

you don't mean "please add one to the value of i and then give me its old value". You don't care about the return value of the expression i++ at all! So why make the compiler jump through hoops to give you the one return value that takes the most work to produce?

I realize the compiler will usually optimize the unnecessary extra work away anyway, but why not just say what you mean instead of hoping for the compiler to figure out what you mean?

antred
2

Optimizing compilers do all sorts of wonderful and magical things, especially when you're not running a debug build. But without getting into the internal details, a pre-increment operator applied to a user-defined type is still going to be as fast or faster, while taking no more effort to write or maintain.

It's like getting used to writing `a > b ? a : b` in place of a max function; optimizing compilers usually do emit branchless code in those cases. But what purpose does it serve, when we can just as easily, and arguably with greater clarity, write `max(a, b)`?

When you can achieve something that is as fast or faster, with no extra effort or cost to maintainability beyond, at worst, a slight change in old stylistic habits, that's when I think we should stop looking to the optimizer for answers. The optimizer should be there to make things cheaper that actually took more effort and carried a higher maintenance cost in the first place.

0

I have selected Nawaz's answer as the best. I generally agree with most comments and other answers that pre-increment should still be preferred. However, I wanted to understand how a compiler is able to determine that one could be treated semantically the same as the other. For sure, one could just say "It doesn't matter how, you shouldn't use post-increment." But, that answer doesn't really satisfy my intellectual curiosity.


It would appear that a compiler has enough information to treat a class like a built-in type if the copy constructor and destructor are both trivial (which implies that any objects it contains also have trivial destructors), and the post-increment is idiomatic.

The idiomatic inlined post-increment and trivial copy constructor alone are not sufficient for the compiler to deduce that the two functions test_pre and test_post can be implemented identically. If the destructor is non-trivial, the code differs. Even with an empty destructor body, the post-increment assembly changes slightly for the compiler in question, GCC 4.4.7:

        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        .cfi_lsda 0x3,.LLSDA1106
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movq    %rdi, %rbx
.LEHB0:
        call    _ZN3IntppEv
.LEHE0:
        movq    %rbx, %rax
        popq    %rbx
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
.L12:
        .cfi_restore_state
.L9:
        movq    %rax, %rdi
.LEHB1:
        call    _Unwind_Resume
.LEHE1:
        .cfi_endproc

Observe that the execution path is largely the same, except for some extra .cfi_* directives that do not appear in the pre-increment version, as well as the unreached call to _Unwind_Resume. I believe the extra code was added to handle the case where the destructor throws an exception. Dead code removal cleaned some of it up, since the destructor body was empty, but the result was still not identical to the pre-increment version.

jxh