Understanding more about i++ and i=i+1

Question

I was wondering if there is difference between the two forms of increment. Some of the links says i++ is faster that i=i+1;

Also as one of the person my observation is also the same for assembly code. please check the image where both the assembly code are same for i++ and i=i+1 - enter image description here

There is another link that says that it used to be true previously that increment operator was faster than addition and assignment, but now compilers optimize i++ and i=i+1 the same.

Is there any official document/paper which we can refer to confirm what is exactly right ? (I usually go with credit's and number of accepted answers of a person on stackoverflow. Could't find any such thing on the links I provided).

It really depends upon the way the compiler handles it. I don't think there is any official paper suggesting this behavior. — iqstatic, Oct 05 '14 at 04:23
The reason in your `link` for higher performance involve two steps of `ADD` & `Assignment`. Yes, `Assignment` can be a time consumption but in this case it wouldn't matter because the variable to be assigned with `i + 1` is already available in the register. So, the [`write allocate`](http://www.cs.iastate.edu/~prabhu/Tutorial/CACHE/interac.html) doesn't result in performance damage. — DOOM, Oct 05 '14 at 04:24
I would think most compilers would optimize them to be the same, but in some instruction sets, Increment is atomic but assignment and addition takes two calls. — Novaterata, Oct 05 '14 at 04:29
In *both* these cases they've been optimized to effectively `++i`, which has decidedly different sequencing characteristics than `i++`, but the *same* expression value as `i=i+1`. I suspect things would be considerably different had you used them as rvalues, and the optimizer detecting they were *not* used as-such was given the green light to reduce them both to `++i`. I would expect it in the latter, but you should only see it with the former if it is *not* used as an rvalue. — WhozCraig, Oct 05 '14 at 04:36
The best source there is is to check the assembly code, and you're doing that. This is debug code though, so fairly meaningless. — harold, Oct 05 '14 at 08:06
Oh, and if you get this question in an interview, just walk out, just know that your interviewer has no clue. — n. m. could be an AI, May 28 '16 at 19:26
There's a subtle difference between incrementing and adding a constant that happens to be one. For example if you are doing a football league, it should be points +=3 for a win and points += 1 for a draw, not points++ for a draw. — Malcolm McLean, May 22 '17 at 12:42

score 15 · Answer 1 · edited May 23 '17 at 12:18

15

Actually, if you do C++, it is much better to get used to write ++i instead. The reason is simple: i++ requires a copy.

a = ++i; // a is set to the result of i+1
a = i++; // make a copy of i, compute i+1, save the copy of i in a

Without optimizations, the assembly code would look like this:

a = ++i;                            a = i++;

MOV eax, (i)                        MOV eax, (i)
                                    PUSH eax
ADD eax, 1                          ADD eax, 1
MOV (i), eax                        MOV (i), eax
                                    POP eax
MOV (a), eax                        MOV (a), eax

Now, with optimizations, the result is the same in C where the ++ operator only applies to integers and pointers.

The ++ and -- are there because most processors had an INC and a DEC instruction at the time C was written. So if you were to use an index register, these instructions would be applied:

char a[256];
...init 'a' in some way...
int sum =0;
for(int i = 0; i < 100; ++i)
{
    sum += a[i];
}

This could be done with a simple INC as in (6502):

    LDA #00
    LDY #00
LOOP:
    CLC
    ADC ($80),Y
    INY              <-- ++i or i++
    CPY #100
    BCC LOOP

Note that in C we have another notation to increment a variable:

i += 1;

This is practical if you need to increment the register by more than 1:

i += 3;

Or to double the register each time:

i += i;   // (equivalent to  i *= 2;  or  i <<= 1;  in C++)

Question: Why is INC and DEC not used with all 80x86?

There has been time when the ADD reg, 1 and SUB reg, 1 instructions were faster than the INC reg and DEC reg. In the old days, it was faster because the instruction was smaller and we had no cache (or very little). Today, either instruction is probably about the same.

From a comment below, a reason for the "slowness" was the FLAGS register:

Intel Optimization Reference, section 3.5.1.1 Use of the INC and DEC Instructions

From another, more current comment, it looks like the slowness referenced in that Intel document has been fixed in newer processors. So the use or non-use of these instructions should depend on the target processor if known in advance.

As pointed out by phresnel in a comment, the difference between i++ and ++i was probably not clear for many people. For an integer, the optimization is really trivial and will most certainly happen even with -O0. However, in C++, that's a different story. There is a class with both increment operators clearly showing that a copy is required for i++ (even if data_ is just an integer, although in that case you could also do: return data_++ -- it still requires a well hidden copy!):

class A
{
public:
    A& operator ++ () // ++i -- no copy
    {
        ...apply the ++ operation to 'data_'...
        return *this;    // return a reference to this
    }

    A  operator ++ (int) // i++ -- needs a temporary copy
    {
        // remember that the 'int' is totally ignored in the function,
        // its only purpose is to distinguish '++i' from 'i++'

        A copy = *this;    // here we need a copy
        ++*this;
        return copy;       // and here we return said copy
    }

private:
    some_type_t   data_;
};

Note that modern C++ compilers do not make two copies in the i++ function as the returned value can be optimized out without the need of an extra copy.

The difference between both cases can be shown as clearly slower if using i++ as described in Is there a performance difference between i++ and ++i in C++? (link mentioned by phresnel)

edited May 23 '17 at 12:18

Community

1
1

answered Oct 05 '14 at 04:37

Alexis Wilke

19,179
10
84
156

"in C where the ++ operator only applies to integers" -- and pointers (and real floating types). Further suggest omitting C++ syntax in `int sum(0);` and `for(int i(0); ...)`. And the equivalences in the last snippet also hold for C (the second one is only equivalent for non-negative `i`). – mafso Oct 05 '14 at 05:42
@mafso, I thought `++` and `--` were not allowed on floating points. But indeed, pointers support them! The fact is in C++ you can overload those operators and the results are much faster when put on the left side if possible. Now... `i *= 2` is the same as `i += i`, even if `i` is negative. – Alexis Wilke Oct 05 '14 at 07:36
I was surprised, too, but see C11 (draft) 6.5.3.1 p1 (that's a constraint section, you can also just try it with a conforming compiler). I'm not sure if this should be considered good style, I just mentioned it for completeness. I got your point about C++ and why you prefer prefix increment/decrement, I'm talking about the comment in the last line of code: "equivalent to `i *= 2;` or `i <<= 1;` in C++". I'm confused about the "in C++" there (the equivalence holds for C, too). And left-shifting negative numbers (even by 0, strictly speaking) is UB, but `i+i` is defined for some negative `i`. – mafso Oct 05 '14 at 07:58
2

The reason `INC`/`DEC` should be replaced with `ADD`/`SUB` is "because ADD and SUB overwrite all flags, whereas INC and DEC do not, therefore creating false dependencies on earlier instructions that set the flags." see [Intel Optimization Reference, section 3.5.1.1 Use of the INC and DEC Instructions](http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html) – amdn Oct 05 '14 at 08:04
@mafso, The shift was not available in C when I was still using that language. I'm also surprised that the shift would be undefined. Most compilers will transform `i * 2` in `SHL eax, 1`... (or whatever corresponds to your variable type.) – Alexis Wilke Oct 05 '14 at 08:11
The UB may origin from non-2's complement (what the C standard still addresses), but now that it is UB, compilers are free to optimize e.g. `n<<0 >= 0` to `1` (I could get neither Gcc nor Clang to do things like that, though), the mapping to assembly isn't the only problem with UB. FYI (I've just looked it up), C and C++ differ slightly here, C++ makes left-shifting a _positive_ number into the sign bit (but not beyond) implementation-defined, but otherwise, they agree: Left-shifting negative signed integers is always UB. – mafso Oct 05 '14 at 09:09
This is complete nonsense. `a= i++;` causes i to be incremented and copied to a. `a= ++i;` causes i to be copied to a and be incremented. (It may even be the case that i and a be merged into a single variable by the optimizer.) – Oct 05 '14 at 19:48
@YvesDaoust, no, the other way around. `++i` first increments `i` then saves the result in `a`. And vice versa for the other one. And yes, in C it generally does not make much of a difference, although in very complex expressions, the compiler may not be able to optimize as much. Also both expressions set a different value in `a` so they are not equivalent. – Alexis Wilke Oct 05 '14 at 21:29
Yep: i++ requires no copy. These upvotes are unearned. – Oct 06 '14 at 06:25
The argument should not be "Without optimizations, the assembly code would look like this:", because nobody who's sane in their mind disables optimizations in a release. The argument should instead involve semantics and a side-effect demonstration about when ++() is really superiour to ++(int) and vice versa. – Sebastian Mach May 28 '16 at 16:42
@phresnel How do you think we can better and better optimize compilers if it is not by checking out the final optimizations? Some super critical code needs us to do that. More and more rarely, maybe, but once in a while the compiler will do it wrong or our loops are slightly too long and in order to find yet another optimization, you have to look at the final code... – Alexis Wilke May 28 '16 at 18:03
@AlexisWilke: Checking out the final optimizations is good if you need to. But most code should be written with _meaning_ in mind, which is the point of using a higher level language. The removal of an unused value and the propagation of this semantic void up the code tree is pretty basic. If you really want to give a meaningful explanation why `++i` should be preferred over `i++` in case the value before incrementation is not required, maybe look here (http://stackoverflow.com/a/9519095/76722), which contains some examples of non O(1) copying algorithms. – Sebastian Mach May 28 '16 at 18:28
@phresnel, Thank you for the link. That may help people like Yves Daoust to better understand my argument. Only, showing the result of the compiler with that `Something` class rather than just an `int` is going to take pages of assembly code and does not really add to my example. The very simple `PUSH` and `POP` represent the copy (rather trivial here, I agree), in a way I think most people can easily understand that there is something more that happens when you use `i++`. – Alexis Wilke May 28 '16 at 18:36
@phresnel, also the sample implementation of `operator(int)` is good in that linked answer. It clearly shows that you have to make a temporary copy of the object before you apply the increment and then return that copy. – Alexis Wilke May 28 '16 at 18:38
@AlexisWilke: Yes, that's an example for when the compiler has no insight into the implementation of operator++() / operator++(int). In case of `int i; ++i;` the compiler has insight. – Sebastian Mach May 28 '16 at 18:41
1

@amdn: [The suggestion to avoid `inc`/`dec` is obsolete](http://stackoverflow.com/q/36510095/224132). They only have a false dependency on the flag they don't modify (`CF`) on P4, not on other microarchitectures. Even Intel's own compiler emits `inc` in many cases instead of `add reg, 1`. Intel is just slow to update their optimization manual. – Peter Cordes May 28 '16 at 19:06

score 5 · Answer 2 · answered Oct 05 '14 at 04:24

There is no official document. The c spec does not declare that i++ must be faster than i+1 so compilers/optimizers are free to do what they like (and they can make different choices based on surrounding code and optimization level).

I use i++ because it's faster for me to read, with fewer characters to mistype.

score 3 · Answer 3 · answered Oct 05 '14 at 08:45

Runthe code profiler between both of them it will be same for both i++ and i = i+1 for current version of gcc compiler. This purely depends on the compiler optimizations.

If you speak about processor specifically you can see the below chart for machine cycles,

INC - Increment

    Usage:  INC     dest
    Modifies flags: AF OF PF SF ZF

    Adds one to destination unsigned binary operand.

                             Clocks                 Size
    Operands         808x  286   386   486          Bytes

    reg8              3     2     2     1             2
    reg16             3     2     2     1             1
    reg32             3     2     2     1             1
    mem             15+EA   7     6     3            2-4  (W88=23+EA)

ADD - Arithmetic Addition

    Usage:  ADD     dest,src
    Modifies flags: AF CF OF PF SF ZF

    Adds "src" to "dest" and replacing the original contents of "dest".
    Both operands are binary.

                             Clocks                 Size
    Operands         808x  286   386   486          Bytes

    reg,reg           3     2     2     1             2
    mem,reg         16+EA   7     7     3            2-4  (W88=24+EA)
    reg,mem          9+EA   7     6     2            2-4  (W88=13+EA)
    reg,immed         4     3     2     1            3-4
    mem,immed       17+EA   7     7     3            3-6  (W88=23+EA)
    accum,immed       4     3     2     1            2-3

You can find the machine cycles taken for each instruction ADD and INC for various processors, 8086, 80286, 80386, 80486,above, you can find this documented in intel processor manuals. Lower the machine cycles faster the response.

As mentione by amdn below, the `INC` and `DEC` are slower on modern processors which reorganize many instructions but are limited in that reorganization when an instruction does not change all the flags in CFLAGS. The instruction pipeline has to be drained quite a bit before the processor can go on. That being said, the speed is indeed going to be the same in most cases because the optimizer will generate code which is exactly the same in both cases. — Alexis Wilke, Oct 05 '14 at 21:35

score -1 · Answer 4 · answered Oct 08 '14 at 07:46

-1

rmv=10;

rmv=rmv++;//rmv+1 both are same

printf("1.time=%d",rmv);

//output is 10 than second time increment 10+1 so,value is11

printf("2.time=%d",rmv++);//value is 11

answered Oct 08 '14 at 07:46

R.M.VIVEK Arni

1
3

Understanding more about i++ and i=i+1

4 Answers4

Linked