
Consider the two code snippets below:

for (i = 0; i < 10; i += 2)   // 1
for (i = 0; i < 2; i = i + 2) // 2

Which one will be better to use?
Does it make any difference in the performance?

waldyr.ar
Green goblin
  • Did you try timing them? What did you find out? – Levon Aug 11 '12 at 15:33
  • The second will be faster, because the bound is different. – Jens Gustedt Aug 11 '12 at 15:34
  • @JensGustedt Could you explain in more detail? Do you mean the 2 vs. the 10? – Gir Aug 11 '12 at 15:35
  • I supposed that the grammar rules compute **i=i+2** directly and need one extra transition for **i+=2**, so I would say i=i+2 is faster, but I'm not sure that's the right explanation ... – Jérôme Boé Aug 11 '12 at 15:36
  • With any halfway decent compiler, the increment will be the same speed either way (i.e., `x+=n;` and `x=x+n;` will produce *identical* code). There *might* be an exception if you defined the variable as `volatile`, but that's sufficiently unusual that it's barely worth discussing. – Jerry Coffin Aug 11 '12 at 15:37
  • Do you really mean to compare two loops with different upper limits? Either way, it may be better to either edit the question or mention this intention explicitly. – Levon Aug 11 '12 at 15:48
  • Let me again link to [this example](http://stackoverflow.com/a/11639305/597607), where 10+ lines of code result in 5 machine instructions. Don't fiddle with low-level optimizations - the compiler is *much* better at that. – Bo Persson Aug 11 '12 at 16:11

4 Answers


The following took 0.0260015 seconds:

for (i = 0; i < 10000000; i += 2)

And this took 0.0170010 seconds:

for (i = 0; i < 10000000; i = i + 2)

@MasterID is right, though: when I enabled 'optimize code', both reported 0.0150009 seconds.

Code Uniquely
  • You need to run it multiple times. The difference could be due to other stuff running in the background and OS interrupts – Gir Aug 11 '12 at 15:45
  • I don't doubt your timings, but I'm pretty sure they don't show that the latter code is actually slower. Under the only two compilers I have at hand (gcc and clang), both produce exactly the same assembly code. You're almost certainly measuring noise. – DSM Aug 11 '12 at 15:45
  • Wow, +1, finally someone really tested it :D – Bugari Aug 11 '12 at 15:47
  • And when I got round to turning on optimization, the code produced was identical on both, which supports what @jdehaan was saying ... – Code Uniquely Aug 11 '12 at 15:49
  • and tried to keep noise to a minimum by running the code inside a tight loop 50x and averaging. :D – Code Uniquely Aug 11 '12 at 16:06

There is no definite answer to your question. It depends, among other things, on how smart your compiler is (optimization level, ...) and on the target platform. This is not a C language question: the language is not more or less performant by itself; it just depends on what the compiler builds out of it. So test it for your use case if performance matters at all...

Otherwise, my advice is to just write it in whichever way you find more readable.

jdehaan

The first option is at least as fast as the second. In any case, any optimizing compilation would generate the same assembly code for both.

MasterID

Both express the exact same semantics, i.e. the exact same effect in the abstract machine of the C language. If one is slower than the other, it's a quality-of-implementation flaw in your compiler.

R.. GitHub STOP HELPING ICE
  • Some hardware platforms process register reads that are part of a read-modify-write operation differently from those that aren't. I would expect a quality implementation intended for low-level programming on such a platform to process e.g. `myPORT |= 1` using a single read-modify-write instruction, but process `myPORT = myPORT | 1;` using separate read and write operations to allow for the possibility that code might need the latter semantics for some reason, and that would be the most natural way to request them. – supercat Jul 16 '18 at 18:18
  • @supercat: I'm not clear whether it would be conforming for an implementation to do that with volatile objects; with non-volatile ones it makes no sense at all since there's no reason to expect any memory access at all. In any case the most natural way to request such semantics is with an atomic object, not secretly relying on a particular compiler's weird interpretation of formally identical expressions. – R.. GitHub STOP HELPING ICE Jul 16 '18 at 18:36
  • The precise semantics of `volatile` objects are considered "Implementation Defined", and I really doubt the authors of the Standard intended to forbid implementations from specifying that certain specific compound assignment expressions will be processed in specific ways. The place this issue comes up is on systems which assign one address to two registers, and process "stand-alone" read instructions as accessing one, and both write and read-modify-write instructions as accessing the other. I dislike such designs, but some chips work that way. – supercat Jul 16 '18 at 19:01
  • On e.g. an 8051, the only way to process "P0 |= 1" so it behaves as "P0outputlatch = P0outputlatch | 1" (which is what compound assignment would normally suggest) is to use a read-modify-write instruction. A standalone read of "P0", however, will yield the value of a "P0inputlatch" register. If for some reason code wants to read the value of P0inputlatch, set the LSB, and write it to P0outputlatch, such an operation would seem sensible for the code "P0 = P0 | 1;", but not for "P0 |= 1;". I doubt any 8051 compilers support C11 atomics, so they're a non-issue. – supercat Jul 16 '18 at 19:05
  • @supercat: 6.5.16.2 ¶3 says "A compound assignment of the form E1 op= E2 is equivalent to the simple assignment expression E1 = E1 op (E2), except that the lvalue E1 is evaluated only once..." so I'm not convinced that implementation-definedness of the details of volatile semantics overrides the requirement that the two forms be equivalent. – R.. GitHub STOP HELPING ICE Jul 16 '18 at 20:30
  • There are a number of cases where the Standard says two constructs are equivalent, but compilers actually treat them differently. For example, given `union { uint16_t h[4]; uint32_t w[2];} u;` gcc will recognize that `u.h[i]` might access the same storage as `u.w[j]`, but it will not recognize that `*(u.h+i)` might access the same storage as `*(u.w+j)`. If the Standard says two constructs are equivalent, it means that all requirements it imposes upon one apply to the other, and says nothing about behaviors they may supply beyond those required by the Standard. – supercat Jul 16 '18 at 20:36
  • @supercat: That's simply a case of undefined behavior. – R.. GitHub STOP HELPING ICE Jul 16 '18 at 20:40
  • The Standard makes no attempts to forbid conforming-but-useless implementations, and I would regard one which fails to recognize that lvalues which are derived from a union object and used *immediately* might access the union object itself (and thus other members) should be classified as "maybe conforming but definitely of low quality". Nonetheless, I think the principle that equivalence only applies to behavioral aspects *that are otherwise mandated by the Standard* applies in situations involving Implementation-Defined behaviors beyond those described by the Standard. – supercat Jul 16 '18 at 20:58