
I ran some quick benchmarking using timeit.repeat comparing two different ways to use an _accumulator_.

def testAccumPlusEqual():
    x = 0
    for i in range(100):
        x += 1
    return x

def testAccumEqualPlus():
    x = 0
    for i in range(100):
        x = x + 1
    return x

The timing harness I used with timeit.repeat is:

if __name__ == '__main__':
    import timeit
    print(timeit.repeat("testAccumPlusEqual()",
                    setup="from __main__ import testAccumPlusEqual"))
    print(timeit.repeat("testAccumEqualPlus()",
                    setup="from __main__ import testAccumEqualPlus"))

The results are found below:

>>> 
[8.824021608811469, 8.80440620087051, 8.791231916848997]
[8.101681307351758, 8.143080002052649, 8.181129610882778]
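Incidentally, `timeit.repeat` also accepts a callable directly, which avoids the `setup` import string. A minimal sketch (the `number` and `repeat` values here are small, chosen for illustration rather than to reproduce the timings above):

```python
import timeit

def test_accum_plus_equal():
    # Same accumulator loop as in the question, using +=
    x = 0
    for i in range(100):
        x += 1
    return x

# Pass the function object itself; no setup string needed
times = timeit.repeat(test_accum_plus_equal, number=1000, repeat=3)
print(times)  # three per-run totals, in seconds
```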

Granted, in the grand scheme of things this time difference may not be noticeable, but at large scale it could add up to a real slowdown. So I guess I'm really asking:

From everywhere I've seen, the de facto standard is to accumulate with `+=`, but should that still be the case?

Why would `+=` perform worse than `x = x + 1`?

NOTE: Using CPython 3.3.2 on Windows 7 64-bit (using the 32-bit version of Python)

wpg4665
  • I think `+=` is an in-place accumulation whereas doing `x = x + y` creates a new object in memory with the value `x+y` and just reassigns the name `x` to it. Thus `x = x + y` has 2 steps: storing value of `x + y` in some register and reassignment of x. `x += y` also 2 steps: Store value of `x + y` in some object. Then copy that value over to the location of x. It's possible that COPYING the value to the location of the object pointed to by `x` is a more costly operation than simply "reassigning" the pointer value of `x` itself. But hey, I know nothing about Python so this is all my guess. – Shashank Sep 15 '13 at 17:32
  • Running the same code I got [8.69090794257174, 8.65746124451764, 9.022020214102863] [9.061213666780041, 9.197347861298582, 9.04849989044235] which seems to suggest `+=` is faster. – rlms Sep 15 '13 at 17:35
  • @ShashankGupta That depends on the object, for integers both of them return a new object. For lists `+=` is an in-place operation, while the other one returns a new list. – Ashwini Chaudhary Sep 15 '13 at 17:38
  • My results running exactly your code: `C:\love\games\python>python test.py [7.134581133753686, 4.640032242683925, 4.645418994532854] [4.683140368994856, 4.762817429009434, 8.111445511360838]` – Shashank Sep 15 '13 at 17:38
  • @AshwiniChaudhary Ah, thank you. I did not know. I was using this as reference: http://www.python.org/dev/peps/pep-0203/ It says "They implement the same operator as their normal binary form, except that the operation is done `in-place' when the left-hand side object supports it, and that the left-hand side is only evaluated once." It did not specify that the in-place assignment is only done for certain types of objects. – Shashank Sep 15 '13 at 17:41
  • 1
    @ShashankGupta That depends on what does the `__iadd__` and `__add__` methods of those objects do, for immutable objects they always return new object and for mutable objects they can behave differently.(Integers don't have an `__iadd__` method.) [Is the behaviour of Python's list += iterable documented anywhere?](http://stackoverflow.com/questions/13904493/is-the-behaviour-of-pythons-list-iterable-documented-anywhere)... [When is “i += x” different from “i = i + x” in Python?](http://stackoverflow.com/questions/15376509/when-is-i-x-different-from-i-i-x-in-python) – Ashwini Chaudhary Sep 15 '13 at 17:50
  • @AshwiniChaudhary So `__iadd__` means in-place add right? And I guess in Python, numbers are immutable, so they would have the exact same implementation for `__iadd__` and `__add__`. This all makes sense now. :) Thank you! – Shashank Sep 15 '13 at 17:53
  • 1
    @ShashankGupta Actually immutable objects like int, str, ..don't have `__iadd__` method, so `+=` falls back to `__add__`. Yes it means in-place add. – Ashwini Chaudhary Sep 15 '13 at 17:54
  • @AshwiniChaudhary could the "fall back" be the source of the extra lag? Looking up `__iadd__` and then having to look up `__add__`, or would it be optimized to skip right to `__add__`? – wpg4665 Sep 16 '13 at 14:34
  • @user2387370 what Python version and machine specs are you using to run the test? – wpg4665 Sep 16 '13 at 14:35
  • @ShashankGupta what Python version and machine specs are you using to run the test? – wpg4665 Sep 16 '13 at 14:42
  • @wpg4665 CPython 2.7.5 on Windows 7 64bit OS (64 bit version of Python) – Shashank Sep 16 '13 at 15:17
  • @wpg4665 CPython 3.3.2 on Windows XP 32bit – rlms Sep 16 '13 at 18:26
  • @ShashankGupta I wonder if the older version could be the source of difference? – wpg4665 Sep 16 '13 at 18:35
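The fallback the comments describe can be confirmed directly: on CPython 3, immutable `int` defines no `__iadd__`, so `+=` falls back to `__add__`, while mutable `list` does define `__iadd__`. A quick check:

```python
# int is immutable and has no __iadd__, so `x += 1` falls back to __add__
print(hasattr(int, '__iadd__'))   # False

# list is mutable and defines __iadd__, so `lst += other` mutates in place
print(hasattr(list, '__iadd__'))  # True
```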

1 Answer


It's not actually an answer, but it could help you understand what happens in your Python code. You can call `dis.dis` on both functions and get:

>>> import dis
>>> dis.dis(testAccumEqualPlus)
  2           0 LOAD_CONST               1 (0)
              3 STORE_FAST               0 (x)

  3           6 SETUP_LOOP              30 (to 39)
              9 LOAD_GLOBAL              0 (range)
             12 LOAD_CONST               2 (100)
             15 CALL_FUNCTION            1
             18 GET_ITER            
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               1 (i)

  4          25 LOAD_FAST                0 (x)
             28 LOAD_CONST               3 (1)
             31 BINARY_ADD          
             32 STORE_FAST               0 (x)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK           

  5     >>   39 LOAD_FAST                0 (x)
             42 RETURN_VALUE        
>>> dis.dis(testAccumPlusEqual)
  2           0 LOAD_CONST               1 (0)
              3 STORE_FAST               0 (x)

  3           6 SETUP_LOOP              30 (to 39)
              9 LOAD_GLOBAL              0 (range)
             12 LOAD_CONST               2 (100)
             15 CALL_FUNCTION            1
             18 GET_ITER            
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               1 (i)

  4          25 LOAD_FAST                0 (x)
             28 LOAD_CONST               3 (1)
             31 INPLACE_ADD         
             32 STORE_FAST               0 (x)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK           

  5     >>   39 LOAD_FAST                0 (x)
             42 RETURN_VALUE

As you can see, the only difference is INPLACE_ADD for `x += 1` versus BINARY_ADD for `x = x + 1`.
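As the comments point out, for integers both opcodes end up producing a new object; the in-place distinction only has an observable effect for mutable types such as lists. A minimal illustration using object identity:

```python
# For an int, += rebinds the name to a different object
n = 1
n_id = id(n)
n += 1
print(id(n) != n_id)    # True: ints are immutable

# For a list, += (list.__iadd__) mutates the same object...
lst = [1]
lst_id = id(lst)
lst += [2]
print(id(lst) == lst_id)  # True: in-place

# ...while `lst = lst + [...]` (list.__add__) builds a new list
lst = lst + [3]
print(id(lst) == lst_id)  # False: new object
```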

Roman Pekar