1

I was fiddling around with a little function to create tuple arrays filled with 0 and went from

tuple(tuple(0 for _ in range(x)) for _ in range(y))

to

((0,) * x,) * y

and it tremendously improved performance and I was wondering why it improved so much.

My test code:

import timeit

statements = (
    '((0,) * x,) * y',
    'tuple(tuple(0 for _ in range(x)) for _ in range(y))',
)

for i in (3, 5, 7):
    setup = 'x = {}; y = {}'.format(i, i)
    print()
    for statement in statements:
        print(sum(timeit.repeat(statement, setup))/5)

results:

0.13270888
2.5220104

0.13536394000000024
4.04865446

0.1402734800000019
6.28384762

0.2838335400000034
37.955879499999995
Adrien Levert
  • 964
  • 6
  • 15

3 Answers3

2

Multiplying a tuple (or list) causes the same original contents to be repeated multiple times. Suppose for example we let z = ((0,) * x,), then z * y is a tuple that contains the same z object y times. This the cause of a common unexpected behaviour with lists (the problem cannot occur with tuples simply because they're immutable). This is fast because, behind the scenes, there is no recalculation and no copying of a large chunk of data, just pointer updates.

Comprehensions, generator expressions and calls to tuple/list create a distinct object each time they are evaluated, even if an identical result exists elsewhere. (This is important for lists because it allows you to avoid the common unexpected behaviour.) Comprehensions and generator expressions also evaluate their expression each time, even if the result is not dependent on the iteration variable. Thus tuple(0 for _ in range(x)) runs y times, creating a separate tuple each time.

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
1

For all practical purposes - what good does it you if you add the same tuple multiple times into another tuple - you cannot change them, could as well use a single tuple constant whereever you would want to use this tuple of tupleness...

k = ((0,) * 10,) * 20

print(*(id(x) for x in k))

Output:

2259918835264 2259918835264 2259918835264 2259918835264 2259918835264 
2259918835264 2259918835264 2259918835264 2259918835264 2259918835264 
2259918835264 2259918835264 2259918835264 2259918835264 2259918835264 
2259918835264 2259918835264 2259918835264 2259918835264 2259918835264
Patrick Artner
  • 50,409
  • 9
  • 43
  • 69
  • In my code, it's actually: `def create_empty_array(width, height): return ((0,) * width,) * height`. I use it as a base array to make a (width * height) grid for a pacman map recreation and the optimization was not an effective use of my time in the practical way, but in a theoretical way. – Adrien Levert Sep 27 '21 at 20:57
0

The second expression iterates over each cell so the number of cells grows quadratically in function of the size and so does computation time.

The first expression simply multiplies how many 0 there are in a tuple and then does it again but now with the inner tuple.

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
Adrien Levert
  • 964
  • 6
  • 15