
When iterating over a list of lists in Python 2.7.3, I noticed performance differences when changing the order of iteration:

I have a list of 200 lists of 500000 strings. I then iterate over it in the following two ways:

import time

# reconstruct the data described above: 200 lists ("columns") of 500000 strings
columns = [["x"] * 500000 for _ in xrange(200)]

numberOfRows = len(columns[0])
numberOfColumns = len(columns)

# variant 1: outer loop over row indices, inner loop over column indices
t1 = time.clock()
for i in xrange(numberOfRows):
    for j in xrange(numberOfColumns):
        cell = columns[j][i]
print time.clock() - t1

# variant 2: outer loop over column indices, inner loop over row indices
t1 = time.clock()
for i in xrange(numberOfColumns):
    for j in xrange(numberOfRows):
        cell = columns[i][j]
print time.clock() - t1

The program repeatedly produces outputs similar to this:

33.97
29.39

Now, I expected lists to provide efficient random access, so I expected both orders to take about the same time. Where do these 4 seconds come from; is it only caching?
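As a comment below suggests, `timeit` is a more reliable way to take such measurements than hand-rolled `time.clock()` calls. A minimal sketch of that kind of comparison, with smaller, illustrative dimensions so it finishes quickly (the function names and sizes here are assumptions, not part of the original code):

import timeit

# illustrative, smaller data; the original uses 200 columns x 500000 rows
columns = [["x"] * 100000 for _ in xrange(50)]

def across_columns():
    # variant 1: for each row index, touch one cell of every column
    for i in xrange(len(columns[0])):
        for j in xrange(len(columns)):
            cell = columns[j][i]

def within_columns():
    # variant 2: walk one column completely before moving to the next
    for i in xrange(len(columns)):
        for j in xrange(len(columns[0])):
            cell = columns[i][j]

print timeit.timeit(across_columns, number=3)
print timeit.timeit(within_columns, number=3)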

jzwiener
    You should be using `timeit` for things like this – jamylak Apr 17 '13 at 07:30
  • Also how many columns and rows, respectively, do you have? That's probably the reason, since it has nothing to do with efficient random access. – jamylak Apr 17 '13 at 08:03
  • Have you tried replacing the `cell = ...` lines with `pass`, just to see how much time the creation of `xrange` objects takes? – Reinstate Monica Apr 17 '13 at 08:13
  • You're not testing **random** access, you're iterating either columns then rows or rows then columns. – MattH Apr 17 '13 at 08:28
  • I have 200 columns and 500000 rows. If I just `pass` I get numbers similar to 7.67 and 8.15. I know that I am iterating through columns and rows, but since lists provide efficient random access, I expected both iterations to take the same amount of time. – jzwiener Apr 17 '13 at 08:29
  • Efficient random access does not mean that all iteration paths will take the same time. Depending on the size of the data structures, the Python implementation, the method of structure traversal and the size of the CPU caches, you may observe speed differences due to [Locality of Reference](http://en.wikipedia.org/wiki/Locality_of_reference). – MattH Apr 17 '13 at 09:26
  • Yes, I already suspected that caching causes the measured time differences when I asked the question. – jzwiener Apr 17 '13 at 13:21

1 Answer


I get something like

30.509407822896037
29.88344778700383

for

columns = [[0] * 500000 for x in range(200)]

If I replace the `cell = ...` lines with `pass`, I get

8.44722739915369
10.23647023463866

So it's definitely not an issue with creating the `xrange` objects or anything like that.

It's the caching (not by Python, by the computer) of the columns: If I use

columns = [[0] * 500000] * 200

I get

27.725353873145195
29.592749434295797

Here, the same column object is used every time, so the caching behaves (almost) identically for both orders, and roughly the same timing difference as in the `pass` variant remains.
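To see why this is a meaningful control: the `*` operator repeats references to one and the same inner list, so every "column" above is the same object in memory, whereas the list comprehension builds 200 distinct objects. A quick illustrative check (the variable names are mine):

aliased = [[0] * 3] * 2                  # outer list repeats ONE inner list
distinct = [[0] * 3 for _ in range(2)]   # two separate inner lists

aliased[0][0] = 99
print aliased                    # [[99, 0, 0], [99, 0, 0]]: the change shows up "twice"
print aliased[0] is aliased[1]   # True: one object, referenced twice
print distinct[0] is distinct[1] # False: genuinely different objects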

Reinstate Monica
  • Please elaborate on *It's the caching (not by Python, by the computer)*. – MattH Apr 17 '13 at 08:53
  • @MattH: I am not aware of Python caching objects. The processor's cache will work better if the elements of one column are used one after the other (they are stored next to each other) than if the 200 columns (which are located far apart in memory) are all used at once. (At least that's what I *think*.) – Reinstate Monica Apr 17 '13 at 10:18