How to translate dict(zip(range(n), range(n))) to Python 3?

Question

I just ran 2to3 on code that looks like this (A):

def idict(n):
    return dict(zip(range(n), range(n)))

and it generated this (B):

def idict(n):
    return dict(list(zip(list(range(n)), list(range(n)))))

Both dict and zip can consume iterators, so why this translation?

B seems to be very slow too. Testing with

python -m timeit -s "import B as t" "t.idict(10)"

with the following results:

(usec)           A      B      C
Python 2.7.13   2.89   3.82   2.29
Python 3.5.1    2.63   4.34   A

i.e. from 2.89 usec to 4.34 usec (+50%) with the default translation.

Questions: (i) is there a reason I shouldn't use the original code in Python 3? (It produces the correct result and seems reasonable to me.) (ii) Is 2to3 the correct tool? (We need to run on both 2 and 3 while transitioning ~150 KLOC of Python.)

Update: I've added dict(itertools.izip(xrange(n), xrange(n))) as algorithm C in the table.

Possible duplicate of [Why does 2to3 change mydict.keys() to list(mydict.keys())?](https://stackoverflow.com/questions/27476079/why-does-2to3-change-mydict-keys-to-listmydict-keys) — Jasper, Jun 08 '18 at 19:50
https://docs.python.org/2/library/2to3.html#2to3fixer-xrange — Jasper, Jun 08 '18 at 19:54
@Jasper I don't think it's a duplicate. The other question is about the list ctor being added to a `dict.keys()` call in a for-loop context. The reasoning might be similar, but not the same. The other question is also purely about code and not about the tools. — thebjorn, Jun 08 '18 at 20:13
@JoshLee it better be since one of the range calls is unneeded.. — thebjorn, Jun 08 '18 at 20:17

Jean-François Fabre · Accepted Answer · 2018-06-08T20:24:57.227

2to3 doesn't see the global picture. It just generates equivalent code, wrapping calls to functions that no longer return lists in list(), to make sure that:

one can subscript the result
one can iterate on the result as many times as wanted

(it also puts parentheses around print, ... but not relevant here)
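A minimal sketch of why the wrapper matters in general, assuming Python 3 (the variable names are illustrative only): a bare zip object can be neither subscripted nor iterated twice, while the list-wrapped form that 2to3 emits supports both.

```python
# In Python 3, zip returns a lazy one-shot iterator, not a list.
pairs = zip(range(3), "abc")

try:
    pairs[0]
    subscriptable = True
except TypeError:
    subscriptable = False  # zip objects do not support indexing

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # empty: the iterator is already exhausted

# The form 2to3 generates restores the Python 2 behaviour.
wrapped = list(zip(range(3), "abc"))
first_item = wrapped[0]        # indexing works
again = [p for p in wrapped]   # and so does repeated iteration
```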

So it tries to make your code run, but it makes no promises at all about performance.

In your example, the list wrapper is useless, as the dict consumes the iterator.

So this tool is useful for getting code working quickly, but you should not apply its output blindly: compare it with your original code and decide what to keep and what to change.

The tool could probably be improved to:

avoid wrapping when the iterator is used in a loop
avoid wrapping when the iterator is passed to an object which takes an iterable as input.

In your case

dict(zip(range(n), range(n)))

is perfectly fine and runs faster in Python 3 than in Python 2 because it avoids creating intermediate lists, so leave it that way.
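To check this on your own machine, here is a small sketch for Python 3 that compares the original version (A) with the 2to3 output (B). The names idict_a and idict_b are made up for the comparison, and the timings will vary by machine and interpreter version, so no numbers are shown.

```python
import timeit

def idict_a(n):
    # Original code: dict consumes the zip iterator directly.
    return dict(zip(range(n), range(n)))

def idict_b(n):
    # 2to3 output: three intermediate lists are built first.
    return dict(list(zip(list(range(n)), list(range(n)))))

# Both versions build the same dictionary.
same = idict_a(10) == idict_b(10)

# Time each version; only the relative difference is meaningful.
t_a = timeit.timeit(lambda: idict_a(10), number=10000)
t_b = timeit.timeit(lambda: idict_b(10), number=10000)
print("A: %.4f s, B: %.4f s for 10000 calls" % (t_a, t_b))
```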

A Python 2 equivalent of that would be slightly more complex:

dict(itertools.izip(xrange(n), xrange(n)))
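Since the code has to run on both versions during the transition, one possible approach (a sketch, not the only option) is a small shim at the top of the module that rebinds zip and range to their lazy Python 2 counterparts when they exist. The try/except layout here is an assumption about how you might organise it; the six package offers the same names through six.moves if you prefer a dependency.

```python
try:
    # Python 2: use the lazy variants under the Python 3 names.
    from itertools import izip as zip
    range = xrange  # only reached on Python 2, where xrange exists
except ImportError:
    # Python 3: the built-in zip and range are already lazy.
    pass

def idict(n):
    # Same source line on both versions, no intermediate lists.
    return dict(zip(range(n), range(n)))
```

With the shim in place, the body of idict stays identical to the original code.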

My advice if you have a lot of code to translate (I've been there):

Run the Python 2 interpreter with the -3 switch to exercise your code and get warnings instead of crashes under Python 3. It is supposed to warn about Python 3.x incompatibilities that 2to3 cannot trivially fix; it misses a lot of cases, but it is better than nothing (for instance, it finds the infamous has_key calls).
Run 2to3 and compare the results with your original code, then decide manually where to apply the changes.
You can also use multi search/replace with tools like GrepWin to do what 2to3 would do, only with less risk of degrading performance:
- search for iteritems, replace by items
- search for xrange, replace by range
- track down dict.has_key calls, unicode built-in
- I may forget some...
Test and exercise your code extensively with Python 3. Some things are invisible to both the tool and the -3 option, such as reading text files in binary mode.
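As a rough illustration of the manual replacements listed above, the snippet below shows forms that behave the same on Python 2.7 and Python 3. The sample dictionary and byte string are invented for the example.

```python
import io

d = {"a": 1, "b": 2}

# d.has_key("a") no longer exists in Python 3; the "in" operator works on both.
has_a = "a" in d

# d.iteritems() no longer exists in Python 3; items() works on both
# (it builds a list on Python 2, which is usually fine for small dicts).
total = sum(value for _key, value in d.items())

# Binary-mode reads return bytes on Python 3, so text must be decoded
# explicitly. io.BytesIO stands in for a file opened with mode "rb".
raw = io.BytesIO(b"caf\xc3\xa9\n").read()
text = raw.decode("utf-8")
```

The last two lines are the kind of issue that neither 2to3 nor the -3 switch reports, so it only shows up in tests.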

@thebjorn the tool is good, the error was originally to use range instead of xrange in python2, I believe — Olivier Melançon, Jun 08 '18 at 19:54
@OlivierMelançon `range()` is faster than `xrange()` on small lists in Py2 (at least it used to be). — thebjorn, Jun 08 '18 at 19:55
@thebjorn no tool will be able to guess that n is expected to be small here, though — Olivier Melançon, Jun 08 '18 at 19:56
@OlivierMelançon that's true I suppose, although not particularly helpful to me ;-) I've added the "correct" py2 algorithm to the table (it's the fastest). — thebjorn, Jun 08 '18 at 20:03
Thanks for the detailed answer and the advice, both are very much appreciated. — thebjorn, Jun 08 '18 at 20:18
my pleasure. I have ported more than 10000 lines of code (which is currently running on both python 2.7 and 3.5 right now) and it wasn't so hard to do. — Jean-François Fabre, Jun 08 '18 at 20:21
(the hardest part is when you're using `.pyd` files that don't exist anymore/yet for python 3, but that's very rare now) — Jean-François Fabre, Jun 08 '18 at 20:26
