
In Python, I really enjoy how concise an implementation can be when using list comprehensions. I love to write concise list comprehensions like this:

myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = [x for x in myList if x > 10]

However, I often encounter more verbose implementations like this:

myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = []
for i in xrange(0, len(myList)):
    if myList[i] > 10:
        bigNumbers.append(myList[i])

When a for loop only looks through one data structure (e.g. myList[]), there is usually a straightforward list comprehension statement that is equivalent to the loop.
With this in mind, is there a refactoring tool that converts verbose Python loops into concise list comprehension statements?


Previous StackOverflow questions have asked for advice on transforming loops into list comprehensions, but I have yet to find a question about automatically converting loops into list comprehension expressions.


Motivation: There are numerous ways to answer the question "what does it mean for code to be clean?" Personally, I find that making code concise and getting rid of some of the fluff tends to make code cleaner and more readable. Naturally there's a line in the sand between "concise code" and "incomprehensible one-liners." Still, I often find it satisfying to write and work with concise code.

Community
solvingPuzzles
  • That's what I would do too. Unfortunately, I see a lot of unnecessarily verbose code like this. Are there refactoring tools that would replace `xrange(0, len(myList))` with `enumerate(myList)`? This would be especially useful when trying to clean up someone else's code, or trying to convert some messy code into something that's usable in a tutorial. – solvingPuzzles Jan 25 '13 at 07:52
  • @AshwiniChaudhary, or just use `for elem in myList:`. – jimhark Jan 25 '13 at 08:08
  • `What is Pythonic? "for i in range(len(seq)):"? No. Use "for obj in seq:".` – PaulMcG Jan 25 '13 at 08:18
  • @solvingPuzzles I did some googling but can't seem to find anything that modifies the source in a way that while loops are changed into for-loop or LC. Everything was related to profiling and static code analysis. – Ashwini Chaudhary Jan 25 '13 at 08:20
  • I love writing concise code too, but I also want to keep it readable. It's not hard to convert a loop into a list comprehension by hand, I wouldn't trust a tool to do it. – Martijn Pieters Jan 25 '13 at 08:47
  • One way of doing that _automatically_ would be to use `urllib` to post a question on SO, wait 2-3 minutes and then download the answer. – georg Jan 25 '13 at 09:51

1 Answer


2to3 is a refactoring tool that can perform arbitrary refactorings, as long as you can specify them with a syntactical pattern. The pattern you might want to look for is this:

VARIABLE1 = []
for VARIABLE2 in EXPRESSION1:
    if EXPRESSION2:
        VARIABLE1.append(EXPRESSION3)

This can be refactored safely to

VARIABLE1 = [EXPRESSION3 for VARIABLE2 in EXPRESSION1 if EXPRESSION2]

In your specific example, this would give

bigNumbers = [myList[i] for i in xrange(0, len(myList)) if myList[i] > 10]
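As a rough illustration of this first rewrite, here is a minimal sketch that uses the standard `ast` module instead of 2to3's fixer framework (a swapped-in technique, not the approach the answer describes). The names `match_filter_loop` and `rewrite` are hypothetical, the sketch assumes Python 3.9+ for `ast.unparse`, it uses `range` where the question uses `xrange`, and it only looks at top-level statements. It does not check that EXPRESSION2 or EXPRESSION3 avoid referring to the list being built, and `ast.unparse` discards comments and original formatting.

```python
import ast


def match_filter_loop(init, loop):
    """Return a comprehension assignment if (init, loop) fit the pattern, else None."""
    # VARIABLE1 = []
    if not (isinstance(init, ast.Assign) and len(init.targets) == 1
            and isinstance(init.targets[0], ast.Name)
            and isinstance(init.value, ast.List) and not init.value.elts):
        return None
    name = init.targets[0].id
    # for VARIABLE2 in EXPRESSION1: containing exactly one `if`, no `else`
    if not (isinstance(loop, ast.For) and not loop.orelse and len(loop.body) == 1):
        return None
    test = loop.body[0]
    if not (isinstance(test, ast.If) and not test.orelse and len(test.body) == 1):
        return None
    # VARIABLE1.append(EXPRESSION3)
    stmt = test.body[0]
    if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
        return None
    call = stmt.value
    func = call.func
    if not (isinstance(func, ast.Attribute) and func.attr == "append"
            and isinstance(func.value, ast.Name) and func.value.id == name
            and len(call.args) == 1 and not call.keywords):
        return None
    # VARIABLE1 = [EXPRESSION3 for VARIABLE2 in EXPRESSION1 if EXPRESSION2]
    comp = ast.ListComp(
        elt=call.args[0],
        generators=[ast.comprehension(target=loop.target, iter=loop.iter,
                                      ifs=[test.test], is_async=0)],
    )
    return ast.Assign(targets=[ast.Name(id=name, ctx=ast.Store())], value=comp)


def rewrite(source):
    """Rewrite matching top-level statement pairs in `source`; return new source."""
    tree = ast.parse(source)
    body, i = [], 0
    while i < len(tree.body):
        new = None
        if i + 1 < len(tree.body):
            new = match_filter_loop(tree.body[i], tree.body[i + 1])
        body.append(new if new is not None else tree.body[i])
        i += 2 if new is not None else 1
    tree.body = body
    return ast.unparse(ast.fix_missing_locations(tree))


verbose = """
myList = [1, 5, 11, 20, 30, 35]
bigNumbers = []
for i in range(0, len(myList)):
    if myList[i] > 10:
        bigNumbers.append(myList[i])
"""
print(rewrite(verbose))
```

Anything that deviates from the exact shape of the pattern (an `else` branch, a second statement in the loop body, a different method than `append`) is left untouched, which is the conservative choice for an automated rewrite.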

Then, you can have another refactoring that replaces xrange(0, N) with xrange(N), and another one that replaces

[VARIABLE1[VARIABLE2] for VARIABLE2 in xrange(len(VARIABLE1)) if EXPRESSION1]

with

[VARIABLE3 for VARIABLE3 in VARIABLE1 if EXPRESSION1PRIME]
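The step from EXPRESSION1 to EXPRESSION1PRIME needs an explicit traversal. The following is a minimal sketch of that traversal, again using the standard `ast` module rather than 2to3, and assuming Python 3.9+ (where a subscript's index is stored directly in `node.slice`). `ReplaceIndexing` and `substitute` are hypothetical names. The sketch rejects the rewrite if the sequence name survives the substitution; rejecting a surviving index name as well is an added assumption, since the index would no longer be defined in the rewritten comprehension.

```python
import ast


class ReplaceIndexing(ast.NodeTransformer):
    """Replace every read of `seq[index]` with the plain name `new`."""

    def __init__(self, seq, index, new):
        self.seq, self.index, self.new = seq, index, new

    def visit_Subscript(self, node):
        if (isinstance(node.ctx, ast.Load)
                and isinstance(node.value, ast.Name) and node.value.id == self.seq
                and isinstance(node.slice, ast.Name) and node.slice.id == self.index):
            return ast.Name(id=self.new, ctx=ast.Load())
        return self.generic_visit(node)


def substitute(expr_src, seq, index, new):
    """Return the rewritten expression source, or None if the rewrite is unsafe."""
    tree = ast.parse(expr_src, mode="eval")
    tree = ReplaceIndexing(seq, index, new).visit(tree)
    names = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    if names & {seq, index}:
        return None  # the sequence or the index is still referenced somewhere
    return ast.unparse(tree.body)


print(substitute("myList[i] > 10", "myList", "i", "x"))         # x > 10
print(substitute("myList[i] > myList[0]", "myList", "i", "x"))  # None
```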

There are several problems with this refactoring:

  • EXPRESSION1PRIME must be EXPRESSION1 with all occurrences of VARIABLE1[VARIABLE2] replaced by VARIABLE3. This is possible with 2to3, but requires explicit code to do the traversal and replacement.
  • EXPRESSION1PRIME must then contain no further occurrences of VARIABLE1. This can also be checked with explicit code.
  • One needs to come up with a name for VARIABLE3. You have chosen x; there is no reasonable way to have this done automatically. You could choose to recycle VARIABLE2 (i.e. i) for that, but that may be confusing as it suggests that i is still an index. It might work to pick a synthetic name, such as VARIABLE1_VARIABLE2 (i.e. myList_i), and check that it is not already used elsewhere.
  • One needs to be sure that VARIABLE1[VARIABLE2] yields the same as you get when using iter(VARIABLE1). It's not possible to do this automatically.
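To see why the last point cannot be taken for granted, consider a small example (Python 3 syntax, with made-up data) in which indexing and iteration disagree. A dict with integer keys accepts both `data[i]` and `for x in data`, but the first yields values while the second yields keys:

```python
# A dict with integer keys supports indexing and iteration, but they disagree.
data = {0: "a", 1: "b", 2: "c"}

by_index = [data[i] for i in range(len(data))]  # looks up values by key
by_iteration = [x for x in data]                # iterates over the keys

print(by_index)      # ['a', 'b', 'c']
print(by_iteration)  # [0, 1, 2]
```

A purely syntactic tool sees only the name `data` and cannot tell whether it refers to a list or to an object like this one, so the index-to-iteration rewrite needs a human (or type information) to confirm it.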

If you want to learn how to write 2to3 fixers, take a look at Lennart Regebro's book.

Jim Ferrans
Martin v. Löwis