14

One of the major strengths of Python and a few other (functional) programming languages is the list comprehension. It lets programmers write complex expressions in one line. Comprehensions may be confusing at first, but once you get used to the syntax they are much easier to read than complicated nested for loops.

With that said, please share some of the coolest uses of list comprehensions (by cool, I just mean useful). They could come from a programming contest or from a production system.

For example, to transpose a matrix mat:

>>> mat = [
...        [1, 2, 3],
...        [4, 5, 6],
...        [7, 8, 9],
...       ]

>>> [[row[i] for row in mat] for i in [0, 1, 2]]
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
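
The hard-coded index list [0, 1, 2] only fits a matrix with three columns. A minimal sketch, assuming every row has the same length, that derives the column indices from the first row instead (the function name `transpose` is only illustrative):

```python
def transpose(mat):
    """Return the transpose of a rectangular list-of-lists matrix."""
    if not mat:
        return []
    # One output row per column index of the input.
    return [[row[i] for row in mat] for i in range(len(mat[0]))]

mat = [
    [1, 2, 3],
    [4, 5, 6],
]
result = transpose(mat)
print(result)  # [[1, 4], [2, 5], [3, 6]]
```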

Please include a description of the expression and where it was used (if possible).

christangrant

8 Answers

16

A lot of people don't know that Python allows you to filter the results of a list comprehension using if:

>>> [i for i in range(10) if i % 2 == 0]
[0, 2, 4, 6, 8]
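
It is easy to confuse the trailing `if` filter with a conditional expression placed before the `for`. A small sketch of the difference (the variable names are made up for illustration):

```python
numbers = range(10)

# A trailing `if` filters: odd numbers are dropped entirely.
evens = [i for i in numbers if i % 2 == 0]

# A conditional expression before `for` transforms instead: every
# element is kept, but odd ones are replaced by None.
evens_or_none = [i if i % 2 == 0 else None for i in numbers]

print(evens)          # [0, 2, 4, 6, 8]
print(evens_or_none)  # [0, None, 2, None, 4, None, 6, None, 8, None]
```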
kerkeslager
9

I often use comprehensions to construct dicts:

my_dict = dict((k, some_func(k)) for k in input_list)

Note that Python 2.7 and Python 3 have dict comprehensions, so this becomes:

my_dict = {k:some_func(k) for k in input_list}

For constructing CSV-like data from a list of tuples:

data = "\n".join(",".join(x) for x in input)
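
One caveat: `",".join(x)` raises a TypeError if any field is not a string. A hedged sketch that converts each field with `str()` first; the sample `rows` are made up, and nothing here quotes or escapes fields (the standard `csv` module handles that if you need it):

```python
rows = [("name", "qty"), ("apple", 3), ("pear", 10)]  # made-up sample data

# str() each field first, since str.join() only accepts strings.
data = "\n".join(",".join(str(field) for field in row) for row in rows)
print(data)
```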

Not actually a list comprehension, but still useful: Produce a list of ranges from a list of 'cut points':

ranges = zip(cuts, cuts[1:])
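
A worked example of the cut-point idiom, using made-up cut points. On Python 3, `zip()` returns a lazy iterator, so the sketch wraps it in `list()`:

```python
cuts = [0, 10, 25, 40]  # made-up cut points

# Pair each cut point with its successor.
ranges = list(zip(cuts, cuts[1:]))
print(ranges)  # [(0, 10), (10, 25), (25, 40)]

# The pairs feed naturally into a comprehension, e.g. interval widths.
widths = [hi - lo for lo, hi in ranges]
print(widths)  # [10, 15, 15]
```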
Nick Johnson
  • Not a single one of those is a list comprehension. Those are generator expressions. – Ignacio Vazquez-Abrams May 24 '10 at 02:11
  • A list comprehension is simply a special case of a genexp, though. I can put square brackets around them if that'd make you happy, but I presume the OP was interested in the technique, not the details. – Nick Johnson May 24 '10 at 02:27
8

To do the transpose of a matrix mat:

>>> [list(row) for row in zip(*mat)]
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
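
For anyone puzzled by `zip(*mat)`, here is a short sketch of what the star does. Note that `zip` stops at the shortest row, so this assumes a rectangular matrix:

```python
mat = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# zip(*mat) unpacks the rows into separate arguments, so it is the same
# call as zip(mat[0], mat[1], mat[2]).
assert list(zip(*mat)) == list(zip(mat[0], mat[1], mat[2]))

# zip yields tuples; list(row) turns each column back into a list.
columns = list(zip(*mat))
transposed = [list(row) for row in columns]
print(columns)     # [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
print(transposed)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```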
Mark Byers
  • 4
    `map(list, zip(*mat))` is a bit terser (though not a list comprehension). – Grumdrig May 23 '10 at 21:47
  • 1
    Terse is a negative, not a positive. Avoiding map() is one major benefit of list comprehensions in the first place. – Glenn Maynard May 23 '10 at 23:28
  • 7
    @Glenn That's an awfully broad generalization. Given that comprehensions are a terser way to express loops, if terse is always bad, you should always expand them into loops. – Nick Johnson May 24 '10 at 01:08
  • Perhaps I should have said "more elegant". :) I think `map(list,...)` is more transparent. – Grumdrig May 24 '10 at 05:23
  • 1
    Being succinct and readable is a benefit of well-used list comprehensions. "Terse"--to the point of being harder to understand--is the wrong goal, and that's where I think `map(list, zip(*mat))` lies. – Glenn Maynard May 24 '10 at 07:21
  • @Glenn: it's mainly a matter of education I think, the `map` approach is clear to anyone with a functional programming background. – Matthieu M. May 24 '10 at 12:14
  • @Matthieu M.: I understand it just fine; it's not a complicated function. It's just less intuitive to read than a list comprehension, which, in most cases, people can simply parse more quickly once they're used to using them. – Glenn Maynard May 24 '10 at 20:04
8

To flatten a list of lists:

>>> matrix = [[1,2,3], [4,5,6]]
>>> [x for row in matrix for x in row]
[1, 2, 3, 4, 5, 6]
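
The order of the two `for` clauses trips people up: they read left to right in the same order as the equivalent nested loops. A sketch for comparison, with `itertools.chain.from_iterable` shown as a standard-library alternative that is not a comprehension:

```python
from itertools import chain

matrix = [[1, 2, 3], [4, 5, 6]]

# Outer loop first, inner loop second, just like nested for statements.
flat = [x for row in matrix for x in row]

# Equivalent explicit loops.
flat_loop = []
for row in matrix:
    for x in row:
        flat_loop.append(x)

# Standard-library alternative.
flat_chain = list(chain.from_iterable(matrix))

print(flat)  # [1, 2, 3, 4, 5, 6]
```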
Grumdrig
  • Just keep in mind that the list comprehension version is a lot more efficient for large lists than sum(matrix ,[]). – David May 03 '13 at 17:43
8

If "cool" means crazy, I like this one:

from random import randint

def cointoss(n,t):
    return (lambda a:"\n".join(str(i)+":\t"+"*"*a.count(i) for i in range(min(a),max(a)+1)))([sum(randint(0,1) for _ in range(n)) for __ in range(t)])

>>> print(cointoss(20,100))
3:    **
4:    ***
5:    **
6:    *****
7:    *******
8:    *********
9:    *********
10:   ********************
11:   *****************
12:   *********
13:   *****
14:   *********
15:   *
16:   **

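n is the number of coin tosses per test and t is the number of times the test is run. Each output line shows a number of heads followed by one star for every test that produced that count, so the result is a text histogram of the distribution.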
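
For readers who would rather not untangle the lambda, here is a sketch of the same idea split into two steps. The name `cointoss_histogram` is made up for this illustration, and like the one-liner it assumes t is at least 1:

```python
from random import randint

def cointoss_histogram(n, t):
    """Run t tests of n coin tosses each; return a text histogram of heads counts."""
    # Number of heads (1s) in each of the t tests.
    heads = [sum(randint(0, 1) for _ in range(n)) for _ in range(t)]
    # One line per heads count between the observed minimum and maximum.
    return "\n".join(
        "%d:\t%s" % (count, "*" * heads.count(count))
        for count in range(min(heads), max(heads) + 1)
    )

histogram = cointoss_histogram(20, 100)
print(histogram)
```

The output differs from run to run, since the tosses are random.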

neil
4

I use this all the time when loading tab-separated files with optional comment lines starting with a hash mark:

data = [line.strip().split("\t") for line in open("my_file.tab")
        if not line.startswith('#')]

Of course it works for any other comment and separator character as well.
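
If you would rather not rely on the interpreter to close the file, one option is to keep the comprehension but move it into a small function that accepts any iterable of lines. This is only a sketch; `load_rows` and its parameters are names made up for the example:

```python
def load_rows(lines, comment="#", sep="\t"):
    """Split each non-comment line into fields; `lines` is any iterable of str."""
    return [line.strip().split(sep) for line in lines
            if not line.startswith(comment)]

# In-memory sample standing in for the contents of a file.
sample = ["# id\tname\n", "1\talice\n", "2\tbob\n"]
rows = load_rows(sample)
print(rows)  # [['1', 'alice'], ['2', 'bob']]

# With a real file, a with-block closes the handle deterministically:
#     with open("my_file.tab") as handle:
#         rows = load_rows(handle)
```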

Tamás
  • This is awful. Don't crush lots of code into a list comprehension--split this up. – Glenn Maynard May 23 '10 at 23:30
  • @Glenn Maynard: Agreed. List comprehensions aren't meant to be used that way, and anyway it defeats the purpose of iterating files by lines because you're just reading it all in at once and using a list comprehension to perform work on the line. Not only that, but...when is the file closed? While I realise the advantage to such a comprehension, there are also disadvantages to it, so do be careful. That's nearly as bad as a call to `malloc` in C without a corresponding call to `free` or `new` without a complementary `delete` in C++. >_ – Dustin May 24 '10 at 02:19
  • 6
    I don't agree that this is awful. I think it's quite readable, and as or more readable than the multiline version would be. – Grumdrig May 24 '10 at 05:26
  • @Dustin: in Python (at least in CPython), a file is closed when the corresponding file object is deallocated. The file object created in the list comprehension has only one reference while it's being used, so it will automatically get destroyed and closed when the interpreter finished processing (see http://stackoverflow.com/questions/575278/how-does-python-close-files-that-have-been-gced). There is no need to close it. And there are many cases when you need the whole dataset at once; for instance, when you want to make a ROC curve from the output of a probabilistic predictor. – Tamás May 24 '10 at 07:59
  • @Grumdrig I would also agree with you, I've been studying pep8 deeply and trying to write better python code. One thing I always disagree with is the hatred of large list comprehensions. They are so much faster than nested for loops. I understand that there are reasons not to use them, however if they are under two lines long they are perfectly readable. – Calvin Ellington Feb 21 '18 at 22:54
3

I currently have several scripts that need to group a set of points into "levels" by height. The assumption is that the z-values of the points will cluster loosely around certain values corresponding to the levels, with large-ish gaps in between the clusters.

So I have the following function:

def level_boundaries(zvalues, threshold=10.0):
    '''Finds all elements z of zvalues such that no other element
    w of zvalues satisfies z <= w < z+threshold.'''
    zvals = sorted(zvalues)  # sorted copy; the caller's list is left untouched
    return [zvals[i] for i, (a, b) in enumerate(pairs(zvals)) if b-a >= threshold]

"pairs" is taken straight from the itertools module documentation, but for reference:

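def pairs(iterable):
    'iterable -> (iterable[n], iterable[n+1]) for n=0, 1, 2, ...'
    from itertools import tee
    first, second = tee(iterable)
    # Advance the second iterator by one; the default avoids StopIteration on empty input.
    next(second, None)
    # Built-in zip works on both Python 2 and 3 (itertools.izip is Python 2 only).
    return zip(first, second)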

A contrived usage example (my actual data sets are quite a bit too large to use as examples):

>>> import random
>>> z_vals = [100 + random.uniform(-1.5,1.5) for n in range(10)]
>>> z_vals += [120 + random.uniform(-1.5,1.5) for n in range(10)]
>>> z_vals += [140 + random.uniform(-1.5,1.5) for n in range(10)]
>>> random.shuffle(z_vals)
>>> z_vals
[141.33225473458657, 121.1713952666894, 119.40476193163271, 121.09926601186737, 119.63057973814858, 100.09095882968982, 99.226542624083109, 98.845285642062763, 120.90864911044898, 118.65196386994897, 98.902094334035326, 121.2741094217216, 101.18463497862281, 138.93502941970601, 120.71184773326806, 139.15404600347946, 139.56377827641663, 119.28279815624718, 99.338144106822554, 139.05438770927282, 138.95405784704622, 119.54614935118973, 139.9354467277665, 139.47260445000273, 100.02478729763811, 101.34605205591622, 138.97315450408186, 99.186025111246295, 140.53885845445572, 99.893009827114568]
>>> level_boundaries(z_vals)
[101.34605205591622, 121.2741094217216]
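
For a reproducible run, here is a self-contained sketch with hand-picked values instead of random ones. It swaps `pairs()` for `zip(zvals, zvals[1:])`, which does the same job for a list. The final step, which uses `bisect_left` to turn the boundaries into a level index per point, is my assumption about how the boundaries might be used, not part of the original scripts:

```python
from bisect import bisect_left

def level_boundaries(zvalues, threshold=10.0):
    """Return each z that is followed by a gap of at least `threshold`."""
    zvals = sorted(zvalues)
    # zip(zvals, zvals[1:]) yields consecutive pairs, like pairs() does.
    return [a for a, b in zip(zvals, zvals[1:]) if b - a >= threshold]

# Hand-picked values clustered near 100, 120 and 140.
z_vals = [100.5, 139.0, 119.2, 99.1, 121.3, 140.8, 101.3, 120.0]
boundaries = level_boundaries(z_vals)
print(boundaries)  # [101.3, 121.3]

# Assumed follow-up: level index = number of boundaries strictly below z.
levels = [bisect_left(boundaries, z) for z in z_vals]
print(levels)  # [0, 2, 1, 0, 1, 2, 0, 1]
```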
Peter Milley
2

As long as you are after the functional-programming-inspired parts of Python, consider map, filter, reduce, and zip, all of which Python offers (in Python 3, reduce lives in the functools module).
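
A quick side-by-side sketch of these functions next to their comprehension equivalents, with made-up sample data. Importing `reduce` from `functools` works on Python 2.6+ as well as Python 3:

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

# map vs. comprehension
squares_map = list(map(lambda x: x * x, numbers))
squares_comp = [x * x for x in numbers]

# filter vs. comprehension with a trailing if
odds_filter = list(filter(lambda x: x % 2, numbers))
odds_comp = [x for x in numbers if x % 2]

# reduce folds a sequence into one value; a comprehension has no direct equivalent.
total = reduce(lambda acc, x: acc + x, numbers, 0)

# zip pairs up elements from several sequences.
pairs = list(zip(numbers, squares_comp))

print(squares_comp)  # [1, 4, 9, 16, 25]
print(odds_comp)     # [1, 3, 5]
print(total)         # 15
```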

Pierce
  • 1
    Guido's explanation of why list comprehensions generally make more sense than map/filter: http://www.artima.com/weblogs/viewpost.jsp?thread=98196 – Tony Arkles May 24 '10 at 01:56