
Is there a simple way to flatten a list of iterables with a list comprehension, or failing that, what would you all consider to be the best way to flatten a shallow list like this, balancing performance and readability?

I tried to flatten such a list with a nested list comprehension, like this:

[image for image in menuitem for menuitem in list_of_menuitems]

But I get in trouble of the NameError variety there, because the name 'menuitem' is not defined. After googling and looking around on Stack Overflow, I got the desired results with a reduce statement:

reduce(list.__add__, map(lambda x: list(x), list_of_menuitems))

But this method is fairly unreadable; the list(x) call is needed because x is a Django QuerySet object.

Conclusion:

Thanks to everyone who contributed to this question. Here is a summary of what I learned. I'm also making this a community wiki in case others want to add to or correct these observations.

My original reduce statement is redundant and is better written this way:

>>> reduce(list.__add__, (list(mi) for mi in list_of_menuitems))

This is the correct syntax for a nested list comprehension (Brilliant summary dF!):

>>> [image for mi in list_of_menuitems for image in mi]

But neither of these methods are as efficient as using itertools.chain:

>>> from itertools import chain
>>> list(chain(*list_of_menuitems))

And as @cdleary notes, it's probably better style to avoid * operator magic by using chain.from_iterable like so:

>>> from itertools import chain
>>> flattened = chain.from_iterable([[1,2],[3],[5,89],[],[6]])
>>> print(list(flattened))
[1, 2, 3, 5, 89, 6]
prairiedogg
    I don't get why everybody is using map(lambda x: list(x), other) -- isn't that equivalent to map(list, other)? The list builtin is callable... – cdleary Jan 02 '09 at 19:14
  • It is equivalent. Luckily Prairie Dogg realized that this code is ugly. :) – recursive Jan 02 '09 at 19:33
  • Oh yeah, I see that you pointed it out in your answer @recursive. Sorry for the redundancy! :-) – cdleary Jan 02 '09 at 20:56
  • @recursive: Yeah, I definitely blushed after you pointed out just how many things about my reduce statement were redundant. I definitely learned a lot from this question, so big thanks to everyone! – prairiedogg Jan 02 '09 at 23:29
  • I've been learning ruby for a while now and just had occasion to use a ruby idiom for a similar problem. FWIW: [[1,2],[3],[5,89],[],[6]].flatten -> [1, 2, 3, 5, 89, 6] – prairiedogg Nov 20 '10 at 16:40
  • reduce(list.__add__, (list(mi.image_set.all()) for mi in list_of_menuitems)) is not correct for the case where all the lists are empty. It should be reduce(list.__add__, (list(mi.image_set.all()) for mi in list_of_menuitems), []) – Daira Hopwood Jun 10 '12 at 11:59
  • This question made http://stackoverflow.com/q/952914/1206998 closed as duplicated. However, it is much less clear due to all the django irrelevant stuff. Should it be re-written? – Juh_ Jan 17 '14 at 13:55
  • "But I get in trouble of the NameError variety there, because the name 'menuitem' is not defined." The problem with this approach is simply that the `for` clauses are in the wrong order. I added another duplicate link to cover the proper technique for that approach. – Karl Knechtel Mar 02 '23 at 01:32

23 Answers


If you're just looking to iterate over a flattened version of the data structure and don't need an indexable sequence, consider itertools.chain and company.

>>> list_of_menuitems = [['image00', 'image01'], ['image10'], []]
>>> import itertools
>>> chain = itertools.chain(*list_of_menuitems)
>>> print(list(chain))
['image00', 'image01', 'image10']

It will work on anything that's iterable, including Django's QuerySets, which it appears you're using in the question.

Edit: This is probably as good as a reduce anyway, because reduce will have the same overhead copying the items into the list that's being extended. chain will only incur this (same) overhead if you run list(chain) at the end.

Meta-Edit: Actually, it's less overhead than the question's proposed solution, because you throw away the temporary lists you create when you extend the original with the temporary.

Edit: As J.F. Sebastian says, itertools.chain.from_iterable avoids the unpacking, and you should use that to avoid * magic, but the timeit app shows only a negligible performance difference.

cdleary
  • An explicit loop that uses [`.extend` method is the fastest solution according to this benchmark](http://stackoverflow.com/questions/406121/flattening-a-shallow-list-in-python#comment22511343_408281) – jfs Apr 30 '14 at 02:02
  • hadn't heard from from_iterable. it's prettier than the *, if less pythonic – Jules G.M. Mar 29 '16 at 22:04
  • It's also worth highlighting that because `from_iterable` avoids unpacking, it can avoid issues where you have many (potentially unbounded) items in the iterable. If the iterable is long enough, you'll run out of memory. – skeggse Feb 27 '18 at 02:32

You almost have it! The way to do nested list comprehensions is to put the for statements in the same order as they would go in regular nested for statements.

Thus, this

for inner_list in outer_list:
    for item in inner_list:
        ...

corresponds to

[... for inner_list in outer_list for item in inner_list]

So you want

[image for menuitem in list_of_menuitems for image in menuitem]
dF.
  • +1, I have looked this up so many times and this is the only answer I've seen that's made the ordering explicit... Maybe now I can remember it! – Izkata Mar 20 '12 at 15:47
  • I wish I could up vote this again because this way of thinking makes nested list comprehensions much easier to understand. – Derek Litz Jun 17 '13 at 21:36
  • whereas [... for item in inner_list for inner_list in outer_list] is a Python gotcha: it only repeats `[... for item in inner_list]` on the last value of inner_list, and as many times as len(outer_list). Useless. – smci Sep 14 '13 at 04:47
  • This ordering is *really* odd. If you change `for i in list: ...` to `... for i in list`, then why wouldn't you also change the order of the for loops? – naught101 Nov 26 '13 at 06:38
  • Best explanation of nested list comprehensions I've found even after reading the documentation. Commenting so I can find this again later. – Rob Nov 29 '14 at 17:43
  • Yes, old post but the best simple explanation of nested comprehensions. Thank you... Its still awkward to have the flow jump back to the front, but now it makes sense. – Andrew Aug 05 '15 at 03:09
  • "in the same order as they would go in regular nested for statements" is a very clear way to put it. Now I will not forget this again. – clacke Dec 18 '15 at 11:53
  • Hah! I forgot about it again. I guess Guido's brain and mine just disagree on what's intuitive. – clacke Jun 27 '16 at 13:33

@S.Lott: You inspired me to write a timeit app.

I figured it would also vary based on the number of partitions (number of iterators within the container list) -- your comment didn't mention how many partitions there were of the thirty items. This plot is flattening a thousand items in every run, with varying number of partitions. The items are evenly distributed among the partitions.

Flattening Comparison

Code (Python 2.6):

#!/usr/bin/env python2.6

"""Usage: %prog item_count"""

from __future__ import print_function

import collections
import itertools
import operator
from timeit import Timer
import sys

import matplotlib.pyplot as pyplot

def itertools_flatten(iter_lst):
    return list(itertools.chain(*iter_lst))

def itertools_iterable_flatten(iter_iter):
    return list(itertools.chain.from_iterable(iter_iter))

def reduce_flatten(iter_lst):
    return reduce(operator.add, map(list, iter_lst))

def reduce_lambda_flatten(iter_lst):
    return reduce(operator.add, map(lambda x: list(x), [i for i in iter_lst]))

def comprehension_flatten(iter_lst):
    return list(item for iter_ in iter_lst for item in iter_)

METHODS = ['itertools', 'itertools_iterable', 'reduce', 'reduce_lambda',
           'comprehension']

def _time_test_assert(iter_lst):
    """Make sure all methods produce an equivalent value.
    :raise AssertionError: On any non-equivalent value."""
    callables = (globals()[method + '_flatten'] for method in METHODS)
    results = [callable(iter_lst) for callable in callables]
    if not all(result == results[0] for result in results[1:]):
        raise AssertionError

def time_test(partition_count, item_count_per_partition, test_count=10000):
    """Run flatten methods on a list of :param:`partition_count` iterables.
    Normalize results over :param:`test_count` runs.
    :return: Mapping from method to (normalized) microseconds per pass.
    """
    iter_lst = [[dict()] * item_count_per_partition] * partition_count
    print('Partition count:    ', partition_count)
    print('Items per partition:', item_count_per_partition)
    _time_test_assert(iter_lst)
    test_str = 'flatten(%r)' % iter_lst
    result_by_method = {}
    for method in METHODS:
        setup_str = 'from test import %s_flatten as flatten' % method
        t = Timer(test_str, setup_str)
        per_pass = test_count * t.timeit(number=test_count) / test_count
        print('%20s: %.2f usec/pass' % (method, per_pass))
        result_by_method[method] = per_pass
    return result_by_method

if __name__ == '__main__':
    if len(sys.argv) != 2:
        raise ValueError('Need a number of items to flatten')
    item_count = int(sys.argv[1])
    partition_counts = []
    pass_times_by_method = collections.defaultdict(list)
    for partition_count in xrange(1, item_count):
        if item_count % partition_count != 0:
            continue
        items_per_partition = item_count / partition_count
        result_by_method = time_test(partition_count, items_per_partition)
        partition_counts.append(partition_count)
        for method, result in result_by_method.iteritems():
            pass_times_by_method[method].append(result)
    for method, pass_times in pass_times_by_method.iteritems():
        pyplot.plot(partition_counts, pass_times, label=method)
    pyplot.legend()
    pyplot.title('Flattening Comparison for %d Items' % item_count)
    pyplot.xlabel('Number of Partitions')
    pyplot.ylabel('Microseconds')
    pyplot.show()

Edit: Decided to make it community wiki.

Note: METHODS should probably be accumulated with a decorator, but I figure it'd be easier for people to read this way.

cdleary
  • Try `sum_flatten = lambda iter_lst: sum(map(list, iter_lst), [])` – jfs Jan 08 '09 at 16:34
  • or just sum(list, []) – hoju Sep 01 '09 at 06:32
  • @EnTerr suggested `reduce(operator.iadd` http://stackoverflow.com/questions/3040335/finding-elements-in-python-association-lists-efficiently/3041450#3041450 that is the fastest so far (code: http://ideone.com/NWThp picture: http://i403.photobucket.com/albums/pp111/uber_ulrich/p1000.png ) – jfs Jun 16 '10 at 12:43
  • `chain.from_iterable()` is slightly faster if there are many partitions http://i403.photobucket.com/albums/pp111/uber_ulrich/p10000.png – jfs Jun 16 '10 at 13:21
  • I know this is an old thread but I added a method I got from [here](http://stackoverflow.com/a/5330178/2069974) which uses list.extend which has shown to be the fastest across the board. [graph](https://www.dropbox.com/s/lb7yojmcckndojm/flatten_graph.png) [updated gist](https://gist.github.com/mshuffett/5316046) – Mike S Apr 05 '13 at 02:04

sum(list_of_lists, []) would flatten it.

l = [['image00', 'image01'], ['image10'], []]
print(sum(l, []))  # prints ['image00', 'image01', 'image10']
Prem Anand
  • I like it! It reminds me of using `iter[::-1]` instead of `sorted(iter, reverse=True)`. I wonder if this is one of those things that'll get scrutinized over the years as "bad Python". It strikes me as a very *TIMTOWTDI* solution. – yurisich Nov 03 '20 at 22:25

This solution works for arbitrary nesting depths - not just the "list of lists" depth that some (all?) of the other solutions are limited to:

def flatten(x):
    result = []
    for el in x:
        if hasattr(el, "__iter__") and not isinstance(el, basestring):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

It's the recursion which allows for arbitrary depth nesting - until you hit the maximum recursion depth, of course...
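On Python 3, basestring no longer exists, so the hasattr check above would recurse forever into strings. A minimally adapted sketch (str swapped in; this is an editorial adaptation, not part of the original answer):

```python
def flatten(x):
    """Recursively flatten arbitrarily nested iterables, leaving strings whole."""
    result = []
    for el in x:
        # basestring is gone in Python 3; checking str keeps strings atomic
        if hasattr(el, "__iter__") and not isinstance(el, str):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

print(flatten([1, [2, [3, 'abc']], 'de']))  # [1, 2, 3, 'abc', 'de']
```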

James Brady
  • It might be worth adding `hasattr(el, '__getitem__')` for compatibility with `iter()` function and builtin for-in loop (though all Python sequences (objects with `__getitem__`) also are iterable (object with `__iter__`)). – jfs Mar 14 '09 at 10:56
  • i was expecting something like that already in itertools. are there similar solutions using comprehensions? – Josep Valls Jun 18 '12 at 14:07
  • This was the most useful to me as it doesn't separate strings. – Chris Hagmann Jun 27 '13 at 20:49
  • @JosepVallsm nice solution! for python3 you need to use `str` instead of `basestring`, [The builtin basestring abstract type was removed. Use str instead. The str and bytes types don’t have functionality enough in common to warrant a shared base class. The 2to3 tool (see below) replaces every occurrence of basestring with str.](https://docs.python.org/3.0/whatsnew/3.0.html) – Anu Jan 12 '19 at 19:48
  • @JosepValls, also, could you tell why a similar [method like yours](https://stackoverflow.com/a/36427768/6484358) gives a `RECURSION ERROR ON` input `A = ['str1', [[[['str2']]]], [['str3'], 'str4'], 'str5'] and input `A = [1.0, 2, 'a', (4,), ((6,), (8,)), (((8,),(9,)), ((12,),(10)))]`, but work fine with your solution! – Anu Jan 12 '19 at 19:51

In Python 2.6, using chain.from_iterable():

>>> from itertools import chain
>>> list(chain.from_iterable(mi.image_set.all() for mi in h.get_image_menu()))

It avoids creating an intermediate list.

jfs

Performance Results. Revised.

import itertools
def itertools_flatten( aList ):
    return list( itertools.chain(*aList) )

from operator import add
def reduce_flatten1( aList ):
    return reduce(add, map(lambda x: list(x), [mi for mi in aList]))

def reduce_flatten2( aList ):
    return reduce(list.__add__, map(list, aList))

def comprehension_flatten( aList ):
    return list(y for x in aList for y in x)

I flattened a 2-level list of 30 items 1000 times

itertools_flatten     0.00554
comprehension_flatten 0.00815
reduce_flatten2       0.01103
reduce_flatten1       0.01404

Reduce is always a poor choice.
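The gap keeps widening with input size, because each list.__add__ copies everything accumulated so far. A small instrumented sketch (an editorial illustration, not from the answer; CountingList is a made-up helper) makes the quadratic copying visible:

```python
from functools import reduce  # reduce is a builtin on Python 2, in functools on 3

copies = 0

class CountingList(list):
    """A list that counts how many items each + has to copy."""
    def __add__(self, other):
        global copies
        copies += len(self) + len(other)  # a new list copies both operands
        return CountingList(list.__add__(self, other))

parts = [CountingList([0] * 10) for _ in range(30)]  # 30 sublists of 10 items
flat = reduce(CountingList.__add__, parts)
print(len(flat), copies)  # 300 items flattened, but 4640 item copies
```

With n sublists of k items, the k-item result is copied again on every subsequent +, so the total work is O(n²·k) rather than the O(n·k) an extend-based loop or itertools.chain does.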

S.Lott

There seems to be some confusion about operator.add! When you join two lists together, the correct term for that is concat, not add. operator.concat is what you need to use.

If you're thinking functional, it is as easy as this:

>>> from functools import reduce
>>> import operator
>>> list2d = ((1,2,3),(4,5,6), (7,), (8,9))
>>> reduce(operator.concat, list2d)
(1, 2, 3, 4, 5, 6, 7, 8, 9)

You see, reduce respects the sequence type: supply a tuple and you get back a tuple. Let's try with a list:

>>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
>>> reduce(operator.concat, list2d)
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Aha, you get back a list.

How about performance:

>>> list2d = [[1,2,3],[4,5,6], [7], [8,9]]
>>> %timeit list(itertools.chain.from_iterable(list2d))
1000000 loops, best of 3: 1.36 µs per loop

from_iterable is pretty fast! But it's no match for reduce with concat.

>>> list2d = ((1,2,3),(4,5,6), (7,), (8,9))
>>> %timeit reduce(operator.concat, list2d)
1000000 loops, best of 3: 492 ns per loop
Meitham
  • it is probably the best solution for one level of nesting. but this might be a too restrictive constraint. YMMV – LBarret Jul 23 '19 at 21:00

Here is the correct solution using list comprehensions (they're backward in the question):

>>> join = lambda it: (y for x in it for y in x)
>>> list(join([[1,2],[3,4,5],[]]))
[1, 2, 3, 4, 5]

In your case it would be

[image for menuitem in list_of_menuitems for image in menuitem.image_set.all()]

or you could use join and say

join(menuitem.image_set.all() for menuitem in list_of_menuitems)

In either case, the gotcha was the nesting of the for loops.

Josh Lee

Off the top of my head, you can eliminate the lambda:

reduce(list.__add__, map(list, [mi.image_set.all() for mi in list_of_menuitems]))

Or even eliminate the map, since you've already got a list-comp:

reduce(list.__add__, [list(mi.image_set.all()) for mi in list_of_menuitems])

You can also just express this as a sum of lists:

sum([list(mi.image_set.all()) for mi in list_of_menuitems], [])
recursive
  • You could just use add, and I believe the second argument to sum is redundant. – daniel Jan 02 '09 at 06:30
  • It's not redundant. The default is zero, yielding TypeError: unsupported operand type(s) for +: 'int' and 'list'. IMO sum() is more direct than reduce(add, ...) – recursive Jan 02 '09 at 06:52

This version is a generator. Tweak it if you want a list.

def list_or_tuple(l):
    return isinstance(l, (list, tuple))
## the predicate selects the containers to be flattened
## write your own as required
## this one flattens every list/tuple


def flatten(seq, predicate=list_or_tuple):
    ## recursive generator
    for i in seq:
        if predicate(i):
            for j in flatten(i, predicate):
                yield j
        else:
            yield i

You can pass a different predicate if you want to flatten only those containers that satisfy a condition.

Taken from the Python Cookbook.
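For reference, a quick usage check, redefining the generator here (with the predicate applied to each element rather than the whole sequence) so the snippet is self-contained:

```python
def list_or_tuple(l):
    return isinstance(l, (list, tuple))

def flatten(seq, predicate=list_or_tuple):
    # recursive generator: descend into anything the predicate accepts
    for i in seq:
        if predicate(i):
            for j in flatten(i, predicate):
                yield j
        else:
            yield i

print(list(flatten([1, [2, (3, 4)], 5])))  # [1, 2, 3, 4, 5]
```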

devsaw

Here is a version that works for multiple levels of nesting, using collections.abc.Iterable (the plain collections.Iterable alias was removed in Python 3.10):

import collections.abc

def flatten(o, flatten_condition=lambda i: isinstance(i,
               collections.abc.Iterable) and not isinstance(i, str)):
    result = []
    for i in o:
        if flatten_condition(i):
            result.extend(flatten(i, flatten_condition))
        else:
            result.append(i)
    return result
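A quick self-contained usage check (a sketch using collections.abc, where Iterable has lived since Python 3.3):

```python
import collections.abc

def flatten(o, flatten_condition=lambda i: isinstance(i,
               collections.abc.Iterable) and not isinstance(i, str)):
    result = []
    for i in o:
        if flatten_condition(i):
            result.extend(flatten(i, flatten_condition))
        else:
            result.append(i)
    return result

# strings stay whole instead of recursing forever into length-1 strings
print(flatten(['image1', [[[['image2']]]], [['image3'], 'image4'], 'image5']))
```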
Pierre Thibault
  • Could please suggest why your solution gives an `RecursionError: maximum recursion depth exceeded in comparison` on this input `A = ['image1', [[[['image2']]]], [['image3'], 'image4'], 'image5']` , while it run fine and unflatten this input `A = [1,[2,3],[4,5,[6,[7,8],9]]]` – Anu Jan 12 '19 at 18:29
  • It is a problem with the flatten condition. Since strings are iterable then they are flattened as characters which are themselves strings of length one and because they are strings then the same logic is applied again and it creates an infinite loop. So I created a new version with a flattening condition for more control. – Pierre Thibault Jan 12 '19 at 22:53
  • Great! big Thanks for the clarification, it's working now.! I kind of understood you reasoning but unable to digest it completely. Could you please point me to some article on the web or any post that helps to understand its issue! What I understood is ` ['image1'] -->['i','m','a','g','e','1'] ` i.e strings of length one!, and now how it will go in infinite loop and what's making to go in infinite loop? that part I haven't understood yet! can you help in some way! – Anu Jan 13 '19 at 00:17
  • For the function flatten to end, if it goes inside the for loop, it needs to go in the else statement at some point. If it goes in the else statement, then it will start to unfold the call stack and return a result. Based on the previous version, because 'image1' is iterable then o is going to be equal to 'image1' while i is going to be equal to 'i'. 'i' is also iterable so in the next call o is going to be equal to 'i' while i is also going to be equal to 'i'. The function is going to be called again leading to the exact same state and an infinite loop only broken by a stack overflow. – Pierre Thibault Jan 13 '19 at 05:28
  • It is better to use `yield` to generate the sequence of items via the `result` list. The iterator can be lazily evaluated and the fn using this can consume the sequence as required. – kopos Jan 14 '19 at 10:28

If you have to flatten a more complicated list, with non-iterable elements or nesting deeper than two levels, you can use the following function:

def flat_list(list_to_flat):
    if not isinstance(list_to_flat, list):
        yield list_to_flat
    else:
        for item in list_to_flat:
            yield from flat_list(item)

It returns a generator object, which you can convert to a list with the list() function. Note that the yield from syntax is available starting with Python 3.3; on earlier versions you can use explicit iteration instead.
Example:

>>> a = [1, [2, 3], [1, [2, 3, [1, [2, 3]]]]]
>>> print(list(flat_list(a)))
[1, 2, 3, 1, 2, 3, 1, 2, 3]
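On Python older than 3.3, the yield from line can be replaced by an inner loop; a sketch of that variant:

```python
def flat_list(list_to_flat):
    if not isinstance(list_to_flat, list):
        yield list_to_flat
    else:
        for item in list_to_flat:
            # explicit iteration stands in for `yield from flat_list(item)`
            for sub in flat_list(item):
                yield sub

a = [1, [2, 3], [1, [2, 3, [1, [2, 3]]]]]
print(list(flat_list(a)))  # [1, 2, 3, 1, 2, 3, 1, 2, 3]
```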
QNA
  • this solution gives gives `RECURSION ERROR ON :` input `A = ['str1', [[[['str2']]]], [['str3'], 'str4'], 'str5']` and `A = [1.0, 2, 'a', [4,], [[6,], [8,]], [[[8,],[9,]], [[12,],[10]]]]`. Do you know why and how to fix it? – Anu Jan 12 '19 at 20:15
  • @anu It worked without errors on your examples for me (python 3.7.1). I'm not sure why it doesn't work you. – QNA Jan 12 '19 at 23:46
  • I am using python3.6, I found the problem now, you need to add `or isinstance(list_to_flat, str)` to the first if condition since it has to guard against strings. Your solution was perfect for input `A = [1, [[[[2]]]], [[3], 4], 5]` but fails when you use strings!, did test with strings in python3.7? – Anu Jan 13 '19 at 00:28
  • @anu I tested it on the exact same examples you provided. Your first example was with strings and it worked fine. The first if statement says to return any non-list item as is, without flattening. That includes strings too, no extra conditions are needed. – QNA Jan 13 '19 at 01:06
  • oh ok, it could be due to differences in python version! They might have rolled out some updates in 3.7 – Anu Jan 13 '19 at 01:10
  • man! This is just what I needed! Thank you so much! – Gabs Nov 29 '19 at 23:30

Have you tried flatten? From matplotlib.cbook.flatten(seq, scalarp=)?

l=[[1,2,3],[4,5,6], [7], [8,9]]*33

run("list(flatten(l))")
         3732 function calls (3303 primitive calls) in 0.007 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.007    0.007 <string>:1(<module>)
      429    0.001    0.000    0.001    0.000 cbook.py:475(iterable)
      429    0.002    0.000    0.003    0.000 cbook.py:484(is_string_like)
      429    0.002    0.000    0.006    0.000 cbook.py:565(is_scalar_or_string)
  727/298    0.001    0.000    0.007    0.000 cbook.py:605(flatten)
      429    0.000    0.000    0.001    0.000 core.py:5641(isMaskedArray)
      858    0.001    0.000    0.001    0.000 {isinstance}
      429    0.000    0.000    0.000    0.000 {iter}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



l=[[1,2,3],[4,5,6], [7], [8,9]]*66

run("list(flatten(l))")
         7461 function calls (6603 primitive calls) in 0.007 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.007    0.007 <string>:1(<module>)
      858    0.001    0.000    0.001    0.000 cbook.py:475(iterable)
      858    0.002    0.000    0.003    0.000 cbook.py:484(is_string_like)
      858    0.002    0.000    0.006    0.000 cbook.py:565(is_scalar_or_string)
 1453/595    0.001    0.000    0.007    0.000 cbook.py:605(flatten)
      858    0.000    0.000    0.001    0.000 core.py:5641(isMaskedArray)
     1716    0.001    0.000    0.001    0.000 {isinstance}
      858    0.000    0.000    0.000    0.000 {iter}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



l=[[1,2,3],[4,5,6], [7], [8,9]]*99

run("list(flatten(l))")
         11190 function calls (9903 primitive calls) in 0.010 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.010    0.010 <string>:1(<module>)
     1287    0.002    0.000    0.002    0.000 cbook.py:475(iterable)
     1287    0.003    0.000    0.004    0.000 cbook.py:484(is_string_like)
     1287    0.002    0.000    0.009    0.000 cbook.py:565(is_scalar_or_string)
 2179/892    0.001    0.000    0.010    0.000 cbook.py:605(flatten)
     1287    0.001    0.000    0.001    0.000 core.py:5641(isMaskedArray)
     2574    0.001    0.000    0.001    0.000 {isinstance}
     1287    0.000    0.000    0.000    0.000 {iter}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



l=[[1,2,3],[4,5,6], [7], [8,9]]*132

run("list(flatten(l))")
         14919 function calls (13203 primitive calls) in 0.013 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.013    0.013 <string>:1(<module>)
     1716    0.002    0.000    0.002    0.000 cbook.py:475(iterable)
     1716    0.004    0.000    0.006    0.000 cbook.py:484(is_string_like)
     1716    0.003    0.000    0.011    0.000 cbook.py:565(is_scalar_or_string)
2905/1189    0.002    0.000    0.013    0.000 cbook.py:605(flatten)
     1716    0.001    0.000    0.001    0.000 core.py:5641(isMaskedArray)
     3432    0.001    0.000    0.001    0.000 {isinstance}
     1716    0.001    0.000    0.001    0.000 {iter}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler'

UPDATE: Which gave me another idea:

l=[[1,2,3],[4,5,6], [7], [8,9]]*33

run("flattenlist(l)")
         564 function calls (432 primitive calls) in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    133/1    0.000    0.000    0.000    0.000 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
      429    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



l=[[1,2,3],[4,5,6], [7], [8,9]]*66

run("flattenlist(l)")
         1125 function calls (861 primitive calls) in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    265/1    0.001    0.000    0.001    0.001 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
      858    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



l=[[1,2,3],[4,5,6], [7], [8,9]]*99

run("flattenlist(l)")
         1686 function calls (1290 primitive calls) in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    397/1    0.001    0.000    0.001    0.001 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
     1287    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



l=[[1,2,3],[4,5,6], [7], [8,9]]*132

run("flattenlist(l)")
         2247 function calls (1719 primitive calls) in 0.002 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    529/1    0.001    0.000    0.002    0.002 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
     1716    0.001    0.000    0.001    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



l=[[1,2,3],[4,5,6], [7], [8,9]]*1320

run("flattenlist(l)")
         22443 function calls (17163 primitive calls) in 0.016 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   5281/1    0.011    0.000    0.016    0.016 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.000    0.000    0.016    0.016 <string>:1(<module>)
    17160    0.005    0.000    0.005    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

So, to test how effective it is when the recursion gets deeper: how much deeper?

l=[[1,2,3],[4,5,6], [7], [8,9]]*1320

new=[l]*33

run("flattenlist(new)")
         740589 function calls (566316 primitive calls) in 0.418 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 174274/1    0.281    0.000    0.417    0.417 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.001    0.001    0.418    0.418 <string>:1(<module>)
   566313    0.136    0.000    0.136    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



new=[l]*66

run("flattenlist(new)")
         1481175 function calls (1132629 primitive calls) in 0.809 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 348547/1    0.542    0.000    0.807    0.807 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.002    0.002    0.809    0.809 <string>:1(<module>)
  1132626    0.266    0.000    0.266    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



new=[l]*99

run("flattenlist(new)")
         2221761 function calls (1698942 primitive calls) in 1.211 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 522820/1    0.815    0.000    1.208    1.208 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.002    0.002    1.211    1.211 <string>:1(<module>)
  1698939    0.393    0.000    0.393    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



new=[l]*132

run("flattenlist(new)")
         2962347 function calls (2265255 primitive calls) in 1.630 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 697093/1    1.091    0.000    1.627    1.627 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.003    0.003    1.630    1.630 <string>:1(<module>)
  2265252    0.536    0.000    0.536    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



new=[l]*1320

run("flattenlist(new)")
         29623443 function calls (22652523 primitive calls) in 16.103 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
6970921/1   10.842    0.000   16.069   16.069 <ipython-input-55-39b139bad497>:4(flattenlist)
        1    0.034    0.034   16.103   16.103 <string>:1(<module>)
 22652520    5.227    0.000    5.227    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

I will bet on flattenlist: I am going to use it rather than matplotlib's for a long, long time, unless I want a lazy yield-based generator, as flatten in matplotlib.cbook provides.

This is fast.

And here is the code:

typ=(list,tuple)


def flattenlist(d):
    thelist = []
    for x in d:
        if not isinstance(x,typ):
            thelist += [x]
        else:
            thelist += flattenlist(x)
    return thelist
user2290820
3

From my experience, the most efficient way to flatten a list of lists, in Python 2 where `map` is eager, is:

list_of_list = [[1, 2], [3], [5, 89], [], [6]]
flat_list = []
map(flat_list.extend, list_of_list)

(In Python 3, `map` is lazy, so the `extend` calls never run; see the comments below.)

Some timeit comparisons with the other proposed methods:

list_of_list = [range(10)]*1000
%timeit flat_list=[]; map(flat_list.extend, list_of_list)
#10000 loops, best of 3: 119 µs per loop
%timeit flat_list=list(itertools.chain.from_iterable(list_of_list))
#1000 loops, best of 3: 210 µs per loop
%timeit flat_list=[i for sublist in list_of_list for i in sublist]
#1000 loops, best of 3: 525 µs per loop
%timeit flat_list=reduce(list.__add__,list_of_list)
#100 loops, best of 3: 18.1 ms per loop

Now, the efficiency gain appears better when processing longer sublists:

list_of_list = [range(1000)]*10
%timeit flat_list=[]; map(flat_list.extend, list_of_list)
#10000 loops, best of 3: 60.7 µs per loop
%timeit flat_list=list(itertools.chain.from_iterable(list_of_list))
#10000 loops, best of 3: 176 µs per loop

And this method also works with any iterable object:

class SquaredRange(object):
    def __init__(self, n): 
        self.range = range(n)
    def __iter__(self):
        for i in self.range: 
            yield i**2

list_of_list = [SquaredRange(5)]*3
flat_list = []
map(flat_list.extend, list_of_list)
print flat_list
#[0, 1, 4, 9, 16, 0, 1, 4, 9, 16, 0, 1, 4, 9, 16]
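Note that the `map(flat_list.extend, ...)` trick relies on Python 2's eager `map`; in Python 3 `map` is lazy and the `extend` calls never execute, as a comment below points out. A portable sketch of the same idea:

```python
from itertools import chain

list_of_list = [[1, 2], [3], [5, 89], [], [6]]

# Python 3's map() is lazy: map(flat_list.extend, list_of_list) only
# builds a map object without calling extend. A plain loop is the
# portable spelling of the same technique:
flat_list = []
for sublist in list_of_list:
    flat_list.extend(sublist)

print(flat_list)  # [1, 2, 3, 5, 89, 6]
assert flat_list == list(chain.from_iterable(list_of_list))
```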
Juh_
  • 1
    Not working in 3.9 – misantroop Jun 21 '22 at 15:25
  • This code as written doesn't work, `map(flat_list.extend, list_of_list)` references `list_of_list` before it's defined. Can you please fix your code? Also, you need to explain that the `%timeit` magic comes from ipython, not base Python. – smci Jun 24 '23 at 23:30
3

If you're looking for a built-in, simple one-liner, you can use:

a = [[1, 2, 3], [4, 5, 6]]
b = [i[x] for i in a for x in range(len(i))]
print b

returns

[1, 2, 3, 4, 5, 6]
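Indexing with `range(len(i))` works, but iterating the sublists directly is the more common spelling of the same comprehension:

```python
a = [[1, 2, 3], [4, 5, 6]]

# Iterate each sublist's elements directly instead of indexing by position.
b = [x for i in a for x in i]
print(b)  # [1, 2, 3, 4, 5, 6]
```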
3
def is_iterable(item):
    return isinstance(item, (list, tuple))


def flatten(items):
    for i in items:
        if is_iterable(i):
            for m in flatten(i):
                yield m
        else:
            yield i

Test:

print list(flatten([1.0, 2, 'a', (4,), ((6,), (8,)), (((8,),(9,)), ((12,),(10)))]))
kopos
  • 1
    This might flatten strings into individual characters, which might not be intended behaviour? – ericmjl Mar 23 '18 at 12:11
  • 1
    Yes, I did not consider that condition. Thanks. – kopos Mar 24 '18 at 06:34
  • @kopos, thanks for your solution, but, I am getting this error `for m in flatten(i): [Previous line repeated 996 more times] RecursionError: maximum recursion depth exceeded` on your input `A = [1.0, 2, 'a', (4,), ((6,), (8,)), (((8,),(9,)), ((12,),(10)))]` and `A = ['str1', [[[['str2']]]], [['str3'], 'str4'], 'str5']`, but it works fine on this input `A = [1, [[[[2]]]], [[3], 4], 5]`. Do you know what's the reason for its failure? and how to fix it? any suggestions? – Anu Jan 12 '19 at 18:47
  • @kopos, I got a fix now!, you need to add one more condition to your if statement `and not isinstance(i,str )` to safeguard against strings in the list while flattening! – Anu Jan 13 '19 at 00:35
  • 1
    @anu: Yes that fix works! But the problem is, we are identifying the collection type based on `hasattr` and `isinstance`. If we know the type of collection nodes, the fn can be customized for the same. We might have to tweak the function too based on how would it need to behave if the collection is a `set` – kopos Jan 14 '19 at 10:21
  • @kopos, you are right, there is a way you can generalize over collections, by using `typing.Sequence` and use `Sequence` instead of fixed collection type such as `list, strings, sets, range etc!`, moreover just to safeguard against strings we add the previous condition as I mentioned before [here is what I am talking about](https://qr.ae/TUngyN) – Anu Jan 14 '19 at 17:39
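Following the comment thread, a `Sequence`-based variant with the string guard @anu suggests might look like this (a sketch, not the author's original code):

```python
from collections.abc import Sequence


def flatten(items):
    # Treat any Sequence except str as a nested collection; strings
    # stay atomic rather than being split into individual characters.
    for i in items:
        if isinstance(i, Sequence) and not isinstance(i, str):
            for m in flatten(i):
                yield m
        else:
            yield i


print(list(flatten(['str1', [[[['str2']]]], [['str3'], 'str4'], 'str5'])))
# ['str1', 'str2', 'str3', 'str4', 'str5']
```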
2

pylab provides a `flatten`: see numpy's flatten documentation.

ntg
  • Note: Flatten does not work with jagged arrays. Try using [hstack](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.hstack.html) instead. – Stewbaca Jan 27 '16 at 17:48
2

What about:

from operator import add
reduce(add, map(lambda x: list(x.image_set.all()), [mi for mi in list_of_menuitems]))

But Guido recommends against packing too much into a single line of code, since it reduces readability. There is minimal, if any, performance gain from doing what you want in a single line vs. multiple lines.
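The multi-line version that this advice points toward is straightforward; with a plain list of lists standing in for the question's Django QuerySets, it might read:

```python
from functools import reduce  # reduce moved to functools in Python 3
from operator import add

# Stand-in data: the question's list_of_menuitems holds QuerySets,
# but any list of iterables behaves the same way here.
list_of_menuitems = [[1, 2], [3], [5, 89], [], [6]]

# The explicit-loop equivalent of the reduce(add, ...) one-liner:
images = []
for mi in list_of_menuitems:
    images.extend(mi)

assert images == reduce(add, (list(mi) for mi in list_of_menuitems))
print(images)  # [1, 2, 3, 5, 89, 6]
```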

daniel
  • 2
It's incredibly satisfying performing some crazy amount of work in a single line... but it's really just syntactic sugar – daniel Jan 02 '09 at 06:37
  • 1
    If I remember correctly, Guido is actually recommending against the use of reduce and list comprehensions as well... I disagree though, they are incredibly useful. – daniel Jan 02 '09 at 06:39
  • 1
    Check the performance of this little nugget versus a multi-line function. I think you'll find that this one-liner is a real dog. – S.Lott Jan 02 '09 at 11:21
  • 1
    probably, mapping with lambdas is horrible. the overhead incurred for each function call sucks the life out of your code. I never said that that particular line was as fast as a multiple line solution... ;) – daniel Jan 03 '09 at 03:34
1

If each item in the list is a string (and any strings inside those strings use " " rather than ' '), you can use regular expressions (the re module):

>>> flattener = re.compile("\'.*?\'")
>>> flattener
<_sre.SRE_Pattern object at 0x10d439ca8>
>>> stred = str(in_list)
>>> outed = flattener.findall(stred)

The above code converts in_list into a string, uses the regex to find all the substrings within quotes (i.e. each item of the list, with the quote characters still attached) and spits them out as a list.

user22723
1

A simple alternative is to use numpy's concatenate, but here it converts the contents to float:

import numpy as np
print np.concatenate([[1,2],[3],[5,89],[],[6]])
# array([  1.,   2.,   3.,   5.,  89.,   6.])
print list(np.concatenate([[1,2],[3],[5,89],[],[6]]))
# [  1.,   2.,   3.,   5.,  89.,   6.]
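The float conversion comes from the empty sublist: `np.asarray([])` defaults to float64, and `concatenate` promotes the result dtype to match. A sketch (requires numpy installed):

```python
import numpy as np

# Without an empty sublist, the integer dtype survives:
print(np.concatenate([[1, 2], [3], [6]]))  # [1 2 3 6], integer dtype

# An empty sublist becomes a float64 array, so the result is promoted:
promoted = np.concatenate([[1, 2], [3], [], [6]])
print(promoted.dtype)  # float64

# Cast back if integers are wanted:
print(promoted.astype(int).tolist())  # [1, 2, 3, 6]
```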
strpeter
1

The easiest way to achieve this in either Python 2 or 3 is to use the morph library (`pip install morph`).

The code is:

import morph

my_list = [[1, 2], [3], [5, 89], [], [6]]  # avoid shadowing the built-in list
flattened_list = morph.flatten(my_list)  # returns [1, 2, 3, 5, 89, 6]
YPCrumble
  • 1
    "easiest" is [a strong word](http://stackoverflow.com/a/19053585/923794) – cfi Dec 17 '15 at 12:19
  • @cfi The answer you suggested doesn't work in Python 2 and from the comments doesn't sound like it is even an acceptable answer in Python 3. The morph library is a simple one function solution like you have in lodash for javascript. In any case I edited my answer to clarify that it is the easiest solution that works across Python 2 and 3. – YPCrumble Dec 17 '15 at 17:46
  • I do apologize. My comment was a bit lazy, especially since you pointed out my own comment on the other post. The point I wanted to make is that "easiest" is a superlativ which is hard to achieve. Your suggestion requires an external library which might be hard to install for some (even with venv and such). Since the question is about "shallow" lists and about "balancing performance and readability" your answer might(!) win on readability. But [this one](http://stackoverflow.com/a/32253807/923794) wins on performance and is easier in that it needs no dependencies. – cfi Dec 17 '15 at 18:27
  • @cfi yeah - mine might be the "lazy man's approach". For me, seeing all these ways of flattening made me want to just find a quick library command like I found with morph. The nice thing about this library is that it's much smaller than numpy (I have to use a swapfile to install numpy on small server instances). It basically uses the function you describe in your second comment; the other option would have been for me to use that as a helper function in my code. No problem at all, thanks for pointing out the options :). – YPCrumble Dec 17 '15 at 18:53
0

In Python 3.4 you will be able to do:

[*innerlist for innerlist in outer_list]
elyase
  • 1
    Hm. While I'd welcome this, this was already discussed way back for Py3.0. Now the [PEP 448](http://www.python.org/dev/peps/pep-0448/) is there, but still in 'Draft' mode. The related [bug ticket](http://bugs.python.org/issue2292) is still in 'patch review' with a yet incomplete patch. Until the bug isn't marked as 'commited' I'd be careful with getting hopes up and saying 'you will be abled to do'. – cfi Sep 27 '13 at 17:33
  • I get what you mean but it was recently announced at Kiwi PyCon 2013 by one of the core developers as "accepted for release" in 3.4. Still not 100% sure but I guess highly probable. – elyase Sep 27 '13 at 21:09
Let's both hope it's just the docs lagging behind the code as usual for sw before any release ;-) – cfi Sep 27 '13 at 22:12
  • `SyntaxError: can use starred expression only as assignment target` – dawg Apr 09 '15 at 21:05
  • 4
    This syntax was [not accepted](https://www.python.org/dev/peps/pep-0448/#variations) in the final PEP 448 – dawg Apr 09 '15 at 21:13
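As the final comments note, the comprehension form was rejected; what PEP 448 (Python 3.5+) did accept is star-unpacking inside literal displays, which only helps when the sublists are spelled out individually:

```python
outer_list = [[1, 2], [3], [5, 89], [], [6]]

# Accepted by PEP 448: unpacking inside a list display.
flat_two = [*outer_list[0], *outer_list[1]]
print(flat_two)  # [1, 2, 3]

# Rejected by PEP 448 -- still a SyntaxError:
# [*innerlist for innerlist in outer_list]

# The working general-purpose spelling remains the nested comprehension:
flat = [image for innerlist in outer_list for image in innerlist]
print(flat)  # [1, 2, 3, 5, 89, 6]
```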