72

I have a list that looks like this:

[[1,2,3],[1,2],[1,4,5,6,7]]

and I want to flatten it into [1,2,3,1,2,1,4,5,6,7].

Is there a lightweight function to do this without using numpy?

Karl Knechtel
wakeupbuddy
  • If the goal is to do something *without* Numpy, then it *isn't* a Numpy question and *shouldn't* be tagged that way. That said, it seems strange to *expect* Numpy to be helpful here, since the inputs aren't the same length and thus there's no approach involving a *rectangular* array. Sure, the elements shown here are all integers, but they'd have to be boxed anyway unless we *start with* Numpy arrays. – Karl Knechtel Sep 06 '22 at 00:54

3 Answers

119

Without numpy (ndarray.flatten), one way is chain.from_iterable, an alternate constructor for itertools.chain:

>>> from itertools import chain
>>> list(chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]]))
[1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

Another Pythonic approach is a list comprehension:

[j for sub in [[1,2,3],[1,2],[1,4,5,6,7]] for j in sub]
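
The `for` clauses in the comprehension appear in the same order as they would in the equivalent nested loops, which is the part readers often find confusing. A minimal sketch of that equivalence (the names `nested`, `flat_loop` and `flat_comp` are illustrative only):

```python
nested = [[1, 2, 3], [1, 2], [1, 4, 5, 6, 7]]

# Explicit nested loops: outer loop first, inner loop second.
flat_loop = []
for sub in nested:
    for j in sub:
        flat_loop.append(j)

# The comprehension keeps the `for` clauses in the same order.
flat_comp = [j for sub in nested for j in sub]

print(flat_loop == flat_comp)  # True
print(flat_comp)  # [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]
```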

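Another functional approach, suitable only for short lists, is reduce (a built-in in Python 2, functools.reduce in Python 3). Don't use it for long lists: each + builds a new list, so the total running time grows quadratically with the number of sublists: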

In [4]: from functools import reduce # Python3

In [5]: reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])
Out[5]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

To make it slightly faster, you can use operator.add from the standard library instead of a lambda:

In [6]: from operator import add

In [7]: reduce(add ,[[1,2,3],[1,2],[1,4,5,6,7]])
Out[7]: [1, 2, 3, 1, 2, 1, 4, 5, 6, 7]

In [8]: %timeit reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])
789 ns ± 7.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [9]: %timeit reduce(add ,[[1,2,3],[1,2],[1,4,5,6,7]])
635 ns ± 4.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

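Benchmark (Python 2). Note that the chain.from_iterable line also times the import and only builds the lazy chain object without consuming it, so its figure is not directly comparable with the others: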

:~$ python -m timeit "from itertools import chain;chain.from_iterable([[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 1.58 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[[1,2,3],[1,2],[1,4,5,6,7]])"
1000000 loops, best of 3: 0.791 usec per loop
:~$ python -m timeit "[j for i in [[1,2,3],[1,2],[1,4,5,6,7]] for j in i]"
1000000 loops, best of 3: 0.784 usec per loop

A benchmark of @Will's answer, which uses sum (it's fast for short lists but not for long lists):

:~$ python -m timeit "sum([[1,2,3],[4,5,6],[7,8,9]], [])"
1000000 loops, best of 3: 0.575 usec per loop
:~$ python -m timeit "sum([range(100),range(100)], [])"
100000 loops, best of 3: 2.27 usec per loop
:~$ python -m timeit "reduce(lambda x,y :x+y ,[range(100),range(100)])"
100000 loops, best of 3: 2.1 usec per loop
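
For readers who want to repeat the comparison on larger inputs, here is a rough sketch using the standard `timeit` module. The helper names (`make_nested`, `compare`, and the `flatten_*` functions) and the input sizes are assumptions made for illustration, and absolute timings will vary by machine and Python version.

```python
from functools import reduce
from itertools import chain
from operator import add
import timeit


def make_nested(n_sublists, sublist_len):
    """Build a hypothetical test input: n_sublists lists of sublist_len ints."""
    return [list(range(sublist_len)) for _ in range(n_sublists)]


def flatten_chain(nested):
    return list(chain.from_iterable(nested))


def flatten_comprehension(nested):
    return [item for sub in nested for item in sub]


def flatten_reduce(nested):
    # Each + builds a new list, so total work grows quadratically.
    return reduce(add, nested, [])


def flatten_sum(nested):
    return sum(nested, [])


def compare(nested, number=5):
    """Return the seconds taken by `number` runs of each approach."""
    funcs = (flatten_chain, flatten_comprehension, flatten_reduce, flatten_sum)
    return {f.__name__: timeit.timeit(lambda: f(nested), number=number) for f in funcs}


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, compare(make_nested(n, 10)))
```

On typical runs the gap between the linear approaches (chain, comprehension) and the repeated-concatenation ones (reduce, sum) should widen as the number of sublists grows.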
Mazdak
  • how does `sum([[1,2,3],[4,5,6],[7,8,9]], [])` compare to these? – will Mar 24 '15 at 23:12
  • @will For short lists it's faster than reduce, but for longer lists it's not! – Mazdak Mar 25 '15 at 07:02
  • 3
    @Kasramvd Awesome response! But I got confused about how a non-nested list comprehension like `[j for i in [[1,2,3],[1,2],[1,4,5,6,7]] for j in i]` could flatten the 2d list, could you give some more illustration? – YC_ Jun 09 '18 at 10:11
  • 1
    @WeisiZhan List comprehensions of this kind are usually called nested because of the nested `for` loops. To understand how they behave, write the equivalent nested for loop and append each item to a previously defined list, like `lst = []; for sublist in all_lists: for item in sublist: lst.append(item)` – Mazdak Jun 09 '18 at 10:41
  • @Kasramvd Thanks! :-) – YC_ Jun 10 '18 at 15:32
  • `[j for sub in [[1,2,3],[1,2],[1,4,5,6,7]] for j in sub]` This is the way :-) – Marcus Vinicius Pompeu Jan 26 '20 at 04:43
  • To me the list comprehension one doesn't _read right_. It feels like something is off about it - I always seem to get it wrong and end up googling. To me this reads right: `[x for x in row for row in matrix]`, but I guess I just have to rewire my brain to the way it is. – Sнаđошƒаӽ Jul 12 '21 at 17:12
93

For a list like this, my favourite neat little trick is to use sum.

sum has an optional argument, sum(iterable [, start]), so you can do:

list_of_lists = [[1,2,3], [4,5,6], [7,8,9]]
print(sum(list_of_lists, [])) # [1, 2, 3, 4, 5, 6, 7, 8, 9]

This works because the + operator happens to be the concatenation operator for lists, and you've told sum that the starting value is [], an empty list.

However, the documentation for sum recommends itertools.chain for concatenating a series of iterables, and it is also much clearer.
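
To make the mechanics concrete, here is a small sketch of what `sum(list_of_lists, [])` computes, next to the `itertools.chain` form. The variable names `via_sum`, `by_hand` and `via_chain` are illustrative only.

```python
from itertools import chain

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# sum starts from [] and applies + to each sublist in turn.
via_sum = sum(list_of_lists, [])
by_hand = [] + [1, 2, 3] + [4, 5, 6] + [7, 8, 9]

# chain.from_iterable yields items lazily, without building intermediate lists.
via_chain = list(chain.from_iterable(list_of_lists))

print(via_sum == by_hand == via_chain)  # True
```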

will
  • How do I do this for a list of strings? – Pyd Nov 17 '17 at 09:34
  • 1
    @pyd the code above works for any object type... why not test before asking? – will Nov 19 '17 at 01:46
  • 3
    My list looks like `["A",["B","C"],"D",["E","F"]]`, and it did not work for that. – Pyd Nov 19 '17 at 02:37
  • 5
    @pyd It's not working because "A" isn't a list but ["B","C"] is, so when it tries to use + to concatenate them it fails (you can't add a string to a list). – Supamee Jul 19 '18 at 17:22
  • Using `+` as concatenation in `sum`, it is like a miracle. Thank you. This is great solution for any list of lists. – Alperen Aug 16 '20 at 11:23
  • This is as Pythonic as it gets. – unltd_J Apr 16 '22 at 20:17
  • @unltd_J i used to think it was pythonic, but now i much prefer readability over perl-esque line noise. nowadays i would go with something from the itertools package. – will Apr 19 '22 at 20:08
9

This will work in your particular case. A recursive function would work best if you have multiple levels of nested iterables.

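def flatten(nested):
    # `nested` replaces the original name `input`, which shadows a built-in.
    new_list = []
    for sublist in nested:      # each inner list
        for item in sublist:    # each element of that inner list
            new_list.append(item)
    return new_list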
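
For the multi-level case mentioned above, a recursive version might look like the following sketch. The name `flatten_deep` and the choice to descend only into lists and tuples (leaving strings and other items whole) are assumptions made for illustration.

```python
def flatten_deep(nested):
    """Recursively flatten lists and tuples nested to any depth."""
    flat = []
    for item in nested:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_deep(item))
        else:
            # Strings and other non-list items are kept whole.
            flat.append(item)
    return flat


print(flatten_deep([1, [2, [3, [4, 5]], 6], "ab"]))  # [1, 2, 3, 4, 5, 6, 'ab']
```

Because it checks each item's type, this sketch also handles mixed input such as `["A", ["B", "C"], "D"]`, at the cost of being limited by Python's recursion depth for very deeply nested data.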
AlexMayle