1

My goal

My question is about a list comprehension that does not put elements into the resulting list as they are (which would result in a nested list), but instead extends the result into a flat list. So my question is not about flattening a nested list, but about how to get a flat list without creating a nested list in the first place.

Example

Consider that I have class instances with an attribute that contains a list of integers:

class Foo:
    def __init__(self, l):
        self.l = l

foo_0 = Foo([1, 2, 3])
foo_1 = Foo([4, 5])
list_of_foos = [foo_0, foo_1]

Now I want to have a list of all integers in all instances of Foo. My best solution using extend is:

result = []
for f in list_of_foos:
    result.extend(f.l)

As desired, result is now [1, 2, 3, 4, 5].

Is there something better? For example list comprehensions?

Since I expect a list comprehension to be faster, I'm looking for a Pythonic way to get the desired result with a list comprehension. My best approach is to build a list of lists (a 'nested list') and then flatten it again, which seems quirky:

result = [item for sublist in [f.l for f in list_of_foos] for item in sublist]

What functionality I'm looking for

result = some_module.list_extends(f.l for f in list_of_foos)

Questions and Answers I read before

I was quite sure there would be an answer to this problem, but during my search I only found "list.extend and list comprehension", where the reason a nested list occurs is different, and "python list comprehensions; compressing a list of lists?", where the answers are about avoiding the nested list or how to flatten it.

Qaswed

3 Answers

1

You can use multiple fors in a single comprehension:

result = [
    n
    for foo in list_of_foos
    for n in foo.l
]

Note that the order of fors is from the outside in -- same as if you wrote a nested for-loop:

for foo in list_of_foos:
    for n in foo.l:
        print(n)
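As a quick check, here is a minimal sketch (reusing the `Foo` class and sample data from the question) suggesting that the two-`for` comprehension gives the same flat list as the `extend` loop, without building a nested list in between. The variable name `expected` is only for illustration.

```python
class Foo:
    def __init__(self, l):
        self.l = l

list_of_foos = [Foo([1, 2, 3]), Foo([4, 5])]

# The extend-based loop from the question, kept as a reference result.
expected = []
for f in list_of_foos:
    expected.extend(f.l)

# One comprehension with two for-clauses: outer loop first, inner loop second.
result = [n for foo in list_of_foos for n in foo.l]

print(result)              # [1, 2, 3, 4, 5]
print(result == expected)  # True
```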
Qaswed
matejcik
  • 1
    Hi, thanks! `for n in sublist` does not work, since `sublist` is an instance of class `Foo` and not a list. But if you change it to `for n in sublist.l` it works and I'll accept your answer. Or, for better readability, `[n for foo in list_of_foos for n in foo.l]` – Qaswed Aug 01 '22 at 15:43
0

If you want to combine multiple lists as if they were all one list, I'd immediately think of itertools.chain. However, you have to access an attribute on each item, so we're also going to need operator.attrgetter. To put those together, I used map and itertools.chain.from_iterable():

https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable

from itertools import chain
from operator import attrgetter

class Foo:
    def __init__(self, l):
        self.l = l

foo_0 = Foo([1, 2, 3])
foo_1 = Foo([4, 5])
list_of_foos = [foo_0, foo_1]
for item in chain.from_iterable(map(attrgetter('l'), list_of_foos)):
    print(item)

That demonstrates iterating through iterators with chain, as if they were one. If you don't specifically need to keep the list around, don't. But in case you do, here is the comprehension:

final = [item for item in chain.from_iterable(map(attrgetter('l'), list_of_foos))]
print(final)

[1, 2, 3, 4, 5]
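If the goal is only the final list, a slightly shorter variant may be to hand a generator expression to `chain.from_iterable` and wrap it in `list()`. This is a sketch rather than part of the answer above; it swaps `map` and `attrgetter` for a generator expression and assumes the same `Foo` class and data as the question. The name `flat` is illustrative.

```python
from itertools import chain

class Foo:
    def __init__(self, l):
        self.l = l

list_of_foos = [Foo([1, 2, 3]), Foo([4, 5])]

# The generator expression yields each instance's list lazily;
# chain.from_iterable then walks through them as one sequence.
flat = list(chain.from_iterable(foo.l for foo in list_of_foos))

print(flat)  # [1, 2, 3, 4, 5]
```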

Kenny Ostrom
0

With lists, you can use the + operator to concatenate two or more lists. It acts like extend, except that it returns a new list instead of modifying one in place.

foo_0.l + foo_1.l
Out[7]: [1, 2, 3, 4, 5]

Or you can use sum to perform this operation:

sum([foo_0.l, foo_1.l], [])
Out[15]: [1, 2, 3, 4, 5]

In fact, it's in one of the posts you have read ;)

  • Hi, in my real world situation I do not have 2 class instances but hundreds/thousands. – Qaswed Aug 01 '22 at 16:38
  • With `sum(..., [])` you introduce a quadratic bottleneck in the program; check the docs of `sum` for the recommended way to chain lists – cards Aug 01 '22 at 16:39
  • @cards thanks for pointing that out! I just realized this after doing some research when you mentioned it. :D –  Aug 01 '22 at 17:06
  • Nevertheless, interestingly, I tried the sum with 3 million values and 2 million values and compared it with @matejcik's answer. It was three times faster. –  Aug 01 '22 at 17:14
  • @KevinChoonLiangYew Maybe my initial comment was not good enough. So would you recommend `sum([foo.l for foo in list_of_foos], [])` in my real-world case, where I have not only foo_0 and foo_1 but maybe up to foo_3000 (which I won't type out by hand)? – Qaswed Aug 02 '22 at 11:59
  • If you have created a `list_of_foos`, yes, I was recommending it. I did some more research on this method, and apparently it's the fastest among the methods here. But of course, you can always try both methods and see which performs better for you. https://chrisconlan.com/fastest-way-to-flatten-a-list-in-python/ –  Aug 02 '22 at 14:28
  • @KevinChoonLiangYew As my question is about `list_of_foos` (and not only a few foos), I would appreciate and upvote your answer if you made it more general for `list_of_foos` :) – Qaswed Aug 04 '22 at 12:44
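Since the comments disagree on which approach is faster, one way to settle it for your own data is to time the candidates directly. The sketch below is only an illustration: the helper names (`flatten_sum`, `flatten_chain`, `flatten_comprehension`) and the test data (2,000 instances of 5 integers each) are assumptions, not taken from the thread, and the timings will vary with list sizes and Python version, so no results are quoted here.

```python
from itertools import chain
from timeit import timeit

class Foo:
    def __init__(self, l):
        self.l = l

# Hypothetical test data: 2,000 instances holding 5 integers each.
list_of_foos = [Foo(list(range(i, i + 5))) for i in range(2000)]

def flatten_sum(foos):
    # Each addition creates a new list, so the work grows with the
    # square of the number of sublists.
    return sum([foo.l for foo in foos], [])

def flatten_chain(foos):
    # Iterates over every element exactly once.
    return list(chain.from_iterable(foo.l for foo in foos))

def flatten_comprehension(foos):
    # The two-for comprehension from the accepted approach.
    return [n for foo in foos for n in foo.l]

# All three approaches should agree before their speed is compared.
assert flatten_sum(list_of_foos) == flatten_chain(list_of_foos)
assert flatten_chain(list_of_foos) == flatten_comprehension(list_of_foos)

for fn in (flatten_sum, flatten_chain, flatten_comprehension):
    seconds = timeit(lambda: fn(list_of_foos), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

Changing the number of instances and the length of each sublist in the test data should show how each approach scales for a case closer to your own.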