
What is the fastest way to split a list into multiple sublists based on conditions? Each condition represents a separate sublist.

One way to split a listOfObjects into sublists (three sublists for demonstration, but more are possible):

listOfObjects = [.......]
l1, l2, l3 = [], [], []
for l in listOfObjects:
    if l.someAttribute == "l1":
        l1.append(l)
    elif l.someAttribute == "l2":
        l2.append(l)
    else:
        l3.append(l)

This way does not seem pythonic at all and also takes quite some time. Are there faster approaches, e.g. using map?

There is a similar question, but with only one condition, i.e., two result lists: How to split a list based on a condition?

mkrieger1
CLRW97

2 Answers


You could use collections.defaultdict here for the mapping.

from collections import defaultdict

d = defaultdict(list)

for l in listOfObjects:
    d[l.someAttribute].append(l)

out = d.values()  # all sublists, one per attribute value
l1, l2, l3 = d['l1'], d['l2'], d['l3']

d would be of the form:

{ 
  attr1 : [...],
  attr2 : [...],
  ...
  attrn : [...]
}
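
For a self-contained run, here is a minimal sketch of the same grouping. The Item class, its payload field, and the sample values are made up for illustration; only someAttribute comes from the question.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:  # hypothetical stand-in for the objects in listOfObjects
    someAttribute: str
    payload: int


listOfObjects = [Item("l1", 0), Item("l2", 1), Item("l1", 2), Item("l3", 3)]

d = defaultdict(list)
for obj in listOfObjects:
    # One hash lookup per object; a missing key gets a fresh empty list.
    d[obj.someAttribute].append(obj)

# Reading with .get() avoids inserting a new key for a category that never occurred.
l1 = d.get("l1", [])
l2 = d.get("l2", [])
l3 = d.get("l3", [])
```

Reading with d['l3'] as in the snippet above also works; the only difference is that indexing a defaultdict adds an empty list under a key that was not there before.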
Ch3steR
  • I really like your answer since it's a readable and scalable solution. Now to the second part of my question: Could you provide some speed tests? – CLRW97 Jan 19 '22 at 10:09
  • @MichaelSzczesny I assumed it was just for demonstration. I thought OP wanted to split(group) them based on attribute value. My bad. – Ch3steR Jan 19 '22 at 10:18
  • @MichaelSzczesny I'll delete my answer if OP wants a default category too. – Ch3steR Jan 19 '22 at 10:18
  • @MichaelSzczesny I've undeleted my answer after OP clarified in [comments](https://stackoverflow.com/questions/70768463/fastest-way-to-split-list-into-multiple-sublists-based-on-several-conditions/70768539#comment125108478_70768463) that they want *"one separate list per category"*. – Ch3steR Jan 19 '22 at 10:37

That similar question's answer is amazing. I hadn't thought of that approach for splitting... Anyway, you can do something similar, but it would be less readable:

for l in listOfObjects:
    (l3, l2, l1)[(l.someAttribute == "l1")*2 or l.someAttribute == "l2"].append(l)

This will work for any boolean conditions. or returns its first truthy operand, or the last operand if none is truthy (here False, which indexes as 0). True == 1, so we multiply the first condition by 2 to get the index 2 for l1.

But as I said, it's not really readable. And not scalable.

As for speed: or short-circuits and returns the first truthy value, so the conditions are checked in the same order and number as in your if/elif chain, and the cost should be similar. You might want to define the lookup tuple outside of the loop so it isn't rebuilt on every iteration.
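
To make the index arithmetic concrete, here is a small sketch with the tuple hoisted out of the loop. The sample objects are hypothetical (built with SimpleNamespace just to have a someAttribute), and the name targets is mine.

```python
from types import SimpleNamespace

# Hypothetical sample objects; only someAttribute comes from the question.
listOfObjects = [SimpleNamespace(someAttribute=a) for a in ("l1", "l2", "x", "l1")]

l1, l2, l3 = [], [], []
targets = (l3, l2, l1)  # built once, outside the loop

for obj in listOfObjects:
    attr = obj.someAttribute
    # "l1"          -> True * 2 == 2        -> targets[2] is l1
    # "l2"          -> 0 or True == True    -> targets[1] is l2
    # anything else -> 0 or False == False  -> targets[0] is l3
    index = (attr == "l1") * 2 or attr == "l2"
    targets[index].append(obj)
```

Indexing with a bool works because bool is a subclass of int in Python.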


A more readable option uses a dict, since your conditions are all equality checks (note: the attribute value also has to be hashable):

lookup = {"l1": l1, "l2": l2}
for l in listOfObjects:
    lookup.get(l.someAttribute, l3).append(l)

dict.get takes a default value as its second argument, so it's perfect for our else catch-all.

In terms of speed: a dictionary lookup is a single hash lookup no matter how many categories there are, as opposed to a chain of or conditions or a chain of ifs, which may test every condition in turn.
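
If you want to check this on your own data, a rough timing harness could look like the sketch below. The function names, the 10,000 sample objects and the run count are arbitrary choices of mine, not measurements from the question, and with only two or three conditions the difference may well be small.

```python
import timeit
from types import SimpleNamespace

# Hypothetical data: 10,000 objects cycling through four attribute values.
values = ("l1", "l2", "l3", "other")
listOfObjects = [SimpleNamespace(someAttribute=values[i % 4]) for i in range(10_000)]


def split_if_chain(objs):
    """The if/elif/else loop from the question."""
    l1, l2, l3 = [], [], []
    for obj in objs:
        if obj.someAttribute == "l1":
            l1.append(obj)
        elif obj.someAttribute == "l2":
            l2.append(obj)
        else:
            l3.append(obj)
    return l1, l2, l3


def split_dict_get(objs):
    """The dict.get variant with l3 as the catch-all."""
    l1, l2, l3 = [], [], []
    lookup = {"l1": l1, "l2": l2}
    for obj in objs:
        lookup.get(obj.someAttribute, l3).append(obj)
    return l1, l2, l3


# Both variants must agree before their timings are worth comparing.
assert split_if_chain(listOfObjects) == split_dict_get(listOfObjects)

for func in (split_if_chain, split_dict_get):
    seconds = timeit.timeit(lambda: func(listOfObjects), number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```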

mkrieger1
h4z3
  • OP mentioned *"three sublists for demonstration, but **more are possible**"*; scaling up would be very difficult with the above approach. – Ch3steR Jan 19 '22 at 10:06
  • That's why there's also dict approach in there. ;) – h4z3 Jan 19 '22 at 10:09
  • How would you scale your dictionary version, unless you know all the attribute values beforehand? – Ch3steR Jan 19 '22 at 10:11
  • OP demonstrated ifs with equality, suggesting they know the values from somewhere. Also, considering their use of `else`, I assumed they want to single out only some of the values and have a catch-all for the rest. If they don't want the catch-all, then yes, defaultdict is better. | But even if we come to the conclusion that it's not the best solution for this case, I'll leave my answer in there, in case people with similar problems come in here. :) – h4z3 Jan 19 '22 at 10:21
  • Got you. :) The question is unclear whether OP wants the "default" category. I hope they clarify the question. – Ch3steR Jan 19 '22 at 10:25