Move all `None` to the end of a list (python3)

Question

Given a list like

[None, None, 90, 10, None, 34, None, 108]

What's an easy way to move all Nones to the end without disturbing the order of the other elements? So, you'd get something like

[90, 10, 34, 108, None, None, None, None]

One possible solution is using filter with None and then just appending Nones at the end:

In [890]: l2 = list(filter(None, l))

In [891]: l2 += [None] * (len(l) - len(l2))

In [892]: l2
Out[892]: [90, 10, 34, 108, None, None, None, None]

However,

filter(None, ...) doesn't mean "filter out Nones"; None is special-cased to the identity predicate, so this will also filter out 0, False, and anything else that returns False from __bool__.

So, this isn't a good solution. Is there anything better?

You should be more than able to do the translation of “moving X to the beginning” to “moving Y to the end”. Also, please do not use your gold badge to reopen your own question without other’s support. — poke, Aug 03 '17 at 08:05
I don't see anything in this question or the proposed solutions that is inextricably bound to the version of Python. I'm with @poke. — PaulMcG, Aug 03 '17 at 08:08
And that matters why? There is nothing version specific about this at all. Some of the answers here even assume Python 2. — poke, Aug 03 '17 at 08:08

Massimiliano · Answer 1 · 2017-08-03T07:29:19.497

8

This will preserve the sorting of the list:

>>> l = [None, None, 90, 10, None, 34, None, 108]
>>> sorted(l, key = lambda x: x is None)
[90, 10, 34, 108, None, None, None, None]

because, quoting python docs:

The built-in sorted() function is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

edited Aug 03 '17 at 07:29

answered Aug 03 '17 at 05:33

Massimiliano

7,842
2
47
62

Plus1. Nice. I initially tried this but could't solve it. – Rahul Aug 03 '17 at 05:54
_"This should preserve the sorting"_ This _will_ preserve the order of the list elements. From [the docs](https://docs.python.org/3.5/library/functions.html#sorted): "The built-in sorted() function is guaranteed to be stable." – Aran-Fey Aug 03 '17 at 07:25
@Rawing Thanks, updated the post – Massimiliano Aug 03 '17 at 07:29
1

Note that this solution is still O(n log n) though since sorting doesn’t get cheaper with a simpler key function. – poke Aug 03 '17 at 07:30

Rahul · Accepted Answer · 2017-08-03T06:49:52.477

5

I don't know Big O but this:

l2 = [x for x in l if x is not None]
l2 += [None] * l.count(None)

just works, and

l2.extend([None] * l.count(None))

is loved by many pythonists and

l2 += [None] * (len(l) - len(l2))

is better.

edited Aug 03 '17 at 06:49

answered Aug 03 '17 at 05:27

Rahul

10,830
4
53
88

I would recommend changing the last line into something like `l2.extend(None for x in range(none_count)`. The generator expression would be faster and less memory intensive since you don't have to allocate a separate array. – Zizouz212 Aug 03 '17 at 05:31
@Zizouz212 :Done it. – Rahul Aug 03 '17 at 05:32
`l = [x for x in l if x is not None] + [x for x in l if x is None]` – DragonBobZ Aug 03 '17 at 05:32
1

@Zizouz212: Generator expressions usually aren't faster, actually. Their benefits are memory consumption and the ability to stop them early. – user2357112 Aug 03 '17 at 05:33
@user2357112 I think it just depends on the use case. Most of my cases have always had them faster by a few seconds, but I think that has to do with allocation rather than the actual time to execute. – Zizouz212 Aug 03 '17 at 05:36
I preferred `l2.extend([None] * l.count(None))` to the current solution actually. But this is quite nice, (better than the sorted version). – cs95 Aug 03 '17 at 05:47
This works, although it does two linear scans over `l` twice; that also applies to DragonBobZ's suggestion. Using a traditional `for` loop you can do it in a single scan. – PM 2Ring Aug 03 '17 at 06:02
Original idea iterates twice (once for filtering the list, once for counting the `None`s). Last one only iterates once and `len(list)` is constant time, so this is very efficient. Time complexity is O(n) for both though (at least ignoring the list operations). – poke Aug 03 '17 at 07:28

user2357112 · Answer 3 · 2017-08-03T07:38:11.647

5

I'd build a list without Nones and pad it:

l2 = [x for x in l if x is not None]
l2 += [None] * (len(l) - len(l2))

Alternatively, there's a one-liner. I think it's theoretically O(nlogn), but I haven't observed that in practice - I think Timsort's galloping mode reduces the comparisons involved to something closer to but not quite O(n), and the data movement costs have too low a constant factor to dominate the runtime even if they're O(nlogn). In practice, the bigger disadvantage is probably that you have to think about how booleans are ordered:

sorted(l, key=lambda x: x is None)

edited Aug 03 '17 at 07:38

answered Aug 03 '17 at 05:31

user2357112

260,549
28
431
505

Who downvoted this? It works perfectly. – cs95 Aug 03 '17 at 05:35
Also, I see I was on the right track with the padding. – cs95 Aug 03 '17 at 05:36
Why is this O(nlogn)? I think this is just O(n), and very pythonic – user2341726 Aug 03 '17 at 05:38
`lambda`s are not pythonic. – Rahul Aug 03 '17 at 05:39
@user2341726: `sorted` is O(nlogn) in general, and I don't think this hits any of the cases where TimSort gets to be O(n). – user2357112 Aug 03 '17 at 05:40
I'm not sure why an answer posted after this got more upvotes than this did though they both say the same thing, or the fact that 5 people attempted to answer this, but this question has no upvotes. – cs95 Aug 03 '17 at 05:44
Experiment suggests that this may be hitting one of the O(n) cases, or perhaps the overhead of the key computation is dominating the runtime. – user2357112 Aug 03 '17 at 05:48
@cᴏʟᴅsᴘᴇᴇᴅ The version using `sorted` is faster than you might expect because TimSort is optimised for lists that contain subsequences that are already sorted. – PM 2Ring Aug 03 '17 at 05:58
@user2357112 but you don't need to sort, the new list is generated by the ord r of old list – user2341726 Aug 03 '17 at 06:03
@PM2Ring: Yeah, but my experiments don't have significant sorted subsequences. – user2357112 Aug 03 '17 at 06:13
@user2341726: The `sorted` version is performing a full stable sort of the key values. – user2357112 Aug 03 '17 at 06:13

cdlane · Answer 4 · 2017-08-07T22:19:46.647

A solution to keep you up at night:

>>> example = [None, None, 90, 10, None, 34, None, 108]
>>> 
>>> def some(a, *b):
...     return ([*some(*b), a] if a is None else [a, *some(*b)]) if b else [a]
... 
>>> print(some(*example))
[90, 10, 34, 108, None, None, None, None]

Explanation

We pass the the data into our function using *example so that the elements of the list become arguments. We then segment these arguments at function call using (a, *b) to place the first argument in a and the rest in b. Recursively, if a is None then the answer is:

[*some(*b), a]

which moves a to the end. Otherwise, the answer is:

[a, *some(*b)]

which keeps a in front and processes the rest of the list.

Finally, we have the if b else [a] bit which is just the base case of our recursion.

Bake-Off

We've enough solutions that it's time for a bake-off. (Let me know if I've misrepresented your solution.) We'll be summing up ten timings of a 10,000 element list in which 30% of the elements are None. I've eliminated zeros from the data so that everyone can play:

import sys
import timeit
import random

A = [random.randint(1, 1000) if random.random() > 0.3 else None for _ in range(10000)]  # ~30% None

def Rahul_comprehension(A):

    B = [x for x in A if x is not None]
    B += [None] * A.count(None)

    return B

def coldspeed_filter(A):

    B = list(filter(None, A))
    B += [None] * (len(A) - len(B))

    return B

def user2357112_comprehension(A):

    B = [x for x in A if x is not None]
    B += [None] * (len(A) - len(B))

    return B

sys.setrecursionlimit(100000)
def cdlane_recursion(A):

    def some(a, *b):
        return ([*some(*b), a] if a is None else [a, *some(*b)]) if b else [a]

    return some(*A)

from functools import reduce
def PaulMcG_reduce(A):

    return sum(reduce(lambda a,b: a[b is None].append(b) or a, A, ([], [])), [])

from itertools import zip_longest
def PaulMcG_zip_longest(A):

    non_nones = filter(lambda x: x is not None, A)
    return next(zip(*zip_longest(non_nones, A)))

def Massimiliano_sorted(A):

    return sorted(A, key=lambda x: x is None)

# sanity check to make sure everyone agrees on the answer
B = Rahul_comprehension(A)  # the accepted answer
assert B == coldspeed_filter(A)
assert B == user2357112_comprehension(A)
assert B == cdlane_recursion(A)
assert B == PaulMcG_reduce(A)
assert B == list(PaulMcG_zip_longest(A))
assert B == Massimiliano_sorted(A)

print("Rahul_comprehension:", format(timeit.timeit('B = Rahul_comprehension(A)', number=10, globals=globals()), ".2"))
print("coldspeed_filter:", format(timeit.timeit('B = coldspeed_filter(A)', number=10, globals=globals()), ".2"))
print("user2357112_comprehension:", format(timeit.timeit('B = user2357112_comprehension(A)', number=10, globals=globals()), ".2"))
print("cdlane_recursion:", format(timeit.timeit('B = cdlane_recursion(A)', number=10, globals=globals()), ".2"))
print("PaulMcG_reduce:", format(timeit.timeit('B = PaulMcG_reduce(A)', number=10, globals=globals()), ".2"))
print("PaulMcG_zip_longest:", format(timeit.timeit('B = PaulMcG_zip_longest(A)', number=10, globals=globals()), ".2"))
print("Massimiliano_sorted:", format(timeit.timeit('B = Massimiliano_sorted(A)', number=10, globals=globals()), ".2"))

Sorted Typical Best Runs: (lowest is best)

coldspeed (filter):          0.002
user2357112 (comprehension): 0.0041
Rahul (comprehension):       0.0061
PaulMcG (zip_longest):       0.024
PaulMcG (reduce):            0.025
Massimiliano (sorted):       0.026
cdlane (recursion):          5.8

Are you going to fly away on your magical unicorn or are you going to explain how this works? :p — cs95, Aug 03 '17 at 06:32
This has the flaw of treating all falsy objects the same as None, just like cᴏʟᴅsᴘᴇᴇᴅ's original `filter`. — user2357112, Aug 03 '17 at 06:41
`not a` should be `a is None` to allow for falsey non-None values (like 0). — PaulMcG, Aug 03 '17 at 06:42
@PaulMcG, already detected and fixed while writing the requested explanation -- thanks! — cdlane, Aug 03 '17 at 06:45
Probably better to use the ternary `x if condition else y` form than your `[false_path, true_path][condition]` form. Otherwise you call `some` recursively to build both list values and then throw away the one you don't want. Since each call then further calls recursively, this ends up being a *lot* of function calls. Although this will also serve to deobfuscate your code a bit, which may have been part of your original intent. But very clever solution! — PaulMcG, Aug 03 '17 at 06:52
@PaulMcG, that double evaluation bothers me too so I've unwound it to a ternary conditional and updated my explanation. Efficiency takes all the mystery out of programming... — cdlane, Aug 03 '17 at 06:59

PaulMcG · Answer 5 · 2017-08-03T06:29:30.460

This solution uses itertools.zip_longest to pad the shorter of two sequences with Nones. The first sequence is the non-None items from the original list, the second is the full original. zip_longest will return tuples of items from each sequence, padding with None for the shorter of the two. Then we re-zip those tuples to get back sequences, and take the first sequence, representing the filtered sequence padded with as many Nones as needed to be as long as the original.

>>> from itertools import zip_longest
>>>
>>> seq = [None, None, 90, 10, None, 34, None, 108]
>>>
>>> non_nones = filter(lambda x: x is not None, seq)
>>>
>>> reordered = next(zip(*zip_longest(non_nones, seq)))
>>>
>>> print(reordered)
(90, 10, 34, 108, None, None, None, None)

PaulMcG · Answer 6 · 2017-08-03T07:51:27.603

@cdlane's solution made me think of using reduce:

>>> seq = [None, None, 90, 10, None, 34, None, 108]
>>>
>>> from functools import reduce
>>> reordered = sum(reduce(lambda a,b: (a[0], a[1]+(b,)) 
                                       if b is None else (a[0]+(b,), a[1]), 
                           seq, ((), ())), ())
>>> print(reordered)
(90, 10, 34, 108, None, None, None, None)

EDIT: I couldn't sleep thinking about all those tuple additions, this is better:

>>>> reordered = sum(reduce(lambda a,b: a[b is None].append(b) or a, seq, ([], [])), [])

Move all `None` to the end of a list (python3)

6 Answers6