2

Given a list like

[None, None, 90, 10, None, 34, None, 108]

What's an easy way to move all Nones to the end without disturbing the order of the other elements? So, you'd get something like

[90, 10, 34, 108, None, None, None, None]

One possible solution is using filter with None and then just appending Nones at the end:

In [890]: l2 = list(filter(None, l))

In [891]: l2 += [None] * (len(l) - len(l2))

In [892]: l2
Out[892]: [90, 10, 34, 108, None, None, None, None]

However,

filter(None, ...) doesn't mean "filter out Nones"; None is special-cased to the identity predicate, so this will also filter out 0, False, and anything else that returns False from __bool__.

So, this isn't a good solution. Is there anything better?

cs95
  • 379,657
  • 97
  • 704
  • 746
  • 1
    You should be more than able to do the translation of “moving X to the beginning” to “moving Y to the end”. Also, please do not use your gold badge to reopen your own question without other’s support. – poke Aug 03 '17 at 08:05
  • @poke It's not even the same Python version. – cs95 Aug 03 '17 at 08:06
  • 1
    I don't see anything in this question or the proposed solutions that is inextricably bound to the version of Python. I'm with @poke. – PaulMcG Aug 03 '17 at 08:08
  • And that matters why? There is nothing version specific about this at all. Some of the answers here even assume Python 2. – poke Aug 03 '17 at 08:08

6 Answers6

8

This will preserve the sorting of the list:

>>> l = [None, None, 90, 10, None, 34, None, 108]
>>> sorted(l, key = lambda x: x is None)
[90, 10, 34, 108, None, None, None, None]

because, quoting python docs:

The built-in sorted() function is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

Massimiliano
  • 7,842
  • 2
  • 47
  • 62
  • Plus1. Nice. I initially tried this but could't solve it. – Rahul Aug 03 '17 at 05:54
  • _"This should preserve the sorting"_ This _will_ preserve the order of the list elements. From [the docs](https://docs.python.org/3.5/library/functions.html#sorted): "The built-in sorted() function is guaranteed to be stable." – Aran-Fey Aug 03 '17 at 07:25
  • @Rawing Thanks, updated the post – Massimiliano Aug 03 '17 at 07:29
  • 1
    Note that this solution is still O(n log n) though since sorting doesn’t get cheaper with a simpler key function. – poke Aug 03 '17 at 07:30
5

I don't know Big O but this:

l2 = [x for x in l if x is not None]
l2 += [None] * l.count(None)

just works, and

l2.extend([None] * l.count(None))

is loved by many pythonists and

l2 += [None] * (len(l) - len(l2))

is better.

Rahul
  • 10,830
  • 4
  • 53
  • 88
  • I would recommend changing the last line into something like `l2.extend(None for x in range(none_count)`. The generator expression would be faster and less memory intensive since you don't have to allocate a separate array. – Zizouz212 Aug 03 '17 at 05:31
  • @Zizouz212 :Done it. – Rahul Aug 03 '17 at 05:32
  • `l = [x for x in l if x is not None] + [x for x in l if x is None]` – DragonBobZ Aug 03 '17 at 05:32
  • 1
    @Zizouz212: Generator expressions usually aren't faster, actually. Their benefits are memory consumption and the ability to stop them early. – user2357112 Aug 03 '17 at 05:33
  • @user2357112 I think it just depends on the use case. Most of my cases have always had them faster by a few seconds, but I think that has to do with allocation rather than the actual time to execute. – Zizouz212 Aug 03 '17 at 05:36
  • I preferred `l2.extend([None] * l.count(None))` to the current solution actually. But this is quite nice, (better than the sorted version). – cs95 Aug 03 '17 at 05:47
  • This works, although it does two linear scans over `l` twice; that also applies to DragonBobZ's suggestion. Using a traditional `for` loop you can do it in a single scan. – PM 2Ring Aug 03 '17 at 06:02
  • Original idea iterates twice (once for filtering the list, once for counting the `None`s). Last one only iterates once and `len(list)` is constant time, so this is very efficient. Time complexity is O(n) for both though (at least ignoring the list operations). – poke Aug 03 '17 at 07:28
5

I'd build a list without Nones and pad it:

l2 = [x for x in l if x is not None]
l2 += [None] * (len(l) - len(l2))

Alternatively, there's a one-liner. I think it's theoretically O(nlogn), but I haven't observed that in practice - I think Timsort's galloping mode reduces the comparisons involved to something closer to but not quite O(n), and the data movement costs have too low a constant factor to dominate the runtime even if they're O(nlogn). In practice, the bigger disadvantage is probably that you have to think about how booleans are ordered:

sorted(l, key=lambda x: x is None)
user2357112
  • 260,549
  • 28
  • 431
  • 505
3

A solution to keep you up at night:

>>> example = [None, None, 90, 10, None, 34, None, 108]
>>> 
>>> def some(a, *b):
...     return ([*some(*b), a] if a is None else [a, *some(*b)]) if b else [a]
... 
>>> print(some(*example))
[90, 10, 34, 108, None, None, None, None]

Explanation

We pass the the data into our function using *example so that the elements of the list become arguments. We then segment these arguments at function call using (a, *b) to place the first argument in a and the rest in b. Recursively, if a is None then the answer is:

[*some(*b), a]

which moves a to the end. Otherwise, the answer is:

[a, *some(*b)]

which keeps a in front and processes the rest of the list.

Finally, we have the if b else [a] bit which is just the base case of our recursion.

Bake-Off

We've enough solutions that it's time for a bake-off. (Let me know if I've misrepresented your solution.) We'll be summing up ten timings of a 10,000 element list in which 30% of the elements are None. I've eliminated zeros from the data so that everyone can play:

import sys
import timeit
import random

A = [random.randint(1, 1000) if random.random() > 0.3 else None for _ in range(10000)]  # ~30% None

def Rahul_comprehension(A):

    B = [x for x in A if x is not None]
    B += [None] * A.count(None)

    return B

def coldspeed_filter(A):

    B = list(filter(None, A))
    B += [None] * (len(A) - len(B))

    return B

def user2357112_comprehension(A):

    B = [x for x in A if x is not None]
    B += [None] * (len(A) - len(B))

    return B

sys.setrecursionlimit(100000)
def cdlane_recursion(A):

    def some(a, *b):
        return ([*some(*b), a] if a is None else [a, *some(*b)]) if b else [a]

    return some(*A)

from functools import reduce
def PaulMcG_reduce(A):

    return sum(reduce(lambda a,b: a[b is None].append(b) or a, A, ([], [])), [])

from itertools import zip_longest
def PaulMcG_zip_longest(A):

    non_nones = filter(lambda x: x is not None, A)
    return next(zip(*zip_longest(non_nones, A)))

def Massimiliano_sorted(A):

    return sorted(A, key=lambda x: x is None)

# sanity check to make sure everyone agrees on the answer
B = Rahul_comprehension(A)  # the accepted answer
assert B == coldspeed_filter(A)
assert B == user2357112_comprehension(A)
assert B == cdlane_recursion(A)
assert B == PaulMcG_reduce(A)
assert B == list(PaulMcG_zip_longest(A))
assert B == Massimiliano_sorted(A)

print("Rahul_comprehension:", format(timeit.timeit('B = Rahul_comprehension(A)', number=10, globals=globals()), ".2"))
print("coldspeed_filter:", format(timeit.timeit('B = coldspeed_filter(A)', number=10, globals=globals()), ".2"))
print("user2357112_comprehension:", format(timeit.timeit('B = user2357112_comprehension(A)', number=10, globals=globals()), ".2"))
print("cdlane_recursion:", format(timeit.timeit('B = cdlane_recursion(A)', number=10, globals=globals()), ".2"))
print("PaulMcG_reduce:", format(timeit.timeit('B = PaulMcG_reduce(A)', number=10, globals=globals()), ".2"))
print("PaulMcG_zip_longest:", format(timeit.timeit('B = PaulMcG_zip_longest(A)', number=10, globals=globals()), ".2"))
print("Massimiliano_sorted:", format(timeit.timeit('B = Massimiliano_sorted(A)', number=10, globals=globals()), ".2"))

Sorted Typical Best Runs: (lowest is best)

coldspeed (filter):          0.002
user2357112 (comprehension): 0.0041
Rahul (comprehension):       0.0061
PaulMcG (zip_longest):       0.024
PaulMcG (reduce):            0.025
Massimiliano (sorted):       0.026
cdlane (recursion):          5.8
cdlane
  • 40,441
  • 5
  • 32
  • 81
  • Are you going to fly away on your magical unicorn or are you going to explain how this works? :p – cs95 Aug 03 '17 at 06:32
  • This has the flaw of treating all falsy objects the same as None, just like cᴏʟᴅsᴘᴇᴇᴅ's original `filter`. – user2357112 Aug 03 '17 at 06:41
  • `not a` should be `a is None` to allow for falsey non-None values (like 0). – PaulMcG Aug 03 '17 at 06:42
  • @PaulMcG, already detected and fixed while writing the requested explanation -- thanks! – cdlane Aug 03 '17 at 06:45
  • Probably better to use the ternary `x if condition else y` form than your `[false_path, true_path][condition]` form. Otherwise you call `some` recursively to build both list values and then throw away the one you don't want. Since each call then further calls recursively, this ends up being a *lot* of function calls. Although this will also serve to deobfuscate your code a bit, which may have been part of your original intent. But very clever solution! – PaulMcG Aug 03 '17 at 06:52
  • @PaulMcG, that double evaluation bothers me too so I've unwound it to a ternary conditional and updated my explanation. Efficiency takes all the mystery out of programming... – cdlane Aug 03 '17 at 06:59
  • This looks extremely slow with all the list creation – user2341726 Aug 03 '17 at 09:43
1

This solution uses itertools.zip_longest to pad the shorter of two sequences with Nones. The first sequence is the non-None items from the original list, the second is the full original. zip_longest will return tuples of items from each sequence, padding with None for the shorter of the two. Then we re-zip those tuples to get back sequences, and take the first sequence, representing the filtered sequence padded with as many Nones as needed to be as long as the original.

>>> from itertools import zip_longest
>>>
>>> seq = [None, None, 90, 10, None, 34, None, 108]
>>>
>>> non_nones = filter(lambda x: x is not None, seq)
>>>
>>> reordered = next(zip(*zip_longest(non_nones, seq)))
>>>
>>> print(reordered)
(90, 10, 34, 108, None, None, None, None)
PaulMcG
  • 62,419
  • 16
  • 94
  • 130
1

@cdlane's solution made me think of using reduce:

>>> seq = [None, None, 90, 10, None, 34, None, 108]
>>>
>>> from functools import reduce
>>> reordered = sum(reduce(lambda a,b: (a[0], a[1]+(b,)) 
                                       if b is None else (a[0]+(b,), a[1]), 
                           seq, ((), ())), ())
>>> print(reordered)
(90, 10, 34, 108, None, None, None, None)

EDIT: I couldn't sleep thinking about all those tuple additions, this is better:

>>>> reordered = sum(reduce(lambda a,b: a[b is None].append(b) or a, seq, ([], [])), [])
PaulMcG
  • 62,419
  • 16
  • 94
  • 130