5

I have a list like below:

a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]

and I want to push all zeroes to the beginning of that list. The result must be like below.

a = [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

How to do it in Python 2?

vaultah
  • 44,105
  • 12
  • 114
  • 143
namco
  • 6,208
  • 20
  • 59
  • 83

4 Answers4

16

You could sort the list:

a.sort(key=lambda v: v != 0)

The key function tells Python to sort values by wether or not they are 0. False is sorted before True, and values are then sorted based on their original relative position.

For 0, False is returned, sorting all those values first. For the rest True is returned, leaving sort to put them last but leave their relative positions untouched.

Demo:

>>> a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]
>>> a.sort(key=lambda v: v != 0)
>>> a
[0, 0, 0, 0, 4, 5, 6, 7, 1, 5]
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • 2
    Is there any guarantee that the sort is stable? The language docs don't mention it as far as I can tell. – viraptor May 24 '14 at 12:11
  • @viraptor: yes, the sort is guaranteed to be stable. – Martijn Pieters May 24 '14 at 12:14
  • 1
    @viraptor: see [Is python's sorted() function guaranteed to be stable?](http://stackoverflow.com/q/1915376) – Martijn Pieters May 24 '14 at 12:15
  • I like the simplicity of this approach, but for larger lists keep in mind that sorting algorithms usually have computational time complexity __n ln n__. Moving all zeros to the end of list should be possible in linear time (__n__) - see other answers. – Messa May 24 '14 at 12:17
  • @Messa: note that that solution loops through the list *twice* though. Once for the `a.count(0)` call, a second for the filtered list comprehension. The cost of the list multiplication and the concatenations also need to be factored in. In *practice*, I expect the sort to win. – Martijn Pieters May 24 '14 at 12:21
11

This can be done without sorting.

Solutions

Initialization:

In [8]: a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]

In [9]: from itertools import compress, repeat, chain

list.count and itertools.compress

In [10]: x = [0] * a.count(0); x.extend(compress(a, a))

In [11]: x
Out[11]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

Same as before, but without list.count

In [12]: c = list(compress(a, a)); [0] * (len(a) - len(c)) + c
Out[12]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

list.count, itertools.compress, itertools.repeat, itertools.chain

In [13]: list(chain(repeat(0, a.count(0)), compress(a, a)))
Out[13]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

Same as the previous one, but without list.count

In [14]: c = list(compress(a, a)); list(chain(repeat(0, len(a) - len(c)), c))
Out[14]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

Benchmarks

For small lists:

In [21]: %timeit x = [0] * a.count(0); x.extend(compress(a, a))
1000000 loops, best of 3: 583 ns per loop

In [22]: %timeit c = list(compress(a, a)); [0] * (len(a) - len(c)) + c
1000000 loops, best of 3: 661 ns per loop

In [23]: %timeit list(chain(repeat(0, a.count(0)), compress(a, a)))
1000000 loops, best of 3: 762 ns per loop

In [24]: %timeit c = list(compress(a, a)); list(chain(repeat(0, len(a) - len(c)), c))
1000000 loops, best of 3: 900 ns per loop

For large lists:

In [28]: a *= 10000000

In [29]: %timeit x = [0] * a.count(0); x.extend(compress(a, a))
1 loops, best of 3: 1.43 s per loop

In [30]: %timeit c = list(compress(a, a)); [0] * (len(a) - len(c)) + c
1 loops, best of 3: 1.37 s per loop

In [31]: %timeit list(chain(repeat(0, a.count(0)), compress(a, a)))
1 loops, best of 3: 1.79 s per loop

In [32]: %timeit c = list(compress(a, a)); list(chain(repeat(0, len(a) - len(c)), c))
1 loops, best of 3: 1.47 s per loop

As you can see, in some cases itertools-based solutions tend to be slower, because of the big number of function calls.

vaultah
  • 44,105
  • 12
  • 114
  • 143
  • 1
    What does it do when the list contains `None` or `False`? :) – Messa May 24 '14 at 12:21
  • 2
    `None` and `False` are falsy, so all of them will be replaced with zeros. But the original list contains only numbers, so I don't think this a problem. – vaultah May 24 '14 at 12:42
  • 2
    Another option would be `x = [0] * a.count(0); x += compress(a, a)`, which happens to be faster in all cases on my machine. That said, your benchmark isn't really fair since `sorted(a, ...)` is slower than `a.sort(...)` due to the additional copy of the list, and you would use `key=bool` instead of `key=lambda x: x != 0` if you were trying to optimise for speed. – Sven Marnach May 30 '14 at 10:51
1

Here are some better timings, with two new methods:

SETUP="
from itertools import compress, repeat, chain
a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]
"

First the sorting:

python -m timeit -s "$SETUP" "a.sort(key=bool)"
# 1000000 loops, best of 3: 1.51 usec per loop

Then frostnational's methods:

python -m timeit -s "$SETUP" "list(chain(repeat(0, a.count(0)), compress(a, a)))"
# 1000000 loops, best of 3: 1.16 usec per loop

python -m timeit -s "$SETUP" "cs = list(compress(a, a)); list(chain(repeat(0, len(a)-len(cs)), cs))"
# 1000000 loops, best of 3: 1.37 usec per loop

Then methods more directly working from lists:

python -m timeit -s "$SETUP" "[0] * a.count(0) + list(filter(bool, a))"
# 1000000 loops, best of 3: 1.04 usec per loop

python -m timeit -s "$SETUP" "nonzero = list(filter(bool, a)); [0] * (len(a)-len(nonzero)) + nonzero"
# 1000000 loops, best of 3: 0.87 usec per loop

And again with larger sized input:

SETUP="
from itertools import compress, repeat, chain
a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5] * 1000
"

Sorting:

python -m timeit -s "$SETUP" "a.sort(key=bool)"
# 1000 loops, best of 3: 1.08 msec per loop

frostnational's:

python -m timeit -s "$SETUP" "list(chain(repeat(0, a.count(0)), compress(a, a)))"
# 1000 loops, best of 3: 333 usec per loop

python -m timeit -s "$SETUP" "cs = list(compress(a, a)); list(chain(repeat(0, len(a)-len(cs)), cs))"
# 1000 loops, best of 3: 206 usec per loop

New:

python -m timeit -s "$SETUP" "[0] * a.count(0) + list(filter(bool, a))"
# 1000 loops, best of 3: 295 usec per loop

python -m timeit -s "$SETUP" "nonzero = list(filter(bool, a)); [0] * (len(a)-len(nonzero)) + nonzero"
# 10000 loops, best of 3: 143 usec per loop

That said, despite being relatively slow, Martijn Pieters' decision to use sorting is actually pretty competitive for reasonably-sized lists and premature optimisation is the root of all evil.


FWIW, here are some timings for very long lists:

SETUP="
from itertools import compress, repeat, chain
a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5] * 1000000
"
python -m timeit -s "$SETUP" "a.sort(key=bool)"
# 10 loops, best of 3: 1.21 sec per loop

python -m timeit -s "$SETUP" "list(chain(repeat(0, a.count(0)), compress(a, a)))"
# 10 loops, best of 3: 347 msec per loop

python -m timeit -s "$SETUP" "cs = list(compress(a, a)); list(chain(repeat(0, len(a)-len(cs)), cs))"
# 10 loops, best of 3: 226 msec per loop

python -m timeit -s "$SETUP" "[0] * a.count(0) + list(filter(bool, a))"
# 10 loops, best of 3: 310 msec per loop

python -m timeit -s "$SETUP" "nonzero = list(filter(bool, a)); [0] * (len(a)-len(nonzero)) + nonzero"
# 10 loops, best of 3: 153 msec per loop
Veedrac
  • 58,273
  • 15
  • 112
  • 169
0

Try this simple process:-

from collections import deque
a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]
b = deque([x for x in a if x!=0])
for i in a:
    if i==0:
        b.appendleft(0)
print list(b)
#output -> [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]
Narendra
  • 1,511
  • 1
  • 10
  • 20