Python: Sorted merge using list comprehension

Question

I'm trying to merge two sorted arrays into a sorted array. I know how to do that in general (so the question itself isn't about merging lists), but for learning purposes I want to write it with a list comprehension. Ideally, I'd write this:

i=0; j=0
[a[i++] if j >= len(b) or a[i] < b[j] else b[j++]
     for tmp in range (len(a)+len(b))]

(I know a has larger last element, so the above wouldn't go out of bound). However, python doesn't have an ++ operator. I'd try to write it myself:

def inc (x): x += 1; return x-1

But this doesn't work since python doesn't pass by reference. I'd use iterators if I could look at the top element without advancing it, but I can't.

So, is there a way to write it elegantly in one statement, instead of like this:

i=0; j=0
while i<len(a) or j<len(b):
    if j >= len(b) or a[i]<b[j]: res.append(a[i]); i += 1
    else res.append(b[j]); j += 1

Possible duplicate of [Combining two sorted lists in Python](http://stackoverflow.com/questions/464342/combining-two-sorted-lists-in-python) — Matthew, Mar 17 '16 at 06:27
As stated in the second answer to [this question](http://stackoverflow.com/questions/464342/combining-two-sorted-lists-in-python), use the merge function from the heapq module. This already does what you are trying to accomplish. That question has several other ways to do it as well. — Matthew, Mar 17 '16 at 06:29
Please don't write multi line statements on one line or use semi-colons in Python. Whoever has to read your code afterwards will thank you. — kylieCatt, Mar 17 '16 at 06:37
Have you tried `sorted(a + b)`. Unless your `a` or `b` are very large the fact that it is O(n log n) may not matter. eg for 1000000 items, `sorted` takes 82ms vs 329ms for `heapq.merge` on this computer. — John La Rooy, Mar 17 '16 at 06:39
Before you make it 'elegant' you should make it work. Your while loop is buggy - and poorly formatted. — hpaulj, Mar 17 '16 at 07:06

hpaulj · Answer 1 · 2016-03-17T07:57:23.700

This is what I'd call elegant:

a = [0,1,2]
b = [1,2,3]
i, j = 0, 0
result = []
for temp in range(len(a)+len(b)):
    if j>=len(b) or a[i]<b[j]:
        result.append(a[i])
        i += 1
    else:
        result.append(b[j])
        j += 1
    print(i,j,result)
print(result)

especially since it gives a diagnostic print

0011:~/mypy$ python stack36052998.py 
(1, 0, [0])
(1, 1, [0, 1])
(2, 1, [0, 1, 1])
(2, 2, [0, 1, 1, 2])
(3, 2, [0, 1, 1, 2, 2])
Traceback (most recent call last):
  File "stack36052998.py", line 31, in <module>
    if j>=len(b) or a[i]<b[j]:
IndexError: list index out of range

Trying to turn a list comprehension into something it was not designed for is not elegant.

List pop is an alternative to the ++ index incrementing:

def foo(a,b):
    result = []
    while a and b:
        x = a.pop(0) if a[0]<b[0] else b.pop(0)
        result.append(x)
    return result

a = [0,1,3,5,6]
b = [1.1,2.1,4.1,9.1]
print(foo(a[:],b[:]))

produces:

[0, 1, 1.1, 2.1, 3, 4.1, 5, 6]

It has the same problem I had before, that it does not collect trailing values from b.

I could also run it in a list comprehension, if I knew exactly how many iterations I needed.

result = [a.pop(0) if a[0]<b[0] else b.pop(0) for _ in range(8)]

Passing values by reference requires mutable values. Integers aren't mutable, lists are. Rather than use indexes disguised as lists, just use the lists themselves.

Correct loop:

def foo(a,b):
    result = []
    while a or b:
        if not a:
            result.extend(b)
            break
        if not b:
            result.extend(a)
            break
        x = a.pop(0) if a[0]<b[0] else b.pop(0)
        result.append(x)
    return result

Isn't it going to take square time with the list pop() method, though? — Todor Markov, Mar 17 '16 at 07:51
A thing I like about `pop`, and iteration over a list, is that I don't have to keep checking `len`. I just ask is the list empty. It's easier to write correct code. — hpaulj, Mar 17 '16 at 08:22
I may be able to replace the `pop` with an iterator. `iter(a)` and use `next`. — hpaulj, Mar 17 '16 at 08:25
You can reverse both lists and pop from the back. It'll take linear time. — Андрей Беньковский, Mar 17 '16 at 08:46
How about building the list in reverse (e.g. largest results first), and then flipping it before returning? `result = []; while a or b: result.append(a.pop() if a and (not b or a[0] > b[0]) else b.pop(); return result[::-1]`. This is still `O(N)`, whereas anything with `pop(0)` is going to be quadratic. — Blckknght, Mar 17 '16 at 08:46

Blckknght · Accepted Answer · 2016-03-17T20:53:18.757

I think the only even vaguely Pythonic way to make a list comprehension work for merging two lists is to write a custom iterator type that lets you peek on the next element that will be yielded. Here's a quick version I've thrown together:

class PeekIterator(object):
    def __init__(self, iterable, sentinel=None):
        self.iterator = iter(iterable)
        self.sentinel = sentinel
        self.next_val = self.sentinel

    def peek(self):
        if self.next_val is self.sentinel:
            try:
                self.next_val = next(self.iterator)
            except StopIteration:
                pass # the sentinel will be returned if there's nothing left in the iterator
        return self.next_val

    def __iter__(self):
        return self

    def __next__(self):
        if self.next_val is self.sentinel:
           return next(self.iterator) # StopIteration is deliberately allowed to bubble out!
        val = self.next_val
        self.next_val = self.sentinel
        return val

    next = __next__ # Python 2 compatibility

Now the list comprehension becomes:

a_peek = PeekIterator(a)
b_peek = PeekIterator(b)
merged = [next(a_peek) if a_peek.peek() is not None and
                          (b_peek.peek() is None or a_peek.peek() <= b_peek.peek())
          else next(b_peek)
          for _ in range(len(a) + len(b))]

If you know of an upper bound for values in your lists (e.g. float("inf") perhaps, for numbers), you can avoid the extra peek calls and short-circuiting logic by creating the iterators with the bound as their sentinel values:

upper_bound = float("inf") # any value that compares larger than all the values in both of your lists
peek_a = PeekIterator(a, upper_bound)
peek_b = PeekIterator(b, upper_bound)
merged = [next(peek_a) if peek_a.peek() <= peek_b.peek() else next(peek_b)
          for _ in range(len(a)+len(b))]

I think a non-comprehension version is going to be easier to understand.

I explored the iterator route, and hit this `peek` issue. Without this extra method I had to maintain a separate `current value` variable. — hpaulj, Mar 17 '16 at 20:50

score 0 · Answer 3 · edited May 23 '17 at 11:50

0

Check this answer Python passing an integer by reference

[a[i++] if j >= len(b) or a[i] < b[j] else b[j++]

Assuming i and j are one dimension list with one element, as in the link above.

def increment(var):
    var[0] += 1
    return var[0]

with the above function the code can be transformed to

[a[increment(i)] if j >= len(b) or a[i] < b[j] else b[increment(j)]

UPDATE

>>> def increment(number):
...    number[0] += 1
...    return number[0]
...
>>> num = [0]
>>> print [increment(num) for dummy in range(10)]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

edited May 23 '17 at 11:50

Community

1
1

answered Mar 17 '16 at 06:47

saikumarm

1,565
1
15
30

You also need to change how `i` and `j` are initialized, and how they're used by the other parts of the list comprehension code. `a[i]` doesn't work if `i` is a one-element list. – Blckknght Mar 17 '16 at 08:58

score 0 · Answer 4 · answered May 19 '20 at 06:33

I know this is a quite old question and you've probably moved on with your life;) But I came across the same problem and solved it with list comprehension + closures.

def merge(left, right):
    merge.lhs=0
    merge.rhs = 0
    def combine(left_side):
        if left_side:
            el = left[merge.lhs]
            merge.lhs += 1
        else:
            el = right[merge.rhs]
            merge.rhs += 1
        return el



    result =  [combine(True) if merge.rhs >= len(right) or (merge.lhs < len(left) and left[merge.lhs] < right[merge.rhs]) else combine(False) for i in range(len(left) + len(right))]

    return result

l = [1,3,5]
r = [2,4]
merge(l, r)

You could say that it's quite ugly, i agree, however, I'm not sure if, with LC, it actually is possible to merge two sorted lists otherwise!

score -2 · Answer 5 · answered Mar 17 '16 at 06:23

-2

i+1

j+1

If all you need to do is increment and that's the only issue you are having

answered Mar 17 '16 at 06:23

coding_carebear

308
3
9

This does not answer the question. – Matthew Mar 17 '16 at 06:31

Python: Sorted merge using list comprehension

5 Answers5