0

I'd like to compare two sorted lists element wise and handle each case differently:

  1. If both iterables contain the an element, I'd like to call update_func.
  2. If only the left iterable contains an element, I'd like to call left_surplus_func.
  3. If only the right iterable contains an element, I'd like to call right_surplus_func.

Unfortunately, zip doesn't help me in this case, because it would create tuples of unrelated elements. Also, in my case, both lists or iterables are of different and nonconvertible types.

This is similar to How can I compare two lists in python and return matches and Checking if any elements in one list are in another, but not close enough to be a real solution.

I came up with a working solution (except both iterables must not contain None):

def compare_iterables_elemet_wise(left, right, compare_func,
                          left_surplus_func, update_func, right_surplus_func):
    """
    :type left: collections.Iterable[U]
    :type right: collections.Iterable[T]
    :type compare_func: (U, T) -> int
    :type left_surplus_func: (U) -> None
    :type update_func: (U, T) -> None
    :type right_surplus_func: (T) -> None
    """
    while True:
        try:
            l = next(left)
        except StopIteration:
            l = None  # Evil hack, but acceptable for this example
        try:
            r = next(right)
        except StopIteration:
            r = None

        if l is None and r is not None:
            cmp_res = 1
        elif l is not None and r is None:
            cmp_res = -1
        elif l is None and r is None:
            return
        else:
            cmp_res = compare_func(l, r)

        if cmp_res == 0:
            update_func(l, r)
        elif cmp_res < 0:
            left_surplus_func(l)
            right = itertools.chain([r], right)  # aka right.unget(r)
        else:
            right_surplus_func(r)
            left = itertools.chain([l], left)  # aka left.unget(l)

Is there a more pythonic way to archive a similar result? I'm a little bit unhappy with my solution, as it depends on external side-effects of the functions. It would be nice to have a pure functional solution.

Edit: This is my test case:

creates = []
updates = []
deletes = []

def compare(a, obj):
    return cmp(int(a), obj)

def handle_left(a):
    creates.append(a)

def update(a, obj):
    updates.append((a, obj))

def handle_right(obj):
    deletes.append(obj)

left = list('12356')
right = [1, 3, 4, 6, 7]
compare_iterables_elemet_wise(iter(left), iter(right), compare, handle_left, update, handle_right)

assert creates == ['2', '5']
assert updates == [('1', 1), ('3', 3), ('6', 6)]
assert deletes == [4, 7]

I'd guess that I just need these three lists: creates, updates and deletes.

Edit2: set operations:

This is similar to my problem, except that the types of left and right differ:

left = [1, 2, 3, 5, 6]

right = [1, 3, 4, 6, 7]

In [10]: set(left) - set(right)
Out[10]: {2, 5}

In [11]: set(right) - set(left)
Out[11]: {4, 7}


In [14]: set(right).intersection(set(left))
Out[14]: {1, 3, 6}
Community
  • 1
  • 1
Sebastian Wagner
  • 2,308
  • 2
  • 25
  • 32

1 Answers1

1

You can simplify the try statements at the begining of your while loop by using the default argument of the next built-in function.

def compare_iterables_element_wise(left, right, compare_func,
                      left_surplus_func, update_func, right_surplus_func):

    l, r = next(left, None), next(right, None)
    while l or r:
        if l is None and r is not None:
            cmp_res = 1
        elif l is not None and r is None:
            cmp_res = -1
        else:
            cmp_res = compare_func(l, r)

        if cmp_res == 0:
            update_func(l, r)
            l, r = next(left, None), next(right, None)
        elif cmp_res < 0:
            left_surplus_func(l)
            l = next(left, None)
        else:
            right_surplus_func(r)
            r = next(right, None)

I would also suggest that compare_func handles the case where one of the argument is None so that you get rid of the first if.

From your second edit, the use of sets seems to be a good idea, providing you can cast your objects into a common type. You could have a __hash__ method to associate an integer with each object and do your operations based on the hash value.

Jacques Gaudin
  • 15,779
  • 10
  • 54
  • 75