2

I have a non-tail recursion in place( ). I have two "correct" answers, one of which I don't think is a correct answer, but I also noticed in my admittedly copious debugging print statements that the program keeps calculating as it exits the recursion, despite having found the answer. Is there a way to interrupt this or are these two parts of the same problem?

 def partialDigest(l):
    width = max(l)
    l.remove(width)
    x = [0, width]
    place(l, x, width, level=0)

def place(l, x, width, level):
    level = level + 1
    print('level',level)
    if not l:
        print('done!', x)
        return
    y = max(l)
    print('y',y)
    print('a', [abs(i-y) for i in x], l)
    if subset([abs(i-y) for i in x], l):
        print('choose a')
        print('x',sorted(x))
        print('l',sorted(l))
        print('append y to x and remove delta(x) from l')
        remove_lengths([abs(i-y) for i in x], l)
        x.append(y)
        print('x',sorted(x))
        print('l',sorted(l))
        print ('run place')
        place(l, x, width, level)
        print('remove y from x and add lengths to l')
        add_lengths([abs(i-y) for i in x], l)
        x.remove(y)
    print('b', [abs(i-(width-y)) for i in x], l)
    if subset([abs(i-(width-y)) for i in x], l):
        print('choose b')
        print('x',sorted(x))
        print('l',sorted(l))
        print('append (width - y) to x and remove lengths from l')
        remove_lengths([abs(i-(width - y)) for i in x], l)
        x.append(width - y)
        print('x',sorted(x))
        print('l',sorted(l))
        print('run place')
        place(l, x, width, level)
        print('remove (width - y) from x and add lengths to l')
        add_lengths([abs(i-(width-y)) for i in x], l)
        x.remove(width - y)
    else:
        print('c')
    print x, l
    print('return to level', level)
    return x, l

def remove_lengths(x, l):
    for i in x:
        if i in l:
            l.remove(i)
    return l

def add_lengths(x, l):
    for i in x:
        if i != 0:
            l.append(i)
    return l

def subset(list1, list2):
    x = True
    for i in list1:
        if i not in list2:
            x = False
    return x

l = [2, 2, 3, 3, 4, 5, 6, 7, 8, 10]
print(partialDigest(l))

For those interested, this is an attempt to implement Steven Skiena's solution to the partial digest problem for restriction fragment mapping. Antonio Perez has a generally better, conceptually similar solution over on GitHub, but his implementation seems like it would have the same issue.

Niels
  • 1,513
  • 1
  • 14
  • 21
  • This is irrelevant bikeshedding, but: – Danver Braganza Jun 27 '15 at 23:39
  • Nothing is irrelevant! I'm entirely self-taught. Tell me. Please! – Niels Jun 27 '15 at 23:44
  • 2
    Your method subset(l1, l2) could have its runtime improved a lot if you used a set for lookup instead of a list. Whereas looking up if something is a member in a list is an O(n) operation, a similar lookup in a set is just a O(1) operation. So you could convert l2 into a set first, with set2 = set(list2), and then continue the rest of the algorithm as is, but using set2 istead of list2. – Danver Braganza Jun 27 '15 at 23:50
  • 1
    The same holds true for remove_lengths, where you're testing for membership in l. – Danver Braganza Jun 27 '15 at 23:51
  • Thanks! I initially wrote this up with sets, but l is a multiset: it can have the same value multiple times, and stepping through is supposed to delete by index, as I understand it. Working from Jones and Pevzner's Intro to Bioinformatics, but it does seem this multiset concept is a real thing. It looks like python supports a thing called a "counter" that takes the place of multisets? – Niels Jun 27 '15 at 23:59
  • 2
    Yeah, that could work. I have a feeling you could implement your add_lengths and remove_lengths in terms of update() and subtract(). Here's the documentation for Counter, as I'm sure you're aware: https://docs.python.org/2/library/collections.html#collections.Counter Also `i not in counter2` should be fast now. The only thing you need to be careful for is if i does exist in counter2, but the count is 0, `i in counter2` is going to be `True`. You can test this elegantly with `not counter2.get(i, 0)`, which returns 0 if `i` is missing, and then `not 0` evaluates to True. – Danver Braganza Jun 28 '15 at 00:14

2 Answers2

4

The part of your code which is presumably the recursion termination

if not l:
    print('done!', x)
    return

is doing pretty much the same as returning None, whereas it returns non-None stuff in other cases.

So, when you do the recursive calls, just check the return value for this, and return without continuing the redundant calculations if this is the case. That is, instead of

foo(...) # I'm calling foo
# and continuing unconditionally,

use

if foo(...) is None: # If someone below me reached the termination
    return # Then I will propagate the termination having been reached, and won't continue
Community
  • 1
  • 1
Ami Tavory
  • 74,578
  • 11
  • 141
  • 185
  • Thanks, though I'm not sure I understand. Wouldn't that recompute the previous child? – Niels Jun 27 '15 at 23:49
  • 2
    @Niels See clarification. My suggestion isn't *in addition* to the function call line you have, but *instead of it*. – Ami Tavory Jun 27 '15 at 23:51
  • I see, thanks! Now I have to figure out why it was hitting "done" twice, and now the only "done" it hits is, regrettably, the wrong "done". – Niels Jun 28 '15 at 00:00
1

Here's some bad software engineering, but good-enough bio-informatics. It's worse that Ami's answer, but might be easier for you to implement.

You could raise an Exception to signal that computation is completed. At the start, before your function, you write:

class Done(Exception):
    """Completed recursion and found answer."""
    # No body required

So as soon as you know you have the answer, you go

if Answer found:
    #  print it
    raise Done

Then when you call the whole function, it will stop all of the levels of recursion as soon as the first answer is found. It will ALSO quit your whole program, unless you call the function with a try block,

try:
    place(l, x, width, level=0)
except Done:
    # We expected this, nothing to do
    pass

Once again, this is bad software engineering: purists might get angry with you! However, it's easy get right in more complicated recursions without having to plumb all the exit cases correctly.

Danver Braganza
  • 1,295
  • 10
  • 10
  • Thanks! I have to admit, I have yet to write my own classes, so I'll give it a go. I'm working on it. On a vaguely related note, for my test case (l=[2,2,3,3,4,5,6,7,8,10]), the book says I should get [0, 2, 4, 7, 10] as the answer, but I'm getting [0, 3, 6, 8, 10], which, on further reflection, is probably equivalent (the internal distances are the same). Is there a way to capture all possible answers and still break out without letting the final recursion trickle away? I suppose that would come down to exit tests that are for me to think of and write? – Niels Jun 28 '15 at 00:14
  • 1
    Well, if you knew that there were only going to be 3 answers (for instance), you could break on the last one. For some problems, determining how many answers there are going to be is computationally as hard as actually solving for them, though, so you shouldn't feel too worried that the algorithm carries on doing extra work. If you really need more than one solution, I'd suggest just continuing until all the solutions are printed out. If you assume that solutions are randomly distributed in the space, on average you're doing a constant factor of extra work, which isn't terrible. – Danver Braganza Jun 28 '15 at 00:22