Non-termination of common reverse/2 implementation, and better solutions?

Question

The following is a standard textbook definition of reverse(X,Y) which is true if the list Y is the reverse of the list X. The code is often used to introduce or illustrate the use of an accumulator.

% recursive definition
step([], L2, L2).
step([H1|T1], X, L2) :- step(T1, X, [H1|L2]).

% convenience predicate around step/3
reverse(X, Y) :- step(X, Y, []).

The following query works as expected.

?- reverse([1,2,3], Y).

Y = [3,2,1]

But the following query runs out of stack when I ask for more solutions after the first one.

?- reverse(X, [1,2,3]).

X = [3,2,1]
Stack limit (0.2Gb) exceeded
Stack sizes: local: 3Kb, global: 0.2Gb, trail: 0Kb
Stack depth: 4,463,497, last-call: 100%, Choice points: 12 
...

Questions:

1. What is the choice point Prolog is going back to?
2. Is this called non-termination? I am not familiar with Prolog terminology.
3. Is there a better way to define reverse(X,Y) such that it is reversible, in the sense that both of the above queries work and terminate?
4. I have found that using a cut, `step([], L2, L2) :- !.`, appears to work, but this feels like wading into procedural programming and drifting far away from declarative logic programming. Is this a fair judgement?

false · Answer 1 · 2022-10-07T05:02:45.090

1mo, frankly I do not know what kind of choicepoint is responsible. This is a notion far too low level to be of direct relevance. And there are better techniques to understand the problem, in particular failure slices.

2do, the problem here is called (universal) non-termination. But note how you found it: you got an answer, and only when you demanded the next answer did Prolog loop. This can be even worse, like looping only after the n-th answer. The easiest way to spot all kinds of non-termination is to just add false to the query. If G_0 terminates universally, then G_0, false also terminates (and fails).
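The "append false to the query" check can be tried mechanically. The following is a minimal sketch, assuming SWI-Prolog and its `call_with_inference_limit/3`. The names `naive_step/3`, `naive_reverse/2` and `exceeds_limit/2` are hypothetical; the first two are the question's `step/3` and `reverse/2`, renamed only to avoid clashing with the library `reverse/2`.

```prolog
% The question's program, renamed (hypothetical names).
naive_step([], L2, L2).
naive_step([H1|T1], X, L2) :- naive_step(T1, X, [H1|L2]).

naive_reverse(X, Y) :- naive_step(X, Y, []).

% exceeds_limit(+Goal, +Limit)
% True if running (Goal, false) is cut off after Limit inferences.
% If (Goal, false) terminates, it fails, and so does this predicate.
exceeds_limit(Goal, Limit) :-
    call_with_inference_limit((Goal, false), Limit, Result),
    Result == inference_limit_exceeded.

% Example queries (not run at load time):
%   ?- exceeds_limit(naive_reverse(_, [1,2,3]), 100000).
%   ?- exceeds_limit(naive_reverse([1,2,3], _), 100000).
```

Being cut off at some limit is only a hint of non-termination, not a proof; a goal that fails within the limit, however, has been shown to terminate universally.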

3tio, yes there is. But first, try to understand why your original program looped. The best is to add some falsework into your program. By adding goals false we obtain a failure-slice. And if we find such a slice that already does not terminate then also the original program does not terminate. (No further analysis required!¹) Here is the one of relevance:

step([], L2, L2) :- false.
step([H1|T1], X, L2) :- step(T1, X, [H1|L2]), false.

reverse(X, Y) :- step(X, Y, []), false.

?- reverse(X, [1,2,3]), false.
   loops.

So we need to understand only that visible part! As promised, there is now not a single choicepoint present.

Just look at the head of step/3! There, only the first argument insists on some specific term, but the second and third do not insist on anything. Therefore the second and third argument cannot influence termination. They are termination neutral. And thus, only the first argument of reverse/2 will influence termination.

To fix this, we need to somehow get the second argument of reverse/2 into a relevant position in step. The simplest way is to add another argument. And, while we are at it, we may realize that both arguments of reverse/2 are of the same length, thus:

step([], L2, L2, []).
step([H1|T1], X, L2, [_|Y]) :- step(T1, X, [H1|L2], Y).

reverse(X, Y) :- step(X, Y, [], Y).

?- reverse(X, [1,2,3]), false.
   false.
?- reverse([1,2,3], Y), false.
   false.
?- reverse(X,Y).
   X = [], Y = []
;  X = [_A], Y = [_A]
;  X = [_A,_B], Y = [_B,_A]
;  X = [_A,_B,_C], Y = [_C,_B,_A]
;  ... .

4to, don't believe the tale of the green cut! They are so rare. Most good cuts are placed together with a guard that ensures that the cut is safe. See how your cut wreaked havoc:

?- X = [a], reverse(X,Y).
   X = "a", Y = "a".   % nice
?-          reverse(X,Y), X = [a].
   false, unexpected.
?- reverse(L,[]).
   L = [].
?- L = [_|_], reverse(L,[]).
   loops, unexpected.

So sometimes the program will fail incorrectly, and the looping is still present. Hardly an improvement.
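To reproduce the queries above, here is the cut variant from the question written out in full. This is a sketch; `cut_step/3` and `cut_reverse/2` are hypothetical names chosen so the definitions do not clash with a library `reverse/2`.

```prolog
% Base case with the cut suggested in the question.
% The cut commits to this clause as soon as the first argument
% unifies with [], even when that argument was an unbound variable.
cut_step([], L2, L2) :- !.
cut_step([H1|T1], X, L2) :- cut_step(T1, X, [H1|L2]).

cut_reverse(X, Y) :- cut_step(X, Y, []).

% Example queries (not run at load time):
%   ?- X = [a], cut_reverse(X, Y).     % succeeds
%   ?- cut_reverse(X, Y), X = [a].     % fails: only X = [] is ever tried
```

Swapping the two goals of a conjunction changes the result, which is exactly what a pure program would not do.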

¹ Assuming that we use the pure monotonic subset of Prolog.

Recursing on both lists to ensure that they are the same length is very clever! It looks like you sort of combined the original `step/3` predicate with `same_length/2` into this `step/4`. Does that seem right? — jweightman, Oct 07 '22 at 06:13
@jweightman, yes, except that in many cases (not this one), there is no way to indicate something corresponding to `same_length/2`. So in that sense, my approach is "more general". It also works for (some cases of) left recursion in grammars. — false, Oct 07 '22 at 06:47

jweightman · Accepted Answer · 2022-10-07T04:56:38.473


Yes, you have correctly noted that this predicate does not terminate when you pass a variable in the first argument. It also does not terminate if the first argument is a partial list.

On the choice point: the first witness that you reported comes from the fact `step([], L2, L2).`, which is the base case for your recursion. When you ask the Prolog engine for additional witnesses, it tries the recursive rule `step([H1|T1], X, L2) :- step(T1, X, [H1|L2]).` Your implementation recurses on the first argument, so this unifies the unbound first argument with `[H1|T1]` and then makes a recursive call with `T1` as the first argument, which in turn unifies with a fresh `[H1|T1]`, which makes another recursive call, and so on. This is the cause of the infinite loop you're observing.
On terminology: yes, this is called non-termination.
On a better definition: with non-termination issues, it's often helpful to understand Prolog's execution model. That doesn't necessarily mean we can't come up with a "pure logic" solution, though. In this case, the query doesn't terminate if the first argument is a partial list, so we simply need to ensure that the first argument has a fixed length. What should its length be? Well, since we're reversing a list, it should be the same as the length of the other list. Try out this definition instead:

reverse(X, Y) :- same_length(X, Y), step(X, Y, []).

This solves the problem for both of the queries you posed. As an added bonus, it's actually possible to pose the "most general query" and get a sensible infinite sequence of results with this definition:

?- reverse(X, Y).
X = Y, Y = [] ;
X = Y, Y = [_] ;
X = [_A, _B],
Y = [_B, _A] ;
X = [_A, _B, _C],
Y = [_C, _B, _A] ;
X = [_A, _B, _C, _D],
Y = [_D, _C, _B, _A] ;
...
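For systems without a library `same_length/2`, the definition above can be made self-contained. The following is a sketch under that assumption: `same_len/2` is a hand-written stand-in (hypothetical name, not the SWI-Prolog library predicate), and `acc_step/3` and `sl_reverse/2` are renamed copies of `step/3` and `reverse/2`.

```prolog
% same_len(Xs, Ys): Xs and Ys are lists of the same length.
% Terminates as soon as either argument is a list of fixed length.
same_len([], []).
same_len([_|T1], [_|T2]) :- same_len(T1, T2).

% Accumulator-based step, unchanged from the question apart from the name.
acc_step([], L2, L2).
acc_step([H1|T1], X, L2) :- acc_step(T1, X, [H1|L2]).

% Fix the length of both lists first, then reverse.
sl_reverse(X, Y) :- same_len(X, Y), acc_step(X, Y, []).

% Example queries (not run at load time):
%   ?- sl_reverse(X, [1,2,3]).
%   ?- sl_reverse([1,2,3], Y).
```

Because `same_len/2` binds the first argument to a list of fixed length before the recursion starts, `acc_step/3` no longer has an unbound first argument to keep extending.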

On the cut: as far as I know, there isn't really a clear way to describe Prolog's cut operator in the language of first-order logic. All of the literature I've read on the topic describes it operationally within the context of Prolog's execution model; by this I mean that its semantics are defined in terms of choice points and backtracking rather than propositions and logical connectives. That being said, it's hard to write Prolog code that is fast or has good termination properties without being aware of the execution model, and "practical" Prolog programs often use the cut for this reason (look up "Prolog red and green cuts"). I think your judgement that the cut is "procedural" is on the right track, but personally I think it's still a valuable tool when used appropriately.

edited Oct 07 '22 at 04:56

answered Oct 07 '22 at 03:00

jweightman


Your suggested definition does not terminate for `reverse([],[_|_])`. – false Oct 07 '22 at 03:41
Thanks for catching that! Using swi-prolog's `same_length/2` predicate seems to solve that issue, and I've edited my answer accordingly. Clearly `length([_|_], L)` has infinitely many solutions :) – jweightman Oct 07 '22 at 04:58
This is a much nicer solution! There are still goals that do not terminate although there are no solutions. Can you spot them? (Not that fixing these is really required.) – false Oct 07 '22 at 05:17

You have a very keen eye. Off the top of my head, `reverse(X, [_|X])` might be one such goal. I see that your solution doesn't terminate, whereas mine terminates by blowing the stack (in swi-prolog 8.4.3) :P – jweightman Oct 07 '22 at 05:41
We both do not cover this case. Frankly, I have no clean solution for this. Maybe something along [`append2u/3`](https://stackoverflow.com/a/65361752/772868). But in any case: non-termination is much much better than incorrect answers! It's just an engine runaway. – false Oct 07 '22 at 05:54
Maybe it is simpler to just concentrate on a pure implementation that terminates (and fails) for `same_length(L,[_|L])` and other such cases. – false Oct 07 '22 at 06:02
This alternative definition guards the recursive case with an occurs check, which seems to do the trick. `contains_var` is technically extra-logical, so there's probably some case where this doesn't work. Let me know if you see any problems with it: `same_length([], []). same_length([_|T1], [_|T2]) :- \+ contains_var(T1, T2), \+ contains_var(T2, T1), same_length(T1, T2).` – jweightman Oct 07 '22 at 06:41
Counterexample: `same_length([_|E],[_,E])` should succeed – false Oct 07 '22 at 06:45
Good catch, a term contains itself. Change the base case to `same_length(L, L).`, but that's slow. Maybe T1 = T2 can be incorporated into the guard somehow? – jweightman Oct 07 '22 at 06:47
In `same_length([_|E],[_,E])` no term contains itself, the two lists are different. The solution is just `E = [_]` – false Oct 07 '22 at 06:48
[See this](https://stackoverflow.com/q/73983520/772868) – false Oct 07 '22 at 07:02

score 0 · Answer 3 · answered Oct 07 '22 at 10:24


SWI-Prolog added an extra argument to its reverse/2 to fix this kind of non-termination:

?- reverse(L, [1,2,3]).
L = [3,2,1].


brebs

