Correct way to terminate common list length len/2 relation

Question

The following is a common definition for len/2 which relates a list to its length, found in many introductory guides and textbooks.

% length of a list
len([], 0).
len([_H|T], L) :- len(T, M), L is M+1.

This works well for queries where the first parameter is provided as a sufficiently instantiated list, and the second parameter is an unbound variable. For example, the following asks "how long is the list [a,b,c]?"

?- len([a,b,c], L).

L=3

This query terminates, as expected.

However, in the opposite direction, we can ask "which list has length 3?". This is done by providing the first parameter as an unbound variable, and the second parameter with a value 3.

?- len(X, 3).

X = [_1688, _1694, _1700]

.. infinite loop ..

Here prolog provides the expected answer, a list containing 3 items. The 3 items are unbound variables, which is logically correct because they can take any value and the len/2 relation will still be true.

However - prolog backtracks and tries to find further solutions. This leads it to try ever longer lists, a logically infinite loop which will eventually crash a finite computer.

Question: What is the correct way to adjust the prolog program defining len/2 to avoid this non-termination?

I would like answers that explain why tactical procedural approaches, such as using cuts, is not as good as a program that applies a further logical constraint, perhaps suggesting the current program is logically correct but not complete.

UPDATE:

The following use of a cut seems to be a simple, but procedural, way to achieve the desired bi-directionality with termination.

% length of a list
len([], 0).
len([_H|T], L) :- len(T, M), L is M+1.

len2(X,Y) :- len(X,Y), !.

Test 1:

?- len2([a,b,c],L).

L =3
(terminates)

Test 2:

?- len2(X,3).

X = [_1598, _1604, _1610]
(terminates)

I'm still learning prolog so I'd appreciate, as per the original question, if and how this is logically impure. The above tests suggests it is "pure" in the sense it terminates and works bidirectionally.

len2(X,Y) only gives one set of answers X=[], Y=0. This terminates but is technically incorrect. — Penelope, Oct 26 '22 at 22:47
Also relevant: https://stackoverflow.com/questions/75502875/ — brebs, Feb 21 '23 at 12:45

brebs · Answer 1 · 2023-02-21T12:43:59.717

I think this is a reasonable solution, to satisfy the usual requirements of:

Not spiralling off into infinity unexpectedly
Reasonable performance, elegance, determinism, readability and code-reusability

list_length(Lst, Len) :-
    (   nonvar(Len)
    ->  integer(Len),
        Len @>= 0
    ;   true
    ),
    once_only_if(
        (is_list(Lst) ; nonvar(Len)),
        list_length_(Lst, 0, Len)
    ).

list_length_([], Len, Len).
list_length_([_|T], Upto, Len) :-
    Upto1 is Upto + 1,
    list_length_(T, Upto1, Len).

% Generic and reusable predicates are below
once_only_if(OnceIf, Goal) :-
    call_t(OnceIf, Bool),
    t_once(Bool, Goal).

call_t(Goal, Bool) :-
    % Don't use *-> because Goal might contain ";" alternatives
    (   call(Goal)
    ->  Bool = true
    ;   Bool = false
    ).

t_once(true, Goal) :-
    call(Goal),
    % Don't backtrack into Goal
    !.
t_once(false, Goal) :-
    call(Goal).

Results in swi-prolog:

?- list_length(L, -1).
false.

?- list_length(L, 1.5).
false.

?- list_length([a,b,c], N).
N = 3.

?- list_length(L, 3).
L = [_, _, _].

Recursion (unless TRO, tail-recursive optimization, i.e. a loop) is to be avoided unless the programmer really couldn't be bothered, because recursion is slower than non-recursion in any language (a very rare exception is permute).

Since length/2 is used so commonly in Prolog, it is performance-optimized to the extreme - e.g. in swi-prolog and in Scryer.

A better example for avoiding non-termination whilst also avoiding cuts is https://stackoverflow.com/a/74130437/

Performance comparison with:

len([], 0).
len([_H|T], Len) :-
    len(T, Len0),
    Len is Len0 + 1.

len2(Lst, Len) :-
    len(Lst, Len), !.

?- garbage_collect, length(Lst, 5_000_000), time(list_length(Lst, Len)).
% 5,000,006 inferences, 0.096 CPU in 0.096 seconds (100% CPU, 52325193 Lips)

?- garbage_collect, length(Lst, 5_000_000), time(len2(Lst, Len)).
% 10,000,002 inferences, 2.495 CPU in 2.500 seconds (100% CPU, 4008165 Lips)

hi @brebs I'd welcome your thoughts on the "UPDATE" added to the question. — Prolog ByExample, Oct 26 '22 at 22:42
thanks @brebs I appreciate the performance comparison. You seem to be more experienced with prolog. Do you have any comments on the "logical purity" of the cut solution? It is short, but I'm told cuts are bad for logical consistency. — Penelope, Oct 27 '22 at 14:36
Cuts are good when used correctly, and bad when used incorrectly. Their subtle effects take a while to learn. Not all parts of a program are, or need to be, pure. — brebs, Oct 27 '22 at 17:23

Correct way to terminate common list length len/2 relation

1 Answers1