2

I am new to Prolog and am not that great when it comes to recursive algorithms as is, thus I am confused with the following two clauses:

size([], 0).
size([H|T], N) :- size(T, N1), N is N1+1.

I am having trouble tracing this problem for:

?- size([a,b,c,d], N).

This will unify with the second clause to form:

size([a,b,c,d], N) :- size([b,c,d], N1), N is N1+1.

But I am confused with the N is N1+1 as these variables are never unified. What values do these variables take?

Any help regarding this question, or a simple trace of this algorithm, would be greatly appreciated.

repeat
  • 18,496
  • 4
  • 54
  • 166
JmanxC
  • 377
  • 2
  • 16
  • `N` **is** definitely unified by the time the `is` is evaluated, because `,` is executed left to right, and the recursive call binds the `N` - ultimately by unifying (a local version of) it with `0`. – Kilian Foth May 09 '16 at 21:32
  • Thank you for the quick response. I am not sure if I understand what you mean. Where in the trace process N unify with 0? – JmanxC May 09 '16 at 22:30
  • I correct myself: `N1` is unified first, either with `0` or with the lower-level `N` from the clause `size([H|T],N)`, and then `N1` is used to compute the top-level `N`. The trick is to understand that in a recursive algorithm, *there are several different versions of `N`, each with a different value*. In your example, these different `N`s would be bound to 0, 1, 2, 3, and 4. – Kilian Foth May 10 '16 at 06:09

1 Answers1

4

the N is N1+1 as these variables are never unified.

I think you mean that they are never instantiated i.e. they never get a value. Unifying two variables can instantiated them but it's not necessary. For example you could run this in a prolog REPL:

?- N = N1.
N = N1.

while N and N1 have no value (yet), they are unified; if N1 gets instantiated later, N will be instantiated with the same value as well. Another, less trivial, example:

?- [H|T] = L, L = [1|M], writeln(H).
1
H = 1,
T = M,
L = [1|M].

However, it is true that N is never unified with N1+1! is will evaluate N1+1 and that value will be unified with N. This will happen after the inner size([b,c,d],N1) has been evaluated so the N1 variable will have been instantiated.

Essentially, the execution will be like this:

size([a,b,c,d],N).
          size([b,c,d],N1)
               size([c,d],N1)
                   size([d],N1)
                        size([],0)
                        N is 0+1.
                   N is 1+1.
               N is 2+1.
          N is 3+1.

Which is a bit inefficient as we need to keep all the function calls in memory; look into tail-recursion and accumulators to fix this.

Also note that N1 needs to be instantiated only because is is going to evaluate the expression. You could write something like this:

size([],0).
size([_|T], 1+ N1):-size(T,N1).

and if you call it:

?- size([1,2],N).
N = 0+1+1.

Fun, isn't it? You might want to evaluate the last N, e.g. call it as size([1,2],N), Result is N. However, one problem is that we keep a chain of 0+1+1+1+1+...... in memory which can get big very fast. So it's best to use this trick for things that are not going to grow, e.g. expression templates.

Thanos Tintinidis
  • 5,828
  • 1
  • 20
  • 31
  • 1
    Actually, the latter variant **is** tail-recursive! Efficient? No, but interesting nonetheless... – repeat May 10 '16 at 11:19
  • 1
    @repeat yep, I think it's a nice way to show the difference between a tail-recursive version and the regular one without having to explain the accumulator; and then the version with the accumulator can be presented as a way to optimise the arithmetic operations – Thanos Tintinidis May 10 '16 at 11:46
  • 2
    On second thought, you might want to adapt the expression from left-leaning to right-leaning by changing the second clause of `size/2` to `size([_|Es], 1+N) :- size(Es, N).` – repeat May 10 '16 at 11:46
  • 1
    Thx for the idea! At least with SWI the difference between left- and right-leaning is substantial. Of course that depends on the internal implementation of `(is)/2`... – repeat May 10 '16 at 11:49
  • 2
    @repeat very interesting, `N+1` takes almost a second with 5m elements while `1+N` is instant. It also runs out of stack at 6m elements versus 11m (I'm also using SWI prolog). – Thanos Tintinidis May 10 '16 at 12:09
  • Running out of global stack in `size/2` is to be expected... but does `(is)/2` ever run out of memory? – repeat May 10 '16 at 12:12
  • 1
    Right-leaning trees are a lot more common in Prolog than their left-leaning cousins... – repeat May 10 '16 at 12:15
  • 1
    With 5M entries I observed ~300ms with `1+N` and ~1s with `N+1`, IIRC. – repeat May 10 '16 at 12:17
  • @repeat you mean the non-tail recursive version with `is`? interestingly enough yes, at 2m elements (makes sense since it takes more memory to store the frame) but there's hardly any difference between `N+1` and `1+N` – Thanos Tintinidis May 10 '16 at 12:27
  • 1
    No, I was testing the two tail-recursive variants with a single `(is)/2` goal. – repeat May 10 '16 at 12:31
  • 1
    @repeat Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/111532/discussion-between-thanos-tintinidis-and-repeat). – Thanos Tintinidis May 10 '16 at 12:32