Implementation of `last/2` in terms of `append/3` or `reverse/2`

Question

Why does the SWI-Prolog documentation suggest append(_, [Last], List) as a portable last/2 instead of say reverse(List, [Last|_]) (see here)? Is it that reverse/2 itself is not as widely implemented as append/3 is? Or is there something else I am missing from the picture?

Either way, all three do not terminate if the list is cyclic:

?- L = [z|L], last(L, Last).
^CAction (h for help) ? abort
% Execution Aborted
?- L = [z|L], append(_, [Last], L).
^CAction (h for help) ? abort
% Execution Aborted
?- L = [z|L], reverse(L, [Last|_]).
^CAction (h for help) ? abort
% Execution Aborted

But still, reverse/2 at least does not leave a choice point behind on proper lists:

?- append(_, [Last], [a]).
Last = a ;
false.

?- reverse([a], [Last|_]).
Last = a.

score 2 · Accepted Answer · edited May 23 '17 at 11:51

The definition of reverse/2 is in fact less common, and also SWI's implementation has better termination behavior whereas many other implementations only terminate if the first argument is a list. I see at least 3 different implementations: SWI on the one hand, SICStus and many other on the other, and then XSB which is a bit in the middle of both. You can distinguish them with the following goals:

reverse(Xs, [a]).     % terminates for SWI and XSB
reverse([a|Xs], [a]). % terminates for SWI

Performance-wise, I would expect that the traditional reverse/2 (not SWI's implementation) should be faster by a bit because it runs entirely deterministic. On the other hand, it recreates the entire list on the heap.

In current implementations, append(_, [L], Xs) is not ideally implemented: For each element of the list Xs, a choice point is created and then removed, leaving the last choice point active. For more, see this question.

Thank you for providing some perspective. As for efficiency, even the (4-argument) SWI version of `reverse` seems to run slightly faster than `append(_, [Last], List)`. Both are about 4 times slower than `last/2`. The traditional `reverse` is only about a factor of 1.5 slower than `last/2`. — , Mar 10 '15 at 14:25
@Boris: There is one additional performance problem in SWI: last call optimization is proportional to the number of arguments, and quite slow. See [this](http://stackoverflow.com/a/28667251/772868). — false, Mar 10 '15 at 14:33

CapelliC · Answer 2 · 2015-03-10T13:47:54.557

Actually, the system behaves exactly the inverse of what I was expecting:

13 ?- length(L,1000000),time(reverse(L,[X|_])),statistics.
% 1,000,002 inferences, 0.422 CPU in 0.424 seconds (100% CPU, 2367605 Lips)
% Started at Tue Mar 10 14:31:08 2015
% 523.563 seconds cpu time for 7,770,847 inferences
% 13,661 atoms, 4,540 functors, 3,987 predicates, 81 modules, 214,610 VM-codes
% 
%                        Limit    Allocated       In use
% Local  stack:    268,435,456       12,288        1,904 Bytes
% Global stack:    268,435,456  100,659,184   72,011,904 Bytes
% Trail  stack:    268,435,456      129,016        2,280 Bytes
% 
% 8 garbage collections gained 1,837,408 bytes in 1.346 seconds.
% Stack shifts: 13 local, 68 global, 47 trail in 0.034 seconds
% 2 threads, 0 finished threads used 0.000 seconds
L = [_G1238, _G1241, _G1244, _G1247, _G1250, _G1253, _G1256, _G1259, _G1262|...].

14 ?- length(L,1000000),time(append(_,[X],L)),statistics.
% 999,999 inferences, 0.572 CPU in 0.574 seconds (100% CPU, 1747727 Lips)
% Started at Tue Mar 10 14:31:08 2015
% 536.544 seconds cpu time for 8,772,339 inferences
% 13,662 atoms, 4,540 functors, 3,987 predicates, 81 modules, 214,615 VM-codes
% 
%                        Limit    Allocated       In use
% Local  stack:    268,435,456       12,288        2,960 Bytes
% Global stack:    268,435,456   50,327,536   48,011,920 Bytes
% Trail  stack:    268,435,456       30,712        2,312 Bytes
% 
% 8 garbage collections gained 1,837,408 bytes in 1.346 seconds.
% Stack shifts: 13 local, 72 global, 50 trail in 0.036 seconds
% 2 threads, 0 finished threads used 0.000 seconds
L = [_G1240, _G1243, _G1246, _G1249, _G1252, _G1255, _G1258, _G1261, _G1264|...] 
.

Seems that reverse/2 is using 2 times the allocation of append/3. The global and trail stacks usage is double for reverse/2, a consequence of reverse/2 being cleverly compiled to reverse/4....

`reverse/2` is actually defined in terms of `reverse/4`.... but how come `reverse` still shows a shorter wall time than append, despite using more memory? Or am I reading this wrong? (I can roughly reproduce your results with `time` and `statistics`) — , Mar 10 '15 at 13:52

Implementation of `last/2` in terms of `append/3` or `reverse/2`

2 Answers2

Linked