When you optimize anything, make sure to measure first! (most of us tend to forget this....)
When you optimize Prolog, look out for the following:
- Tail recursion tends to do better (so there goes your "before or after series of statements" question);
- Avoid creating choice points you don't need (this depends on the Prolog implementation)
- Use an optimal algorithm (as in, don't traverse a list twice if you don't have to).
A solution that is "optimized" for a more or less standard Prolog implementation will look maybe a bit different. I will name it list_uniq
(in analogy to the command-line uniq
tool):
list_uniq([], []). % Base case
list_uniq([H|T], U) :-
list_uniq_1(T, H, U). % Helper predicate
list_uniq_1([], X, [X]).
list_uniq_1([H|T], X, U) :-
( H == X
-> list_uniq_1(T, X, U)
; [X|U1] = U,
list_uniq_1(T, H, U1)
).
It is different from the reduce0/2
by @CapelliC because it uses lagging to avoid the inherent non-determinism of [X|Xs]
and [X,X|Xs]
in the first argument.
Now to the claim that it is "optimized":
- It traverses the list exactly once (no need for reversing)
- It it tail-recursive
- It does not create and discard choice points
You will get the same 12 inferences as @CapelliC, and if you then use a somewhat longer list, you will start to see differences:
?- length(As, 100000), maplist(=(a), As),
length(Bs, 100000), maplist(=(b), Bs),
length(Cs, 100000), maplist(=(c), Cs),
append([As, Bs, Cs, As, Cs, Bs], L),
time(list_uniq(L, U)).
% 600,006 inferences, 0.057 CPU in 0.057 seconds (100% CPU, 10499893 Lips)
As = [a, a, a, a, a, a, a, a, a|...],
Bs = [b, b, b, b, b, b, b, b, b|...],
Cs = [c, c, c, c, c, c, c, c, c|...],
L = [a, a, a, a, a, a, a, a, a|...],
U = [a, b, c, a, c, b].
The same query with reduce0
, reduce1
, reduce2
from @CapelliC's answer:
% reduce0(L, U)
% 600,001 inferences, 0.125 CPU in 0.125 seconds (100% CPU, 4813955 Lips)
% reduce1(L, U)
% 1,200,012 inferences, 0.393 CPU in 0.394 seconds (100% CPU, 3050034 Lips)
% reduce2(L, U)
% 2,400,004 inferences, 0.859 CPU in 0.861 seconds (100% CPU, 2792792 Lips)
So, creating and discarding choice points with cuts (!
) has a price, too.
However, list_uniq/2
, as it stands, can be wrong for queries where the first argument is not ground:
?- list_uniq([a,B], [a,b]).
B = b. % OK
?- list_uniq([a,A], [a]).
false. % WRONG!
reduce0/2
and reduce1/2
can be wrong, too:
?- reduce0([a,B], [a,b]).
false.
?- reduce1([a,B], [a,b]).
false.
As for reduce2/2
, I am not sure about this one:
?- reduce2([a,A], [a,a]).
A = a.
Instead, using the definition of if_/3
from this answer:
list_uniq_d([], []). % Base case
list_uniq_d([H|T], U) :-
list_uniq_d_1(T, H, U). % Helper predicate
list_uniq_d_1([], X, [X]).
list_uniq_d_1([H|T], X, U) :-
if_(H = X,
list_uniq_d_1(T, H, U),
( [X|U1] = U,
list_uniq_d_1(T, H, U1)
)
).
With it:
?- list_uniq_d([a,a,a,b], U).
U = [a, b].
?- list_uniq_d([a,a,a,b,b], U).
U = [a, b].
?- list_uniq_d([a,A], U).
A = a,
U = [a] ;
U = [a, A],
dif(A, a).
?- list_uniq_d([a,A], [a]).
A = a ;
false. % Dangling choice point
?- list_uniq_d([a,A], [a,a]).
false.
?- list_uniq_d([a,B], [a,b]).
B = b.
?- list_uniq_d([a,A], [a,a]).
false.
It takes longer, but the predicate seems to be correct.
With the same query as used for the other timings:
% 3,000,007 inferences, 1.140 CPU in 1.141 seconds (100% CPU, 2631644 Lips)