The easiest way to do what you describe is something like this. First, a public predicate, frequencies/2
. Breaking it down clause-by-clause:
- The empty list has the empty list as its set of frequency pairs:
frequencies( [] , [] ).
- For a non-empty list, we (1) sort the list, keeping duplicates —
sort/2
sorts the list and removes duplicates, make the list a set, whilst msort/2
does not remove duplicates — and (2) invoke the helper predicate freqs/4
passing it the tail of the source list and seeding its additional state with the head of the source list and an initial frequency count of 1
:
frequencies( [X|Xs] , Fs ) :- msort(Xs,X1), freqs(Xs,X,1,Fs) .
That gives us:
frequencies( [] , [] ) .
frequencies( [X|Xs] , Fs ) :- msort(Xs,X1), freqs(Xs,X,1,Fs) .
And then we need our helper predicate, freqs/4
. Clause-by-clause:
- If the source list is exhausted, our additional state (
Y
and N
) represent the current pair. Just move them to the result list and you're done:
freqs( [] , Y , N , [Y,N] ) .
- Otherwise, if the head of the source list matches the current element
Y
, add 1 to the frequence count N
and recurse down on the tail of the source list:
freqs( [X|Xs] , Y , N , Fs ) :- X = Y, N1 is N+1, freqs(Xs,Y,N1,Fs).
- Finally, if the head of the source list does not match the current element, move the current tuple
[Y,N]
to the result list, set the current element with an initial frequency of 1
, and recurse down:
freqs( [X|Xs] , Y , N , [[Y,N]|Fs] ) :- X \= Y, freqs(Xs,N,1,Fs).
Putting that all together, you get
freqs( [] , Y , N , [Y,N] ) .
freqs( [X|Xs] , Y , N , Fs ) :- X = Y, N1 is N+1, freqs(Xs,Y,N1,Fs).
freqs( [X|Xs] , Y , N , [[Y,N]|Fs] ) :- X \= Y, freqs(Xs,N,1,Fs).
And the whole thing, then is
frequencies( [] , [] ) .
frequencies( [X|Xs] , Fs ) :- msort(Xs,X1), freqs(Xs,X,1,Fs) .
freqs( [] , Y , N , [Y,N] ) .
freqs( [X|Xs] , Y , N , Fs ) :- X = Y, N1 is N+1, freqs(Xs,Y,N1,Fs).
freqs( [X|Xs] , Y , N , [[Y,N]|Fs] ) :- X \= Y, freqs(Xs,N,1,Fs).
We can make some improvements, though, to eliminate unnecessary backtracking: if the X = Y
test in the second clause of freqs/4
succeeds, backtracking into the 3rd clause will fail (because X = Y
and X \= Y
cannot both be true.) We can reorder clauses 2 and 3 and introduce a cut (!/0
) thus:
frequencies( [] , [] ) .
frequencies( [X|Xs] , Fs ) :- msort(Xs,X1), freqs(Xs,X,1,Fs) .
freqs( [] , Y , N , [Y,N] ) .
freqs( [X|Xs] , Y , N , [[Y,N]|Fs] ) :- X \= Y, !, freqs(Xs,N,1,Fs).
freqs( [X|Xs] , X , N , Fs ) :- N1 is N+1, freqs(Xs,Y,N1,Fs).