Prolog find the element with the max number of occurrences

Question

I would like to create a predicate that returns the element that most often appears, if there are more than one with the same number of occurrences the first:

occ([a,b,c,a,a,a,b],M).
yes M = a
occ([a,b,c,a,b],M).
yes M = a

What is your own attempt? – Willem Van Onsem Sep 09 '17 at 20:38 — Willem Van Onsem, Sep 09 '17 at 20:38

score 2 · Accepted Answer · answered Sep 09 '17 at 21:47

Note that in Prolog you would generally create rules, not functions to solve this.

There are a number of ways to approach this, I'll provide two.

Recursion

One way is to recurse over the list, keeping a running count of occurrences, and with each call recording what the current max is, an example of use of an accumulator:

% find the X with the most occurrences N in a list L
occ(X,N,L) :-
    occ(L,max(null,0),[],max(X,N)).

%% occ(+L, +CurrentMax, +Counts, +FinalMax) is det.
%
% recurse through L, using CurrentMax accumulator to
% store current candidate as a term `max(X,N)`
%
% Counts is a list in which we accumulate counts of
% occurrences to far, as list of element-count pairs X-N
%
% The final argument is unified with the CurrentMax
% accumulator as the base case

occ([], max(Xm, Nm), _, max(Xm, Nm)).

occ([X|L], max(Xm, Nm), Counts, FinalMax) :-

    % get the current count of X
    (   select(X-N, Counts, CountsT)
    ->
        N1 is N+1
    ;
        N1 = 1,
        CountsT = Counts),

    % make a new list of counts with the
    % original entry for X replaced by a new
    % one with count of N+1
    Counts2 = [X-N1 | CountsT],

    % recurse, using either new current best candidate
    % or same one depending on whether count is exceeded.
    % in case of tie, use same one, thus prioritizing first result
    (   N1 > Nm
    ->  
        occ(L, max(X,N1), Counts2, FinalMax)
    ;
        occ(L, max(Xm,Nm), Counts2, FinalMax)).

Example:

?- occ(X,N,[a,b,c,a,b]).
X = a,
N = 2.

Higher-order aggregate operations

An alternative approach is to use higher-order aggregate predicates. Arguably this leads to more declarative code, although tastes will vary. If you are using SWI-Prolog you can use the aggregate library. We can start with a rule to count occurrences in a list (note I'm going to switch from your occ/2 to more explicit predicates here):

% count number N of instances of X in a list L
element_count(X,N,L) :-
    aggregate(count,member(X,L),N).

If you don't want to or can't use aggregate/3 then have a look at the answers to this question previously asked on stack overflow.

Next we can use aggregate/3 to find the maximum number for N, plus a "witness" (i.e. the value of X with the highest value):

% count number N of instances of X in a list L, for highest N
max_element_count(X,N,L) :-
    aggregate(max(N1,X1),element_count(X1,N1,L),max(N,X)).

(I'll leave it to you to make an equivalent implementation of this rule if you're not using the aggregate library)

Let's try it:

?- max_element_count(X,N,[a,b,c,a,a,a,b]).
X = a,
N = 4.

With a tie it seems to satisfy your criterion of using the first occurrence in the case of tie-breakers:

?- max_element_count(X,N,[a,b,c,a,b]).                                                                                                                                                         
X = a,
N = 2.

But this is not in fact guaranteed - we just happen to choose a here as it is alphabetically before b. Let's try:

?- max_element_count(X,N,[b,a,c,a,b]).
X = a,
N = 2.

Oops!

This time we will find the first member of the list whose number of occurrences is equal to the max:

max_element_count2(X,N,L) :-
    aggregate(max(N1),X1,element_count(X1,N1,L),N),
    member(X,L),
    element_count(X,N,L),
    !.

This assumes that member/2 will unify with elements in order, which is the behavior I have always seen with Prologs but don't know off the top of my head if it is mandated by the standard.

To demonstrate:

?- max_element_count2(X,N,[b,a,c,a,b]).
X = b,
N = 2.

Prolog find the element with the max number of occurrences

1 Answers1

Recursion

Higher-order aggregate operations