3

Can someone please tell me how to find the mode of a list in Prolog?

I have tried this:

count_elt([], _, 0) :- !. 
count_elt([H|T], H, P) :- 
  count_elt(T, H, P1),
  P is P1 + 1, !. 
count_elt([_|T], E, P) :- count_elt(T, E, P). 

listMode([], 0) :- !. 
listMode([_], 1) :- !. 
listMode([H1|[H2|T]], M) :- 
  count_elt([H1|[H2|T]], H1, C),
  listMode([H2|T], C1),
  C > C1,
  M is C, !. 
listMode([_|[H2|T]], M) :- listMode([H2|T], M).

which only returns the maximum occurrences in the list. But I need to return the element which has the maximum occurrence (The most frequent value in the list).

Erik Kaplun
  • 37,128
  • 15
  • 99
  • 111
Dinithi De Silva
  • 1,142
  • 5
  • 28
  • 46

3 Answers3

3

You're actually quite close with count_elt/3, but you need more non-determinism. In short, you need to find a way to express count_elt/3 with fewer cuts, because right now you get this:

?- count_elt([a,b,a,a,c,b,d,e,a,f], Y, X).
Y = a,
X = 4.

And what you want to get is this:

?- count_elt([a,b,a,a,c,b,d,e,a,f], Y, X).
Y = a,
X = 4 ;
Y = b,
X = 2 ;
Y = c,
X = 1 ;
...
Y = f,
X = 1 ;
false.

From there you're just trying to find the solutions with the maximum value, which you can do with setof/3 or a logical expression or using the aggregate library. So fix count_elt/3 first and go from there.

Edit: some general remarks:

  • You can write [H1|[H2|T]] as [H1,H2|T], which is a bit more clear.
  • listMode/2 should probably be returning the item in the second position rather than the count. Since you need the count to do this procedure, you're probably going to need to make a listMode/3 or listMode/5 helper to manage your state during the recursion.

Edit: solution

Since @MaDu_LK has decided to show the solution, even though this is most likely homework, I thought I'd share mine, using @false's reified equality predicate:

count_of([],         _,  0).
count_of([H|Rest], E, N1) :- 
  equality_reified(H, E, Bool),
  count_of(Rest, E, N0),
  (Bool == true -> N1 is N0 + 1 ; N1 = N0).

frequency(L, I, N) :-
  sort(L, LSorted),
  member(I, LSorted),
  count_of(L, I, N).

mode(L, X) :-
  frequency(L, X, NMax),
  \+ (frequency(L, _, NBigger),
      NMax < NBigger).

This has somewhat more pleasing performance properties:

% MaDu_LK
?- time(get_mode([a,b,c,a,b,c,a,a,b,c,a,d,b], X)).
% 2,811 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 7117429 Lips)
X = a.

% mine
?- time(mode([a,b,c,a,b,c,a,a,b,c,a,d,b], X)).
% 217 inferences, 0.000 CPU in 0.000 seconds (98% CPU, 3144928 Lips)
X = a ;
% 195 inferences, 0.000 CPU in 0.000 seconds (97% CPU, 3305085 Lips)
false.

The other solution also produces only one mode, even if the list is multimodal:

% MaDu_LK
?- get_mode([a,a,b,b,c], X).
X = a.

% mine
?- mode([a,a,b,b,c], X).
X = a ;
X = b ;
false.

Food for thought.

Community
  • 1
  • 1
Daniel Lyons
  • 22,421
  • 2
  • 50
  • 77
  • count_of/3 could be correct and efficient using [some reifying equality](http://stackoverflow.com/questions/13664870/reification-of-term-equality-inequality). Currently it is neither nor. – false Feb 05 '13 at 19:17
  • @false have a look at the edit and let me know if I'm with you. – Daniel Lyons Feb 05 '13 at 20:23
  • No! you are still creating unnecessary choice points. The idea of equality_reified is to permit indexing, by using it *once* and then indexing on the result. In this manner there is no choice point as long as terms are either = or dif. Only if they are in between would one pay. – false Feb 05 '13 at 22:06
  • Fine! What is still lacking (but this is only for complete perfection), is tail-recursiveness. – false Feb 05 '13 at 23:35
  • Thanks! I'd have to rewrite with an accumulator arg to get tail recursiveness, right? – Daniel Lyons Feb 06 '13 at 02:55
  • Yes. But ideally - very ideally - the arithmetics is set in front of the recursion such that termination could be improved using clpfd ; and: compilation would change it into the perfect code... – false Feb 06 '13 at 13:58
1

You already got good advices from Daniel, about your code. I'll show a library(aggregate) alternative way to obtain the information:

mode_of_list(L, M) :-
    setof(C-E, (member(E, L), aggregate(count, member(E, L), C)), M).

test

?- mode_of_list([a,b,a,a,c,b,d,e,a,f],L).
L = [1-c, 1-d, 1-e, 1-f, 2-b, 4-a].
CapelliC
  • 59,646
  • 5
  • 47
  • 90
  • Counterexamples: `mode_of_list([],M).` should succeed, but your definition fails. `mode_of_list([a,X],M).` has two independent answers. – false Feb 04 '13 at 18:11
  • 2
    Why should the empty list have a mode? – Daniel Lyons Feb 04 '13 at 18:12
  • the behaviour in presence of variables is *rather* complex. OTOH I would classify as advanced the use of lists with variables... – CapelliC Feb 04 '13 at 18:26
-1

Hope you already got number of suggestions. Here this is working code for you.

count_elt([],_,0):-!.
count_elt([H|T],H,C):-count_elt(T,H,C1),C is C1+1,!.
count_elt([_|T],E,C):-count_elt(T,E,C).

listMode([],0) :- !.
listMode([_],1) :- !.
listMode([H1|[H2|T]], M) :-count_elt([H1|[H2|T]],H1,C),listMode([H2|T],C1),C > C1,M is C,!.
listMode([_|[H2|T]],M) :- listMode([H2|T],M).

get_mode([H|T],M):-listMode([H|T],K),count_elt([H|T],M,K).
Madusanka
  • 2,968
  • 5
  • 30
  • 41