Pack consecutive duplicates of list elements into sublists

Question

I need some help. I searched through the database and I found one question already asked about this example, but the answers didn't really help me, so I thought to post my own question.

The task is to pack consecutive duplicates of list elements into sublists:

% ?- pack([a,a,a,a,b,c,c,a,a,d,e,e,e,e],X).
% X = [[a,a,a,a],[b],[c,c],[a,a],[d],[e,e,e,e]].

Here is what I got:

pack([], []).
pack([X], [[X]]).
pack(Liste, Ergebnis):-
   Liste = [H, T|TS],
   H \= T,
   pack([T|TS], Ergebnis1),
   append([[H]], Ergebnis1, Ergebnis).
pack([H, H|HS], Ergebnis):-
   pack([H|HS], Ergebnis1),
   append([H|HS], Ergebnis1, Ergebnis).

The first case works really well (case where H \= T). The second one doesn't, and I really don't know why. Could someone please help me and explain the problem according my solution?

Thanks

... consecutive duplicates ...: Why is `b` a consecutive duplicate in `[a,a,a,a,b,c,c,a,a,d,e,e,e,e]`? — false, May 14 '14 at 17:15
b isn't a consecutive duplicate, it's only an example to show how the result should look like. — , May 14 '14 at 17:16
But your problem statement only talks about consecutive duplicates in the second argument. So why is `b` there? — false, May 14 '14 at 17:19

Sergey Kalinichenko · Accepted Answer · 2014-05-14T17:32:48.337

4

Your forth clause tries to append [H|HS] to the result, which is incorrect, because [H|HS] is the tail of the original list. You can make it a lot simpler -

pack([H, H|HS], [[H|TFR]|TR]):-
    pack([H|HS], [TFR|TR]).

Essentially, it says that when the first two entries are the same in the input, the first entry (i.e. H) needs to be pre-pended to the first entry of the output list produced by recursive invocation of the pack rule.

Note that the third clause can be simplified as well by replacing the Liste parameter which you "crack" right away with the "cracked" version "inlined" into the header of the clause, and doing the same to the output variable Ergebnis1. The final version should look like this:

pack([H, T|TS], [[H]|TR]):-
    H \= T,
    pack([T|TS], TR).

Here is a demo on ideone.

edited May 14 '14 at 17:32

answered May 14 '14 at 17:24

Sergey Kalinichenko

714,442
84
1,110
1,523

hey :D thank you for your help. Can you explain what you mean by this [[H|TFR]|TR] to me? I do not "really" understand what the program does. Does TFR mean Tail from Rest? Sometimes I think to complicated when I'm trying to learn Prolog. – May 14 '14 at 17:40
@user3637666 `[[H|TFR]|TR]` is a list. Its first entry is a list, too. The first entry of that "inner list" is `H`, the head of the input list (which also equals the second entry of the input list); the tail of the "inner list" is `TFR`, which is the head entry returned from the recursive invocation. The tail of the *outer* list is `TR`, i.e. the tail of the output list returned from the recursive invocation. – Sergey Kalinichenko May 14 '14 at 17:43

score 3 · Answer 2 · edited May 15 '14 at 08:37

3

The first two clauses are trivial cases and look good.

The third clause seems reasonable. It says that if I pack [H,T|TS] and H \= T, then the result would be [H] prepended to the result of [T|TS] being packed.

The forth clause is the problem:

pack([H, H|HS], Ergebnis):-
    pack([H|HS], Ergebnis1),
    append([H|HS], Ergebnis1, Ergebnis).

It says that the result of packing [H,H|HS] is the same as packing [H|HS] and then prepending all of [H|HS]. That doesn't seem logical. What's going to happen when you pack [H|HS] is you're going to get something that looks like: [[H,...,H]|X] and now you want to put another H into the head of the first sublist. So you should have:

pack([H, H|HS], Ergebnis):-
    pack([H|HS], [HH|T]),
    Ergebnis = [[H|HH]|T].

You can make these predicates a little more compact:

pack([], []).
pack([X], [[X]]).
pack([H,T|TS], [[H]|PTS]):-
   H \= T,
   pack([T|TS], PTS).
pack([H, H|HS], [[H|HH]|T]):-
   pack([H|HS], [HH|T]).

Testing...

| ?- pack([a,a,a,a,b,c,c,a,a,d,e,e,e,e],X).

X = [[a,a,a,a],[b],[c,c],[a,a],[d],[e,e,e,e]] ? a

no
| ?-

edited May 15 '14 at 08:37

Will Ness

70,110
9
98
181

answered May 14 '14 at 17:30

lurker

56,987
9
69
103

when you say Ergebnis = [[H|HH]|T], what does this mean? I don't know this form of syntax. Are you saying that the Head contains a head and a Tail? – May 14 '14 at 18:02
@user3637666 It means you have a list whose head is a list, `[H|HH]`, and its tail is `T`. In Prolog, you can embed as many lists as you like. So `[H|T]` is a list with head `H` and tail `T`. If `H` itself is a list (such as in a list of lists), then `H` would either be empty `[]` or a list itself consisting of a head and a tail. – lurker May 14 '14 at 19:51
the shorter version having also a big advantage of being [TRMC](http://en.wikipedia.org/wiki/Tail_call#Tail_recursion_modulo_cons) - potentially. I wonder how is it *really* compiled... – Will Ness May 15 '14 at 09:08
@WillNess: The problem in current implementations with above code is indexing. But otherwise, they all use constant space "for the recursion". – false May 15 '14 at 11:04
@lurker : thank you very much. Can you give me a tip how I could come straight to solutions when programming in Prolog? As I said: often I think that I think to complicated. I think I would never have come to your solution with: [[H|TFR]|TR]. – May 15 '14 at 11:16
@false thanks. Makes perfect sense too. I keep thinking about it in terms of manual rewrites with boxes and pointers, but in Prolog it all comes naturally. – Will Ness May 15 '14 at 12:27
@user3637666 there are two aspects of it. Very generally, a starting point in thinking about predicates is to try to express them in logical but natural language for the first level "what makes sense". As far as something like `[[H|TFR]|TR]`, that's just a syntactic thing. We're dealing with lists of lists here, so a list, which is either empty (`[]`) or a head and tail (`[H|T]`, where `H` is an element and `T` is the rest of the list) would imply that `H` itself is a list, so it can also be of the form `[H|T]`. Mash them together and you get `[[H|TFR]|TR]`. – lurker May 15 '14 at 12:38
@lurker what should I change, if I do not want to put in an element in a sublist which is not consecutive? – Jun 09 '16 at 18:54
@johnjohn if you're asking about a variation of this problem, you should open a new question and show what you've tried and you will get some answers. – lurker Jun 09 '16 at 19:29
@lurker I solved it but I do not know how to explain it. I changed the 3rd predicate to : `group([H,T|TS], [H|PTS]):- H \= T, group([T|TS], PTS).` Could you explain to me in a few words why this works? – Jun 09 '16 at 19:34
Your change doesn't quite work. For example, the query: ` pack([a,a,a,b], L).` results in, `L = [[a,a|a],[b]]`. Note in this result, instead of the first element being `[a,a,a]` you have `[a,a|a]` which isn't the same thing. `a` is being used as a tail here, but `a` is not a list. For example, if you did, `member(b, [a,b|c])` you would get "true", but `member([c, [a,b|c])` would yield "false". The flaw is that your change sometimes treats a non-list element as a list. Why did you make the change? What do you want to achieve by it? – lurker Jun 09 '16 at 19:55
@lurker I tested to the query which was given in the exercise and it worked for that. I did not test it for other cases. That was the predicate which I tested with the altered code above `?- group([1,1,1,1,2,c,c,1,1,d,e,e,e,e],X). ` – Jun 12 '16 at 18:38
1

@johnjohn it might work for a couple of examples, but I am saying that it's an erroneous implementation since it will produce incorrect results for other valid cases. Just working for a couple of examples doesn't make code correct. – lurker Jun 12 '16 at 18:41

score 2 · Answer 3 · edited May 23 '17 at 12:26

Every of the other implementations given in answers to this question is logically impure. They are not monotone and thus easily become logically unsound when used with non-ground terms.

The implementation suggested here is logically pure. It uses the meta-predicate splitlistIfAdj/3, which is based on if_/3, as proposed by @false in an answer on reification.

So let's use splitlistIfAdj/3 in tandem with dif/3 which is a reified variant of dif/2:

?- splitlistIfAdj(dif,[a,a,a,a,b,c,c,a,a,d,e,e,e,e],Lists).
Lists = [[a,a,a,a],[b],[c,c],[a,a],[d],[e,e,e,e]]. % succeeds deterministically

As we stay monotone here, we get meaningful answers even if we use non-ground terms:

?- splitlistIfAdj(dif,[A,B,C],Lists).
A = B,    B = C,    Lists = [[C, C, C]]    ;
A = B,    dif(B,C), Lists = [[B, B], [C]]  ;
dif(A,C), B = C,    Lists = [[A], [C, C]]  ;
dif(A,B), dif(B,C), Lists = [[A], [B], [C]].

score 1 · Answer 4 · answered May 14 '14 at 17:31

I would approach the problem a little differently and decompose the problem into two simpler tasks.

We first need to be able to partition the list into a prefix consisting of the list of consecutive elements at the head of the list and its suffix (everything else):

partition( []      , []    , []     ) .
partition( [X]     , [X]   , []     ) .
partition( [X,Y|Z] , [X]   , [Y|Z]  ) :- X \= Y .
partition( [X,Y|Z] , [X|P] , S      ) :- X  = Y , partition([Y|Z],P,S) .

Once we have that, the rest is easy:

pack([],[]) .               % packing an empty list yields the empty list
pack([X|Xs],[P,Ps]) :-      % packing a non-empty list consists of
  partition([X|Xs],P,T) ,   % - partitioning it into its prefix and suffix, and
  pack(T,Ps)                % - packing the suffix
  .                         %

I think it should be `pack([X|Xs],[P|Ps]) :- ...`. – Will Ness May 15 '14 at 09:12 — Will Ness, May 15 '14 at 09:12

score 1 · Answer 5 · edited Jun 05 '15 at 11:16

1

To avoid append, you can use dcg:

pack(L, LP) :-                  % pack/2
    phrase(pack(L), LP).

pack([]) --> [].                % pack//1
pack([X]) --> [[X]].
pack([H, H1 | T]) -->
     {H \= H1},
     [[H]],
     pack([H1 | T]).
pack([H, H | T]) -->
    pack([H | T], [H]).

pack([H, H | T], P) -->         % pack//2
    pack([H | T], [H|P]).
pack([H, H1 | T], P) -->
    {H \= H1},
    [[H | P]],
    pack([H1 | T]).
pack([H, H], P) -->
    [[H,H|P]].

edited Jun 05 '15 at 11:16

repeat

18,496
4
54
166

answered May 14 '14 at 18:49

joel76

5,565
1
18
22

2

If you used `dif/2` instead of `(\=)/2`, your code would perserve [tag:logical-purity]! – repeat Oct 04 '15 at 12:24

Pack consecutive duplicates of list elements into sublists

5 Answers5