Prolog - deleting pairs with the same first value from list

Question

i have list of objects like this

list([obj(x,y),obj(x,z),obj(a,b),obj(b,c)]).

and i would like to remove those elements that share the same first value, so i can work with the modified list. In this case the final list would look like this

list([obj(a,b),obj(b,c)]

Could anyone help please? I am really struggling with this one.

check [this](https://stackoverflow.com/a/61096861/849891) out. :) — Will Ness, Apr 08 '20 at 09:30

David Tonhofer · Answer 1 · 2020-04-07T19:16:08.740

Let's start with the tests!

% Testing

:- begin_tests(collapse).   

test(one)   :- collapse([],[]).
test(two)   :- collapse([obj(a,b)],[obj(a,b)]).
test(three) :- collapse([obj(a,b),obj(b,c)],
                        [obj(a,b),obj(b,c)]).                        
test(four)  :- collapse([obj(a,b),obj(a,c),obj(b,j)],
                        [obj(b,j)]).
test(five)  :- collapse([obj(a,b),obj(a,c),obj(b,j),obj(a,x),obj(b,y)],
                        []).
test(six)   :- collapse([obj(a,b),obj(a,c),obj(b,j),obj(b,y),obj(c,x)],
                        [obj(c,x)]).

:- end_tests(collapse).

rt :- run_tests(collapse).

Then code:

% This is called

collapse(Lin,Lout) :- collapse(Lin,[],Lout).

/*
 * Helper predicate:
 * collapse(List_over_which_we_recur_getting_smaller,
 *          Elements_which_we_have_already_seen,
 *          List_which_collects_the_result_going_down,
 *          List_which_collects_the_result_coming_up).
 */

collapse([],_Filter,[]).  % base case, kick a [] upwards; don't care about Filter

collapse([obj(A,_)|Objs],Filter,Lup) :- 
   (member(obj(A,_),Objs);member(obj(A,_),Filter)),     % Does the obj(A,_) appear elsewhere (in Filter or Objs)?
   !,                                                   % Commit to this execution path where obj(A,_) is not unique
   (member(obj(A,_),Filter)                             % Slight improvement: add obj(A,_) to "Filter" only it it's not yet in there
       -> NewFilter = Filter
       ;  NewFilter = [obj(A,_)|Filter]),
   collapse(Objs,NewFilter,Lup).                        % Do not retain obj(A,_)

collapse([obj(A,X)|Objs],Filter,Lup) :- 
   \+(member(obj(A,_),Objs);member(obj(A,_),Filter)),   % Does the obj(A,_) appear elsewhere (in Seen or ToSee)?
   !,                                                   % Commit to this execution path where obj(A,_) IS unique   
   collapse(Objs,Filter,Ltmp),                          % Filtering the rest of Objs, which defines Ltmp      
   Lup = [obj(A,X)|Ltmp].                               % DO retain object on the way up, correctly ordering result.

Okay, so:

?- rt.
% PL-Unit: collapse ...... done
% All 6 tests passed
true.

The [Coding Guidelines for Prolog](https://arxiv.org/abs/0911.2899) say one should avoid comments to the right but they are really useful; in fact, there should be dedicated editor support for them. — David Tonhofer, Apr 07 '20 at 19:24
The Scheme/Lisp coding style guidelines demand no extra spacing inside the lines of code (and Prolog's probably does too), but code is a ***2D*** medium, and [vertical](https://stackoverflow.com/a/27804489/849891) [alignment](https://stackoverflow.com/a/60877178/849891) [helps](https://stackoverflow.com/a/11551001/849891), sometimes immensely. :) (including the links here just for the illustration.) — Will Ness, Apr 08 '20 at 09:24

Paulo Moura · Answer 2 · 2020-04-07T18:45:32.003

Solving this problem efficiently is not trivial for a beginner. Assuming that the elements of the list are ground, we can start by noticing that sorting the list will cluster together all elements that shared the first argument in the obj/2 compound term. For example:

| ?- sort([obj(x,y),obj(x,z),obj(a,b),obj(b,c)], S).
S = [obj(a, b), obj(b, c), obj(x, y), obj(x, z)]
yes

The sort/2 is a standard built-in predicate. Any decent Prolog system should implement it with complexity O(n*log(n)). After sorting, we can walk the list, which we can so in O(n) to filter it:

filter(List, Filtered) :-
    sort(List, Sorted),
    walk(Sorted, Filtered).

walk([], []).
walk([obj(X,Y)| Sorted], Filtered) :-
    walk(Sorted, X, obj(X,Y), Filtered).

walk([], _, Element, [Element]).
walk([obj(X,_)| Sorted], X, _, Filtered) :-
    !,
    delete(Sorted, X, Rest),
    walk(Rest, Filtered).
walk([obj(X,Y)| Sorted], _, Element, [Element| Filtered]) :-
    walk(Sorted, X, obj(X,Y), Filtered).

delete([], _, []).
delete([obj(X,_)| Sorted], X, Rest) :-
    !,
    delete(Sorted, X, Rest).
delete(Rest, _, Rest).

Sample call:

| ?- filter([obj(x,y),obj(x,z),obj(a,b),obj(b,c)], Filtered).
Filtered = [obj(a, b), obj(b, c)]
yes

Looks good but we should do more comprehensive testing. We can define a property that all the filter/2 predicate solutions must satisfy:

property(List, Filtered) :-
    filter(List, Filtered),
    % all elements of the output list must
    % be in input list
    forall(
        member(X, Filtered),
        member(X, List)
    ),
    % no two elements in the output list
    % should share the first argument
    \+ (
        select(obj(X,_), Filtered, Rest),
        member(obj(X,_), Rest)
    ),
    % all elements in the input list whose
    % first argument is not repeated must
    % be in the output list
    \+ (
        select(obj(X,Y), List, Rest),
        \+ member(obj(X,_), Rest),
        \+ member(obj(X,Y), Filtered)
    ).

We can now use a property-based testing implementation such as Logtalk's lgtunit QuickCheck implementation. But there's a catch. Property-based testing requires that we be able to generate lists with obj/2 elements. The solution, we cheat! First we do a syntactic transformation from obj(X,Y) to X-Y. This transformation doesn't change the semantics of the predicate being tested:

filter(List, Filtered) :-
    sort(List, Sorted),
    walk(Sorted, Filtered).

walk([], []).
walk([X-Y| Sorted], Filtered) :-
    walk(Sorted, X, X-Y, Filtered).

walk([], _, Element, [Element]).
walk([X-_| Sorted], X, _, Filtered) :-
    !,
    delete(Sorted, X, Rest),
    walk(Rest, Filtered).
walk([X-Y| Sorted], _, Element, [Element| Filtered]) :-
    walk(Sorted, X, X-Y, Filtered).

delete([], _, []).
delete([X-_| Sorted], X, Rest) :-
    !,
    delete(Sorted, X, Rest).
delete(Rest, _, Rest).

We apply the same syntactic transformation to the property/2 predicate:

property(List, Filtered) :-
    filter(List, Filtered),
    % all elements of the output list must
    % be in input list
    forall(
        member(X, Filtered),
        member(X, List)
    ),
    % no two elements in the output list
    % should share the first argument
    \+ (
        select(X-_, Filtered, Rest),
        member(X-_, Rest)
    ),
    % all elements in the input list whose
    % first argument is not repeated must
    % be in the output list
    \+ (
        select(X-Y, List, Rest),
        \+ member(X-_, Rest),
        \+ member(X-Y, Filtered)
    ).

We can now test using the goal:

| ?- lgtunit::quick_check(
         property(
             +list(pair(char,char)),
             -list(pair(char,char))
         )
     ).
% 100 random tests passed
% starting seed: seed(25256,26643,1563)
yes

Note: in the definition of the property/2 predicate, we assume that the de facto standard member/2 and select/3 list predicates are available in user (i.e. at the top-level interpreter). If that's not the case, prefix their calls with list::.

@DavidTonhofer Already there: https://logtalk3.readthedocs.io/en/latest/devtools/lgtunit.html#quickcheck — Paulo Moura, Apr 07 '20 at 19:31
@DavidTonhofer Expanded this answer in the following blog post: https://logtalk.org/2020/04/10/evolving-from-manually-written-tests.html — Paulo Moura, Apr 12 '20 at 21:23

Prolog - deleting pairs with the same first value from list

2 Answers2

Linked