2

I'm trying to split a list into the items before a specific element (specifically the word "stop") as well as the items after this element. I know you can use split to do this, but I'm new to prolog and so I'm trying to manipulate things without using these functions currently, and so I'd really like to know if this is possible? (and maybe some pointers in the right direction)

i.e. with the list;

L = [tea,coffee,sugar,cake,stop,meat,fish,eggs,flour]

I'd ideally want to split the list at 'stop' leaving me with,

L2 = [tea, coffee, sugar, cake] // and
L3 = [meat, fish, eggs, flour]
Carol
  • 65
  • 1
  • 5
  • 2
    Yes it is possible. – Willem Van Onsem Nov 28 '17 at 09:45
  • @WillemVanOnsem would you be able to point me in the right direction with what I should be aiming to do? Currently, I'm able to just go through a list and work out that a 'stop' occurs but I don't know how to extract a list of the elements on either side? – Carol Nov 28 '17 at 09:50
  • usually you declare predicates in Prolog in a declarative and recursive way. So perhaps you should think about "what should happen in case of an empty list?" "What should we do to split a list, given we have split the tail of the list", etc. – Willem Van Onsem Nov 28 '17 at 09:52
  • @WillemVanOnsem Iknow how to recurse through (if its empty, I'd just want to do nothing with it), and can recognise if the element is a 'stop'. Do I append the elements before the 'stop' into a new list? recurse([]). recurse([H|T]) :- append(H, _, L), \+check_stop(H), recurse(T). – Carol Nov 28 '17 at 10:02
  • do you mean append/3 for "split predicate" ? – CapelliC Nov 28 '17 at 17:14

3 Answers3

5

Using seq//1:

list_splitonstop(Xs, As, Bs) :-
   phrase( ( seq(As), [stop], seq(Bs) ), Xs).

This version works as you expected:

?- L = [tea,coffee,sugar,cake,stop,meat,fish,eggs,flour],
       list_splitonstop(L, L1, L2).
   L = [tea,coffee,sugar,cake,stop,meat,fish,eggs,flour],
      L1 = [tea,coffee,sugar,cake], L2 = [meat,fish,eggs,flour]
;  false.

But, is it really the best solution? This ; false at the end may be an indication that it is not. But we cannot say this for sure. We would have to figure out another case where this solution does not work as expected. You are faced with similar problems also in other programming languages were one has to rely a lot on the imagination of the programmer to find out border cases and the like.

Fortunately, we are here using Prolog which helps us to understand what we are actually defining.

A very simple first start is to ask the most general query. Just like that:

?- list_splitonstop(L, L1, L2).
   L = [stop], L1 = [], L2 = []
;  L = [stop,_A], L1 = [], L2 = [_A]
;  L = [stop,_A,_B], L1 = [], L2 = [_A,_B]
;  L = [stop,_A,_B,_C], L1 = [], L2 = [_A,_B,_C]
; ... .

Look at each answer! Let's take the third as an example. L = [stop,_A,_B] means that this answer includes all lists with three elements where the first is stop. So we are looking here at an infinity of solutions that have been all compactly described with a couple of characters! Not even bzip2 -99 can do that!

Are these the only lists with three elements? We cannot say this from this single query alone, for Prolog might enumerate answers in an unfair manner. Imagine you ask someone to tell you all the natural numbers, but that person starts 0, 2, 4, ... Evidently, that enumeration is very unfair to the odd numbers. Similarly some answers might be missing ...

In Prolog we can insist on looking at lists of length 3 only:

?- L = [_,_,_], list_splitonstop(L, L1, L2).
   L = [stop,_A,_B], L1 = [], L2 = [_A,_B]
;  L = [_A,stop,_B], L1 = [_A], L2 = [_B]
;  L = [_A,_B,stop], L1 = [_A,_B], L2 = []
;  false.

So we can ask for all relevant cases of length 3 in a single query. Note that these _A and _B variables represent any term! Please take a moment and appreciate what you are looking at: All cases for lists of length 3. There are no other cases to consider!

When you look at such answers some questions may arise. Like: Do these three answers overlap, or are they truly disjoint? Prolog knows the answer. Simply repeat the actual goal and count the resulting answers:

?- L = [_,_,_], list_splitonstop(L, L1, L2), list_splitonstop(L, L1, L2).
(answers same as above)

So we get exactly the same answers. There is no inherent redundancy.

Another question might be: Does L has always precisely one possible split? (In other words: Does there exist a functional dependency?)

We can get to this by asking for L that have a different L1 and L2:

?- L = [_,_,_], dif(L1-L2,L1x-L2x),
     list_splitonstop(L, L1, L2), list_splitonstop(L, L1x, L2x).
   L = [stop,stop,_A], L1 = [], L2 = [stop,_A], L1x = [stop], L2x = [_A]
;  L = [stop,_A,stop], L1 = [], L2 = [_A,stop], L1x = [stop,_A], L2x = []
;  L = [stop,stop,_A], L1 = [stop], L2 = [_A], L1x = [], L2x = [stop,_A]
;  L = [_A,stop,stop], L1 = [_A], L2 = [stop], L1x = [_A,stop], L2x = []
;  L = [stop,_A,stop], L1 = [stop,_A], L2 = [], L1x = [], L2x = [_A,stop]
;  L = [_A,stop,stop], L1 = [_A,stop], L2 = [], L1x = [_A], L2x = [stop]
;  false.

So, I may ask you now: Do you want above cases? If there are several occurrences of stop? Clearly, you did not specify this and we need some more information from you. Prolog was at least helpful to identify such cases.


How to identify redundant answers.

In the case above we observed that there are no redundant answers. But how do they show up when they show up? Here is such an example: member/2 which is built-in and produces (sometimes) redundant answers and memberd/2 which does not have this redundancy. The actual question is:

How does a two-element list look like that has e as an element/member?

?- Xs = [_,_], member(e, Xs).
   Xs = [e,_A]
;  Xs = [_A,e].    
?- Xs = [_,_], member(e, Xs), member(e, Xs).
   Xs = [e,_A]
;  Xs = [e,e]                %  <--- redundant
;  Xs = [e,e]                %  <--- redundant
;  Xs = [_A,e].
?- Xs = [_,_], memberd(e, Xs).
   Xs = [e,_A]
;  Xs = [_A,e], dif(_A,e)
;  false.
?- Xs = [_,_], memberd(e, Xs), memberd(e, Xs).
   Xs = [e,_A]
;  Xs = [_A,e], dif(_A,e)
;  false.

If you are only interested in seeing those answers that permit redundancies, you can say in stead:

?- Xs = [_,_], member(e, Xs), \+ \+ call_nth(member(e, Xs), 2).
   Xs = [e,_A]
;  Xs = [_A,e].

In other words, all answers of member/2 permit such redundancies. Note that member/2 is not always prone to redundancy. In particular, if the list contains different (nonunifiable) elements, there are no redundancies at all. And this is a frequent use case.

?- Xs = [a,b], member(X, Xs), \+ \+call_nth(member(X, Xs),2).
   false.

In fact, in this very case, that is when querying for X, member/2 is probably more efficient than memberd/2.

false
  • 10,264
  • 13
  • 101
  • 209
2

As said in a comment, a recursive definition is the most natural way to do this:

split_on_stop([], [[]]).
split_on_stop([stop|T], [[]|T2]) :-
    split_on_stop(T, T2).
split_on_stop([H|T], [[H|T2]|T3]) :-
    dif(H, stop),
    split_on_stop(T, [T2|T3]).

This implementation will work with multiple occurences of stop: the second argument is a list of those splitted lists.

The complicated part to figure out here is the contents of that second argument in all three cases (when the first one is empty, when we find a stop, and when we find something other than a stop).

Note

If it is guaranteed that the input list has one and only one stop, you can shorten this implementation a lot (but it loses a lot of desirable behaviours):

split_on_stop([stop|T], [], T).
split_on_stop([H|T], [H|T2], T3) :-
    dif(H, stop),
    split_on_stop(T, T2, T3).
Community
  • 1
  • 1
Fatalize
  • 3,513
  • 15
  • 25
1

append/3 is your magic sword:

split_on_stop(List, Before, After) :-
    append(Before, [stop|After], List).

No more, no less, so in most cases you just inline the append goal.

Jan Wielemaker
  • 1,670
  • 10
  • 18
  • 3
    This assumes, of course, that for this to be a complete solution there is one and only one `stop` in the list. – lurker Nov 29 '17 at 16:27