4

I'm doing an exercise in Prolog and I'm stuck. I need to swap three adjacent items in a list with another three elements.

That is:

| ?- swap([c,g,g,a,t,t,g,c,a,a], X).

X = [a,t,t,c,g,g,g,c,a,a]
X = [g,c,a,a,t,t,c,g,g,a]
X = [c,g,g,g,c,a,a,t,t,a]
X = [c,a,a,a,t,t,g,c,g,g]
.
.
.

This is what I have so far:

swap([H1, H2, H3, H4, H5, H6|T1], X) :-
     X = [H4, H5, H6, H1, H2, H3|T1];
     swap([H2, H3, H4, H5, H6|T1], X);
     swap([H1, H2, H3, H4, H5|T1], X).

And the output for this is:

| ?- swap([c,g,g,a,t,t,g,c,a,a], X).

X = [a, t, t, c, g, g, g, c, a, a] ;
X = [t, t, g, g, g, a, c, a, a] ;
X = [t, g, c, g, a, t, a, a] ;
X = [g, c, a, a, t, t, a] ;
X = [c, a, a, t, t, g] ;
X = [c, a, a, a, t, t] ;
X = [g, c, a, g, a, t, a] ;
X = [c, a, a, a, t, g] ;
X = [c, a, a, g, a, t] ;
X = [t, g, c, g, g, a, a, a] ;
X = [g, c, a, g, a, t, a] ;
X = [c, a, a, a, t, g] ;
X = [c, a, a, g, a, t] ;
X = [g, c, a, g, g, a, a] ;
X = [c, a, a, g, a, g] ;
X = [c, a, a, g, g, a] ;
X = [t, t, g, c, g, g, c, a, a] ;
X = [t, g, c, g, g, t, a, a] ;
X = [g, c, a, g, t, t, a] ;
X = [c, a, a, t, t, g] ;
X = [c, a, a, g, t, t] ;
X = [g, c, a, g, g, t, a] ;
X = [c, a, a, g, t, g] ;
X = [c, a, a, g, g, t] ;
X = [t, g, c, c, g, g, a, a] ;
X = [g, c, a, g, g, t, a] ;
X = [c, a, a, g, t, g] ;
X = [c, a, a, g, g, t] ;
X = [g, c, a, c, g, g, a] ;
X = [c, a, a, g, g, g] ;
X = [c, a, a, c, g, g] ;
false.

The only problem that I have is that with every recursion I lose some part of the list and I don't know how to put it back.

false
  • 10,264
  • 13
  • 101
  • 209
RobertM
  • 71
  • 1
  • 5

3 Answers3

7

It seems you are interested in describing RNA-sequences. Triples, that sounds much like anticodons. To make those sequences more readable, use:

:- set_prolog_flag(double_quotes, chars).

which permits you to write "attac" in place of [a,t,t,a,c]. See this how to get also compact answers.

Now for the swap. The easiest way is to first sketch what you want:

... Triple1 ... Triple2 ...  is the OldSequence

... Triple2 ... Triple1 ...  is the NewSequence

Where the ... are the same for both sequences. All of this can be readily translated using DCGs.

tripleswap(OldSequence, NewSequence) :-
   dif(T1,T2),
   phrase( ( seq(A), triple(T1), seq(B), triple(T2), seq(C) ), OldSequence),
   phrase( ( seq(A), triple(T2), seq(B), triple(T1), seq(C) ), NewSequence).

seq([]) --> [].
seq([B|Bs]) --> [B], seq(Bs).

triple([A,B,C]) --> [A,B,C].

Whenever you distrust a DCG-definition, just try it out with phrase/2. Like

?- phrase(triple(T1), Bs).
T1 = Bs, Bs = [_A,_B,_C].

The non-terminal triple//1 describes a sequence of 3 elements (presumably nucleotides).

seq//1 is an arbitrarily long sequence.

There are solutions with better termination conditions, but they are less readable and often require certain assumptions that are difficult to maintain in the general case. Here is such a simple improvement:

samelength([], []).
samelength([_|Xs], [_|Ys]) :-
   samelength(Xs, Ys).

and add samelength(OldSequence, NewSeqence) as the first goal. Now, tripleswap/2 terminates iff samelength/2 terminates. So one of the arguments should be a list of fixed length.

Also note that I think that "cccccc" has no swap. That's why I added dif(T1,T2).

?- tripleswap("cggattgcaa", Bs).
      Bs = "attcgggcaa"
   ;  Bs = "ttgacggcaa"
   ;  Bs = "tgcatcggaa"
   ;  Bs = "gcaattcgga"
   ;  Bs = "caaattgcgg"
   ;  Bs = "cttgggacaa"
   ;  Bs = "ctgctggaaa"
   ;  Bs = "cgcattggaa"
   ;  Bs = "ccaattggga"
   ;  Bs = "cgtgcgataa"
   ;  Bs = "cggcatgata"
   ;  Bs = "cgcaatggat"
   ;  Bs = "cgggcaatta"
   ;  Bs = "cggcaagatt"
   ;  Bs = "cggacaattg"
   ;  false.

BTW, s are used in Molecular Biology since the 1980s. Start with

David B. Searls, Investigating the Linguistics of DNA with Definite Clause Grammars, NACLP 1989

and other work by the same author as well as Ross Overbeek around that time. All of this happened in the dawn of the Human Genome Project.

false
  • 10,264
  • 13
  • 101
  • 209
  • Hi, thanks for the help. The exercise is about studying the proximity of two species. I managed to do the first step of the exercise (swapping two adjacent nucleotides with the same type with another two, ex: [c,g,g,t] = [c,a,a,t]. This was the second part. I'm new with prolog so I didn't know it could be done with DCG. Can you please explain how does "triple([A,B,C]) --> [A,B,C]." translate to? – RobertM Jan 31 '20 at 17:32
3

Basically this can be split into two subproblems:

  1. first take a sequence of three elements; and
  2. take another sequence of three elements and produce a list where we swapped these.

We can thus implement the two problems as follows:

swap(L, X) :-
    swap1(L, S1, S2, T, X, Q),
    swap2(T, S1, S2, Q).

where L is the list where we need to perform the swaps, X the list that is unified with the results, S1 and S2 the sequences that we select to swap, T the remaining elements after the first selection, and Q the part after the second sequence of the list to swap.

The first swap1 can thus be implemented as:

swap1([A1, A2, A3|T], [A1, A2, A3], [B1, B2, B3], T, [B1, B2, B3|Q], Q).
swap1([A1|T], A, B, R, [A1|Rest], S) :-
    swap1(T, A, B, R, Rest, S).

For the given sample list, this will thus yield:

?- swap1([c,g,g,a,t,t,g,c,a,a], A, [B1, B2, B3], T, X, R).
A = [c, g, g],
T = [a, t, t, g, c, a, a],
X = [B1, B2, B3|R] ;
A = [g, g, a],
T = [t, t, g, c, a, a],
X = [c, B1, B2, B3|R] ;
A = [g, a, t],
T = [t, g, c, a, a],
X = [c, g, B1, B2, B3|R] ;
A = [a, t, t],
T = [g, c, a, a],
X = [c, g, g, B1, B2, B3|R] ;
A = [t, t, g],
T = [c, a, a],
X = [c, g, g, a, B1, B2, B3|R] ;
A = [t, g, c],
T = [a, a],
X = [c, g, g, a, t, B1, B2, B3|R] ;
A = [g, c, a],
T = [a],
X = [c, g, g, a, t, t, B1, B2, B3|...] ;
A = [c, a, a],
T = [],
X = [c, g, g, a, t, t, g, B1, B2|...] ;
false.

Here it thus proposes eight ways to pick three adjacent sequences that can be used to swap.

Then the second swap need to find three adjacent elements in the remaining lists to swap, and put the ones that have been picked by swap1/6 at the places where it picks elements from, like:

swap2([B1,B2,B3|R], [A1,A2,A3], [B1, B2, B3], [A1,A2,A3|R]).
swap2([B1|R], As, Bs, [B1|T]) :-
    swap2(R, As, Bs, T).

For the given sample data, this thus gives us:

?- swap([c,g,g,a,t,t,g,c,a,a], X).
X = [a, t, t, c, g, g, g, c, a, a] ;
X = [t, t, g, a, c, g, g, c, a, a] ;
X = [t, g, c, a, t, c, g, g, a, a] ;
X = [g, c, a, a, t, t, c, g, g, a] ;
X = [c, a, a, a, t, t, g, c, g, g] ;
X = [c, t, t, g, g, g, a, c, a, a] ;
X = [c, t, g, c, t, g, g, a, a, a] ;
X = [c, g, c, a, t, t, g, g, a, a] ;
X = [c, c, a, a, t, t, g, g, g, a] ;
X = [c, g, t, g, c, g, a, t, a, a] ;
X = [c, g, g, c, a, t, g, a, t, a] ;
X = [c, g, c, a, a, t, g, g, a, t] ;
X = [c, g, g, g, c, a, a, t, t, a] ;
X = [c, g, g, c, a, a, g, a, t, t] ;
X = [c, g, g, a, c, a, a, t, t, g] ;
false.

Here the places that are swapped are written in boldface.

Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
2

I think permutation/2 will help:

swap(Es,Sw) :- triples(Es,Ts),permutation(Ts,Sw0),append(Sw0,Sw).

triples([A,B,C|Es],[[A,B,C]|Ts]) :- !, triples(Es,Ts).
triples([],[]) :- !.
triples(R,[R]).

yields

?- swap([c,g,g, a,t,t, g,c,a], X).
X = [c, g, g, a, t, t, g, c, a] ;
X = [c, g, g, g, c, a, a, t, t] ;
X = [a, t, t, c, g, g, g, c, a] ;
X = [a, t, t, g, c, a, c, g, g] ;
X = [g, c, a, c, g, g, a, t, t] ;
X = [g, c, a, a, t, t, c, g, g] ;
false.

note: triples/2 allows for not triple data in tail, but you can drop this (maybe unwanted) feature just deleting the last clause:

triples(R,[R]). % drop this

Then the cuts become useless, just drop drop them:

triples([],[]). % just for style in this case, move to first clause
triples([A,B,C|Es],[[A,B,C]|Ts]) :- triples(Es,Ts).
CapelliC
  • 59,646
  • 5
  • 47
  • 90