6

I want to create a DCG that languages like this get accepted:

  • c
  • bbbcbbb
  • bbacbba
  • abacaba
  • aababacaababa

As you can see this means that there is a specific order of a and b, then one c and then again the exact same order as before the c. If these conditions are not met it shall fail.

I am currently here with my approach (works, but also recognizes wrong words)

s --> x, s, x. 
s --> [c]. 
x --> [a]. 
x --> [b]. 

Can someone of you help me out what I need to change? I don't know how to go on. Thanks a lot.

Waylander
  • 825
  • 2
  • 12
  • 34

2 Answers2

6

A DCG is really just a Prolog program, preprocessed adding hidden arguments that implement a difference list. We can always add our own parameters, and use pattern matching. Then

s --> ab(X), [c], ab(X). 
ab([a|T]) --> [a], ab(T).
ab([b|T]) --> [b], ab(T).
ab([]) --> [].

?- atom_chars(aababacaababa,Cs),phrase(s, Cs).
Cs = [a, a, b, a, b, a, c, a, a|...] 
false
  • 10,264
  • 13
  • 101
  • 209
CapelliC
  • 59,646
  • 5
  • 47
  • 90
5

The language you are describing is neither regular nor context free. So you need to resort to the Prolog extensions that are offered in DCGs. There are some idioms you might get used to:

% any sequence

seq([]) -->
   [].
seq([E|Es]) -->
   [E],
   seq(Es).

With this non-terminal, we might describe a sequence that is repeated and separated by one character:

rep(Seq, Sep) --> 
   seq(Seq),
   [Sep],
   seq(Seq).

That is clearly too general. You only wanted ab and c. You may now add further requirements:

rep(Seq, Sep) -->
   seq(Seq),
   {phrase(abs,Seq)},
   [Sep],
   seq(Seq).

 abs --> [] | ("a"|"b"), abs.

So now:

 s -->
     rep(_,c).

The alternative is to "hard code" the grammar, as @CapelliC has shown. Using seq//1 makes the approach a bit more flexible.

It is quite convenient to use double quotes for list of characters. See this answer how to permit to use double-quotes to represent a list of chars.

Community
  • 1
  • 1
false
  • 10,264
  • 13
  • 101
  • 209
  • This design also nicely generates solutions via `phrase(s, L).` – lurker Feb 11 '14 at 17:31
  • @mbratch: This is more luck than intention. The canonical way to see all solutions is `?- length(L, N), phrase(s, L).`. Now, they also sorted by length. See, http://stackoverflow.com/questions/6524722/prolog-combining-dcg-grammars-with-other-restrictions/6525089#6525089 – false Feb 11 '14 at 17:36
  • Lefty Gomez said, "I'd rather be lucky than good." ;) But regardless, it's a nice serendipity that the above code generates the "a, b" sequences in order of length and completes each length's set of permutations before moving on to the next length. I'm still learning a lot of the nuances of DCGs, so this and @CapelliC's post are both very instructive to me. – lurker Feb 11 '14 at 17:49
  • 3
    @mbratch: There are two different notions here: termination (meaning universal termination) and fair enumeration. In case of conflict, always go for termination. A DCG that terminates for all fixed lengths can be trivially used to enumerate all sentences. But the inverse is not true. – false Feb 11 '14 at 17:53