Parsing grammar using DCG [SICStus]

Question

So far,a wide syntax that I have to parse it (in order to create a Syntax Analyzer), the problem is that I got redundancy somewhere in code, but I dont know where is it.

part of Grammar ;

Grammar_types

Type :: =     Basic_Type
            | "PP" "(" Type ")"
            | Type "*" Type
            | "struct" "("    (Ident ":" Type)+","    ")"
            | "(" Type ")" .


Basic_Type :: = "ZZ"| "BOOL" | "STRING" | Ident .

I try to analyze this gramar without DCG , example to parse Id :: = Id ((Id) * ",") *

Example_1

"id","id_0(id1,id2,..)"

Code_1

Entete_ (ID, Id, Ids) - atom_concat(XY,')', ID), 
                        atom_concat(XX,Ids, XY),check_ids(Ids),
                        atom_concat(Id,'(',XX),check_id(Id) ,!.
...

but during some searches , I found that DCG is one of the most effective parsers, so I come back to got the code below ;

Code_2

type(Type) -->     "struct(", idents_types(Type),")"
                 | "PP(",ident(Type),")"
                 | "(",type(Type),")"
                 | type(Type),"*",type(Type)
                 | basic_type(Type)
                 | "error1_type".

...

Example_Syntaxe ;

"ZZ" ; "PP(ZZ*STRING)" ; "struct(x:personne,struct(y:PP(PP))" ; "ZZ*ZZ" ...

Test

| ?- phrase(type(L),"struct(aa:struct())").
! Resource error: insufficient memory
% source_info

I think that the problem over here (idents_types)

| ?- phrase(idents_types(L),"struct(aa:STRING)").
    ! Resource error: insufficient memory

Expected result

     | ?- type('ZZ*struct(p1:STRING,p2:PP(STRING),p3:(BOOL*PP(STRING)),p4:PP(personne*BOOL))').
p1-STRING
STRING
p2-PP(STRING)
STRING
p3-(BOOL*PP(STRING))
STRING
BOOL
p4-PP(personne*BOOL)
BOOL
personne
ZZ
yes

So my question is, why am I receiving this error of redundancy , and how can I fix it?

Please always include a usable **example query**, the system's current answer, and the **expected** answer. — mat, Feb 14 '16 at 11:57

score 2 · Accepted Answer · edited May 23 '17 at 12:23

You have a left recursion on type//1.

type(Type) --> ... | type(Type),"*",type(Type) | ...

You can look into this question for further information.

Top down parsers, from which DCGs borrow, must have a mean to lookahead a symbol that drives the analysis in right direction.

The usual solution to this problem, as indicated from the link above, is to introduce a service nonterminal that left associate recursive applications of the culprit rule, or is epsilon (terminate the recursion).

An epsilon rule is written like

rule --> [].

The transformation can require a fairly bit of thinking... In this answer I suggest a bottom up alternative implementation, that could be worthy if the grammar cannot be transformed, for practical or theoric problems (LR grammars are more general than LL).

You may want to try this simple minded transformation, but for sure it leaves several details to be resolved.

type([Type,T]) --> "struct(", idents_types(Type),")", type_1(T)
                 | "PP(",ident(Type),")", type_1(T)
                 | "(",type(Type),")", type_1(T)
                 | basic_type(Type), type_1(T)
                 | "error1_type", type_1(T).
type_1([Type1,Type2,T]) --> type(Type1),"*",type(Type2), type_1(T).
type_1([]) --> [].

edit

I fixed several problems, in both your and mine code. Now it parses the example...

type([Type,T]) --> "struct(", idents_types(Type), ")", type_1(T)
                 | "PP(", type(Type), ")", type_1(T)
                 | "(", type(Type), ")", type_1(T)
                 | basic_type(Type), type_1(T)
                 | "error1_type", type_1(T).

% my mistake here...
type_1([]) --> [].
type_1([Type,T]) --> "*",type(Type), type_1(T).

% the output Type was unbound on ZZ,etc
basic_type('ZZ') --> "ZZ".
basic_type('BOOL') --> "BOOL".
basic_type('STRING') --> "STRING".
basic_type(Type) --> ident(Type).

% here would be better to factorize ident(Ident),":",type(Type)
idents_types([Ident,Type|Ids]) -->   ident(Ident),":",type(Type),",",
                                     idents_types(Ids).
idents_types([Ident,Type])     -->   ident(Ident),":",type(Type).
idents_types([])               -->   []. 

% ident//1 forgot to 'eat' a character
ident(Id) --> [C], { between(0'a,0'z,C),C\=0'_},ident_1(Cs),{ atom_codes(Id,[C|Cs]),last(Cs,L),L\=0'_}.
ident_1([C|Cs]) --> [C], { between(0'a,0'z,C);between(0'0,0'9,C);C=0'_ },
                    ident_1(Cs).
ident_1([])     --> [].

thnks, it means that I must to change the DCG by other LR tool or add eps -> [] for all predicates and added : - op (N, xfx, --->) ?? — Ans Piter, Feb 14 '16 at 13:51
Well, I mixed several topics in my answer, guess some of them could be too much general. I would suggest to try my last snippet (it's an implementation of the general transformation for direct left recursive rules you find in Wikipedia), and, if working for you, adapt the syntax tree construction. — CapelliC, Feb 14 '16 at 13:54
it's the service predicate... it's called A' on Wikipedia page, while A would be type//1 — CapelliC, Feb 14 '16 at 14:02
it's work now but without DCG :( ,`| ?- type('ZZ*struct(p1:STRING,p2:PP(STRING),p3:(BOOL*PP(STRING)))'). p1-STRING STRING p2-PP(STRING) STRING p3-(BOOL*PP(STRING)) STRING BOOL ZZ yes` — Ans Piter, Feb 15 '16 at 16:51
`type_1` still my prb, suppose that we add another type like "`Type \/ Type` or `Type /\ Type`" ? — Ans Piter, Feb 15 '16 at 19:47
both should be possible, only left recursion is problematic. — CapelliC, Feb 15 '16 at 19:49
we replace * by \ / or / \! like `type 1 (Type [T]) -> "\ /" type (Type), type_1 (T). type_1 (Type [T]) -> "/ \" type (Type), type_1 (T).` — Ans Piter, Feb 15 '16 at 19:50

Parsing grammar using DCG [SICStus]

1 Answers1