4

Hi I am trying to implement a parser for a simple language with grammar like this.

program ::= "program" declarations "begin" statements "end"
declaration ::= "var" ident "as" type
type ::= "string" | "int"

I have the first two done, how would I write the type grammar?

program( prog( DECLS, STATS ) ) -->
[ 'program' ], declarations( DECLS ),
[ 'begin' ], statements( STATS ), [ 'end' ].

declaration( decl( IDENT, TYPE ) ) -->
[ 'var' ], ident( IDENT ), [ 'as' ], type( TYPE ).
Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • when I change the grammar with the above rules I will be using SICStus to test it..I am just editing the grammar in wordpad with a .sp file – user1794576 Nov 02 '12 at 14:48

3 Answers3

2

Your grammar might be underspecified. In fact you do not define how keywords are separated from other tokens like identifiers. There are programming languages, where you do not need to separate keywords from identifiers. And there are other programming languages where some whitespace or layout character is needed.

In your case, is "varaasint" a valid declaration? Your grammar suggests it. Or do you have to write "var a as int".

You might want to look into this answer for more.

Community
  • 1
  • 1
false
  • 10,264
  • 13
  • 101
  • 209
1
type(string) --> ['string'].
type(int) --> ['int'].

(actually ' are not required)

You could use | or ; but that would complicate the way you return the type you found.

Thanos Tintinidis
  • 5,828
  • 1
  • 20
  • 31
  • I am also trying to do this assign_stmt ::= ident operator expr Can you help? – user1794576 Nov 05 '12 at 15:29
  • @user1794576 sounds pretty straightforward; `assign_stmt(...):- ident(...), operator(...), expr(...).` for the expressions you should remove left recursion from your grammar: https://en.wikipedia.org/wiki/Left_recursion#Removing_left_recursion – Thanos Tintinidis Nov 05 '12 at 16:23
1

You miss statements rule!

Anyway, DCG rules are just plain syntax sugar on Prolog, then you can use any Prolog feature you like. If you need to keep the grammar compact:

type(type(T)) --> [T], {memberchk(T, [int, string])}.

Braces allow to mix generic Prolog with grammar rules.

As @false noted, your grammar is useful only if you have a tokenizer, that splits your input and discards whitespaces. Or you could handle that more directly, using this schema (beware, untested code):

program( prog( DECLS, STATS ) ) -->
   s, "program", s, declarations( DECLS ),
   s, "begin", s, statements( STATS ), s, "end", s.

declaration( decl( IDENT, TYPE ) ) -->
   "var", s, ident( IDENT ), s, "as", s, type( TYPE ).

declarations([D|Ds]) --> declaration(D), declarations(Ds).
declarations([]) --> [].

type(type(int)) --> "int".
type(type(string)) --> "string".

% skip 1 or more whitespace
s --> (" " ; "\n"), (s ; []).
CapelliC
  • 59,646
  • 5
  • 47
  • 90