1

I have a source file that has information embedded in comments. For example:

//IP x = 3
//IP y = 20

Normally, "//" starts a comment, but "//IP" indicates setup information.

How can I parse these comments to get the values for x and y?

I could use lexer rules like the following, but I'm not sure about the action part. Can I extract the BASIC_IDENTIFIER values?

BASIC_IDENTIFIER
   :    ('a'..'z' | 'A'..'Z') ( '_' |  ('a'..'z' | 'A'..'Z') |  ('0'..'9') )*
   ;

IP_COMMENT
  : '//IP' (BASIC_IDENTIFIER\s?'='\s?BASIC_IDENTIFIER) ( ~'\n' )* {???}
  ;  

COMMENT
  : '//' ( ~'\n' )* {$channel=HIDDEN;}
  ;
prosseek
  • Why not create a `--IP` token and use a parser rule to match such doc-comments? Creating a single token from `"//IP x = 3"` will make it harder to extract information from it at a later stage. – Bart Kiers Jan 01 '12 at 19:00

2 Answers

2

Header and Member

// START:members
@header {
using System.Collections.Generic;
}

@members {
public static Dictionary<string, string> memory = new Dictionary<string, string>();
}

Grammar Rule Change

DECIMAL_LITERAL
   :    ('0'..'9') ( '_' |  ('0'..'9') )* ( ( '.' ('0'..'9') ( '_' |  ('0'..'9') )* )? ( EXPONENT )? )
   ;

BASIC_IDENTIFIER
   :    ('a'..'z' | 'A'..'Z') ( '_' |  ('a'..'z' | 'A'..'Z') |  ('0'..'9') )*
   ;

IP_COMMENT
  : '--IP' (' ')+ (id = BASIC_IDENTIFIER) (' ')* '=' (' ')* (val = DECIMAL_LITERAL | val = BASIC_IDENTIFIER) ( ~'\n' )* {VHDLParser.memory[$id.text] = $val.text; $channel=HIDDEN;}
  ;  

COMMENT
  : '--' ( ~'\n' )* {$channel=HIDDEN;}
  ;

Now the parsed values are stored in the dictionary, so you can read the key/value pairs:

foreach (KeyValuePair<string, string> kvp in VHDLParser.memory)
{
    Console.WriteLine("{0} - {1}", kvp.Key, kvp.Value);
}
prosseek
  • @Bart: With the "--IP", "--IP 3 =", and "--IP x =" examples, everything seems to work fine. – prosseek Dec 31 '11 at 18:57
  • indeed: I tested it with v3.3, and you're right: it works just fine! I'm pretty sure this used to cause problems with some older v3 version. Good to know! :) – Bart Kiers Jan 01 '12 at 18:56
0

You need to prioritize your tokens in the lexer. This answer has a discussion of that. So give "//IP" a higher priority than "//".
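
A minimal sketch of that prioritization, assuming an ANTLR 3 lexer in which the rule listed first is preferred when two rules can match the same text. The simplified IP_COMMENT body is illustrative only and does not extract the key or value:

```antlr
// More specific rule first, so that "//IP ..." is not swallowed by COMMENT.
IP_COMMENT
  : '//IP' ( ~'\n' )*
  ;

// General comment rule second; sent to the hidden channel as in the question.
COMMENT
  : '//' ( ~'\n' )* {$channel=HIDDEN;}
  ;
```

Whether rule order alone is enough may depend on the ANTLR version, so it is worth checking the tool's warnings for these two rules.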

Francis Upton IV