Here is a grammar that almost does what you want:
grammar PrintLang;

sentence
    : statement
    ;

statement
    : functionCall '(' argument ')' ';'
      {
        if ($functionCall.funName.equals("printf")) {
            System.out.println($argument.arg);
        }
      }
    ;

functionCall returns [String funName]
    : ID
      { $funName = $ID.text; }
    ;

argument returns [String arg]
    : STRING
      { $arg = $STRING.text; }
    ;

ID  : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;

WS  : ( ' '
      | '\t'
      | '\r'
      | '\n'
      ) {$channel=HIDDEN;}
    ;

STRING
    : '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
ESC_SEQ
    : '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    | UNICODE_ESC
    | OCTAL_ESC
    ;

fragment
OCTAL_ESC
    : '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    | '\\' ('0'..'7') ('0'..'7')
    | '\\' ('0'..'7')
    ;

fragment
UNICODE_ESC
    : '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
    ;
I generated this in ANTLRWorks; all of the token (lexer) rules were generated for me.
Here is the Java file to test it:
import org.antlr.runtime.*;

public class PrintIt {
    public static void main(String[] args) {
        String inputString = "printf(\"HelloWorld\");";
        // Create an input character stream from standard in
        ANTLRStringStream input = new ANTLRStringStream(inputString);
        // Create an ExprLexer that feeds from that stream
        PrintLangLexer lexer = new PrintLangLexer(input);
        // Create a stream of tokens fed by the lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        // Create a parser that feeds off the token stream
        PrintLangParser plParser = new PrintLangParser(tokens);
        try {
            plParser.sentence();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Note that this Java code is an almost verbatim copy/paste of the example on the ANTLR website (I don't believe I even changed the comments, which is why one comment refers to standard in while the code actually reads from a String). Here are the commands I used to generate the parser, compile everything, and run it:
bash$ java -cp ./antlr-3.4-complete.jar org.antlr.Tool PrintLang.g
bash$ javac -cp ./:./antlr-3.4-complete.jar PrintIt.java
bash$ java -cp antlr-3.4-complete.jar:. PrintIt
"HelloWorld"
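Oops, I forgot that the string I wanted to print isn't the matched token ("HelloWorld", including the quotes); it's the string within the quotes. The action in the argument rule needs to strip the surrounding quotes from the STRING token's text.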
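One way to strip the quotes is sketched below in plain Java. The class and method names (QuoteStripper, unquote) are made up for illustration and are not part of ANTLR or of the grammar above. It only removes the surrounding double quotes; it does not decode escape sequences such as \n or \u0041, which you would have to handle separately.

```java
// Hypothetical helper (not part of ANTLR): strips the surrounding double
// quotes from the text of a STRING token such as "HelloWorld".
final class QuoteStripper {

    // Returns the text between the first and last character when the input
    // is wrapped in double quotes; otherwise returns the input unchanged.
    // Escape sequences inside the string are left as-is.
    static String unquote(String tokenText) {
        if (tokenText != null
                && tokenText.length() >= 2
                && tokenText.startsWith("\"")
                && tokenText.endsWith("\"")) {
            return tokenText.substring(1, tokenText.length() - 1);
        }
        return tokenText;
    }

    public static void main(String[] args) {
        // The STRING token text as the lexer would deliver it, quotes included.
        System.out.println(unquote("\"HelloWorld\""));
    }
}
```

Assuming this class is on the classpath next to the generated parser, the action in the argument rule could call it on $STRING.text instead of assigning the raw token text to $arg.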
Also note that I hardcoded the lookup of printf as a string comparison. In a real implementation, you'll want an environment that contains the symbols accessible at a given scope. ANTLR's "scope" construct is related. A more difficult, though sometimes useful, alternative is to create an environment that you pass to each parsing rule.
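As a rough illustration of the environment idea, here is a minimal sketch of a scoped symbol table in Java. Everything in it is an assumption made for the example: the Environment class, its define and lookup methods, and the choice to model a function as a Consumer<String> that takes one string argument. It requires Java 8 or later for the lambda and does not use ANTLR's own scope mechanism.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical environment: maps function names to implementations and
// falls back to the enclosing scope when a name is not defined locally.
final class Environment {
    private final Environment parent; // null for the outermost (global) scope
    private final Map<String, Consumer<String>> functions = new HashMap<>();

    Environment(Environment parent) {
        this.parent = parent;
    }

    // Registers a function in this scope, shadowing any outer definition.
    void define(String name, Consumer<String> body) {
        functions.put(name, body);
    }

    // Walks from this scope outwards; returns null if the name is unknown.
    Consumer<String> lookup(String name) {
        for (Environment env = this; env != null; env = env.parent) {
            Consumer<String> body = env.functions.get(name);
            if (body != null) {
                return body;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Environment global = new Environment(null);
        global.define("printf", s -> System.out.println(s));

        // A nested scope still sees printf through its parent.
        Environment inner = new Environment(global);
        Consumer<String> fn = inner.lookup("printf");
        if (fn != null) {
            fn.accept("HelloWorld");
        } else {
            System.err.println("unknown function: printf");
        }
    }
}
```

With something like this in place, the action in the statement rule would look up $functionCall.funName in the current environment and invoke the result, rather than comparing against "printf" directly.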
Most important: search SO for Bart Kiers's answers to other ANTLR questions. He posts excellent examples.