I'm working with a jison file and converting its lexer rules to Python using the lex module from PLY.
I've noticed that in this jison file, certain tokens have multiple rules associated with them. For example, the file specifies the following three rules for the token CONTENT:
[^\x00]*?/("{{") {
    if(yytext.slice(-2) === "\\\\") {
        strip(0,1);
        this.begin("mu");
    } else if(yytext.slice(-1) === "\\") {
        strip(0,1);
        this.begin("emu");
    } else {
        this.begin("mu");
    }
    if(yytext) return 'CONTENT';
}
[^\x00]+ return 'CONTENT';
// marks CONTENT up to the next mustache or escaped mustache
<emu>[^\x00]{2,}?/("{{"|"\\{{"|"\\\\{{"|<<EOF>>) {
    this.popState();
    return 'CONTENT';
}
In another case, there are multiple rules for the COMMENT token:
<com>[\s\S]*?"--}}" strip(0,4); this.popState(); return 'COMMENT';
<mu>"{{!--" this.popState(); this.begin('com');
<mu>"{{!"[\s\S]*?"}}" strip(3,5); this.popState(); return 'COMMENT';
It seems easy enough to distinguish the rules when they apply to different states, but what about when they apply to the same state?
How can I translate these jison rules into Python rules using ply.lex?
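For what it's worth, the regex and action of the first CONTENT rule seem to carry over fairly directly. Below is a minimal sketch of my attempt so far, using only the standard `re` module rather than PLY itself. It assumes that jison's trailing context `/("{{")` can be written as a Python lookahead `(?=\{\{)`, and that `strip(0,1)` simply drops the last character of `yytext`. The names `CONTENT_BEFORE_MUSTACHE` and `match_content` are made up for illustration and are not part of jison or PLY.

```python
import re

# Assumption: jison's trailing context  X/("{{")  becomes a lookahead in Python.
CONTENT_BEFORE_MUSTACHE = re.compile(r'[^\x00]*?(?=\{\{)')


def match_content(text, pos=0):
    """Mimic the action of the first CONTENT rule (hypothetical helper).

    Returns (value, next_state, end_pos), where value is None when the
    matched text is empty and no CONTENT token should be emitted.
    Returns None if the pattern does not match at pos.
    """
    m = CONTENT_BEFORE_MUSTACHE.match(text, pos)
    if m is None:
        return None
    value = m.group()
    if value.endswith('\\\\'):      # yytext.slice(-2) === "\\\\"
        value = value[:-1]          # strip(0,1): drop the last character
        state = 'mu'
    elif value.endswith('\\'):      # yytext.slice(-1) === "\\"
        value = value[:-1]
        state = 'emu'
    else:
        state = 'mu'
    return (value or None, state, m.end())
```

In ply.lex the pattern would presumably go in the docstring of a rule function and the state changes would be made on `t.lexer`, but that still leaves open how a second rule for the same token in the same state should be declared.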
Edit: In case it helps, this jison file is part of the handlebars.js source code. See: https://github.com/wycats/handlebars.js/blob/master/src/handlebars.l