I'm having an issue where, if my parser finds a token that it cannot place in any rule, it ends without explicitly reporting an error, even though there are more tokens left to place afterward. To be exact, the token is recognized (I have a rule that is an almost catch-all rule), but the token is misplaced and can't be covered by any rule. In this case, my parser ends successfully without reporting any errors (at least not out loud).
This is the case I'm seeing. The code to parse:
.class public final Ld;
.super Ljava/lang/Object;
.source "java-style lambda group"
# interfaces
.implements Landroid/content/DialogInterface$OnClickListener;
<misplaced-tokens>
# static fields
.field public static final f:Ld;
.field public static final g:Ld;
...
(Note the <misplaced-tokens> token, which is actually five tokens; see below. I'm expecting parsing to error out here.)
Parsed tokens:
[@0,0:5='.class',<'.class'>,1:0]
[@1,7:12='public',<'public'>,1:7]
[@2,14:18='final',<'final'>,1:14]
[@3,20:22='Ld;',<QUALIFIED_TYPE_NAME>,1:20]
[@4,24:29='.super',<'.super'>,2:0]
[@5,31:48='Ljava/lang/Object;',<QUALIFIED_TYPE_NAME>,2:7]
[@6,50:56='.source',<'.source'>,3:0]
[@7,58:82='"java-style lambda group"',<STRING_LITERAL>,3:8]
[@8,85:96='# interfaces',<LINE_COMMENT>,channel=1,5:0]
[@9,98:108='.implements',<'.implements'>,6:0]
[@10,110:158='Landroid/content/DialogInterface$OnClickListener;',<QUALIFIED_TYPE_NAME>,6:12]
[@11,160:160='<',<'<'>,7:0]
[@12,161:169='misplaced',<IDENTIFIER>,7:1]
[@13,170:170='-',<'-'>,7:10]
[@14,171:176='tokens',<IDENTIFIER>,7:11]
[@15,177:177='>',<'>'>,7:17]
[@16,180:194='# static fields',<LINE_COMMENT>,channel=1,9:0]
[@17,196:201='.field',<'.field'>,10:0]
...
Parsing progress:
enter parse, LT(1)=.class
enter statement, LT(1)=.class
enter classDirective, LT(1)=.class
consume [@0,0:5='.class',<30>,1:0] rule classDirective
enter classModifier, LT(1)=public
consume [@1,7:12='public',<53>,1:7] rule classModifier
exit classModifier, LT(1)=final
enter classModifier, LT(1)=final
consume [@2,14:18='final',<56>,1:14] rule classModifier
exit classModifier, LT(1)=Ld;
enter className, LT(1)=Ld;
enter referenceType, LT(1)=Ld;
consume [@3,20:22='Ld;',<1>,1:20] rule referenceType
exit referenceType, LT(1)=.super
exit className, LT(1)=.super
exit classDirective, LT(1)=.super
exit statement, LT(1)=.super
enter statement, LT(1)=.super
enter superDirective, LT(1)=.super
consume [@4,24:29='.super',<33>,2:0] rule superDirective
enter superName, LT(1)=Ljava/lang/Object;
enter referenceType, LT(1)=Ljava/lang/Object;
consume [@5,31:48='Ljava/lang/Object;',<1>,2:7] rule referenceType
exit referenceType, LT(1)=.source
exit superName, LT(1)=.source
exit superDirective, LT(1)=.source
exit statement, LT(1)=.source
enter statement, LT(1)=.source
enter sourceDirective, LT(1)=.source
consume [@6,50:56='.source',<32>,3:0] rule sourceDirective
enter sourceName, LT(1)="java-style lambda group"
enter stringLiteral, LT(1)="java-style lambda group"
consume [@7,58:82='"java-style lambda group"',<304>,3:8] rule stringLiteral
exit stringLiteral, LT(1)=.implements
exit sourceName, LT(1)=.implements
exit sourceDirective, LT(1)=.implements
exit statement, LT(1)=.implements
enter statement, LT(1)=.implements
enter implementsDirective, LT(1)=.implements
consume [@9,98:108='.implements',<31>,6:0] rule implementsDirective
enter implementsName, LT(1)=Landroid/content/DialogInterface$OnClickListener;
enter referenceType, LT(1)=Landroid/content/DialogInterface$OnClickListener;
consume [@10,110:158='Landroid/content/DialogInterface$OnClickListener;',<1>,6:12] rule referenceType
exit referenceType, LT(1)=<
exit implementsName, LT(1)=<
exit implementsDirective, LT(1)=<
exit statement, LT(1)=<
exit parse, LT(1)=<
(Observe that parse is the main rule and is exited here, even though many more tokens remain in the stream.)
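One way to make this early exit visible, independent of any listener, would be to check after the parse whether the next token in the stream is EOF. If the grammar's parse rule does not end with EOF (the grammar is not shown here, so this is only a guess), ANTLR is allowed to stop at the first token it cannot match without reporting anything. The sketch below is an illustration under those assumptions: `assert_all_input_consumed` is a hypothetical helper, not part of the ANTLR runtime, and it relies only on the `LA(1)`/`LT(1)` methods of the token stream and the `line`, `column`, and `text` attributes of tokens. The `EOF = -1` constant mirrors `Token.EOF` in the Python runtime.

```python
EOF = -1  # same value as antlr4.Token.EOF in the Python runtime


def assert_all_input_consumed(token_stream):
    """Raise SyntaxError if the parser stopped before reaching EOF.

    `token_stream` is expected to behave like antlr4.CommonTokenStream:
    LA(1) returns the type of the next unconsumed token, LT(1) the token itself.
    """
    if token_stream.LA(1) == EOF:
        return  # everything was consumed
    tok = token_stream.LT(1)
    raise SyntaxError(
        "parser stopped early at line %d:%d, before %r"
        % (tok.line, tok.column, tok.text)
    )
```

Called as `assert_all_input_consumed(stream)` right after `tree = parser.parse()`, this should raise for the input above, since the trace shows LT(1) is still `<` when parse exits.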
What I tried:
I tried reimplementing the default error strategy and error listener and added both to both the lexer and the parser, just to see whether any breakpoints would get hit. No breakpoint in any of the overridden methods is ever hit (except, sometimes, reportAttemptingFullContext).
This is how I added the overrides:
def parseFile(self, filePath):
    errorListener = MyErrorListener()
    strategy = MyErrorStrategy()
    file = FileStream("file.smali")
    lexer = SmaliLexer(file)
    lexer.removeErrorListeners()
    lexer.addErrorListener(errorListener)
    lexer.addErrorListener(strategy)
    stream = CommonTokenStream(lexer)
    parser = SmaliParser(stream)
    parser.removeErrorListeners()
    parser.addErrorListener(errorListener)
    parser.addErrorListener(strategy)
    tree = parser.parse()
    ...
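For comparison, here is a minimal sketch of a listener that records syntax errors instead of printing them. It is written as a plain class so that it runs without the runtime installed; in real use it would presumably subclass `antlr4.error.ErrorListener.ErrorListener`, whose method signatures it mirrors. `CollectingErrorListener` is a hypothetical name, and the call at the bottom only simulates what a parser would do. Separately, the Python runtime appears to keep its error strategy in the parser's `_errHandler` attribute, so a strategy passed to `addErrorListener` may simply be treated as one more listener; that is an assumption to verify against the runtime source.

```python
class CollectingErrorListener:
    """Records syntax errors; mirrors the ErrorListener method signatures."""

    def __init__(self):
        self.errors = []

    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        # Invoked once per reported syntax error.
        self.errors.append((line, column, msg))

    # The three diagnostic callbacks are deliberately no-ops in this sketch.
    def reportAmbiguity(self, recognizer, dfa, startIndex, stopIndex,
                        exact, ambigAlts, configs):
        pass

    def reportAttemptingFullContext(self, recognizer, dfa, startIndex,
                                    stopIndex, conflictingAlts, configs):
        pass

    def reportContextSensitivity(self, recognizer, dfa, startIndex,
                                 stopIndex, prediction, configs):
        pass


listener = CollectingErrorListener()
# Simulated callback with a placeholder message, standing in for the parser.
listener.syntaxError(None, None, 7, 0, "example message", None)
print(listener.errors)
```

Collecting errors in a list, rather than printing them, makes it easy to check after parsing whether the listener was invoked at all.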
My setup is as follows:
Windows 10 OS
Python 3.7
Antlr4 v4.8 - antlr-4.8-complete.jar
pip-installed runtime: antlr4_python3_runtime-4.8-py3-none-any.whl
I would really appreciate any help on how to make Antlr4 actually take the overridden listener and strategy into account, so that I can both report the errors for debugging and handle them differently. Thanks!