I've been playing around with ANTLR to do a kind of Excel formula validation. ANTLR looks pretty nice; however, I have some doubts about the way it works.
Imagine I have a grammar that already knows about all the kinds of tokens needed to validate an Excel formula (rules references, operations, etc.). In this grammar, there is no valid token for currency symbols (€, £, etc.), though I have an 'ERROR_CHAR' token that matches any single character: ERROR_CHAR: .;
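To make the setup concrete, here is a stripped-down sketch of the kind of grammar I mean. It is not my real grammar: the grammar name and every rule name other than ERROR_CHAR are simplified placeholders chosen only for illustration.

```antlr
grammar Formula;  // hypothetical name, for illustration only

// Parser rules (simplified placeholders, not the real grammar)
formula : '=' expr EOF ;

expr
    : expr ('*' | '/') expr
    | expr ('+' | '-') expr
    | ('+' | '-') expr                       // unary sign, so +SUM(1,2) parses
    | IDENT '(' (expr (',' expr)*)? ')'      // function call such as SUM(1,2)
    | NUMBER
    ;

// Lexer rules
NUMBER : [0-9]+ ('.' [0-9]+)? ;
IDENT  : [A-Za-z]+ ;
WS     : [ \t\r\n]+ -> skip ;                // whitespace is already skipped

// Catch-all, declared last: any single character no other rule matches,
// which is where the currency symbols end up
ERROR_CHAR : . ;
```

With a lexer like this, each € becomes its own ERROR_CHAR token, so the parser receives three consecutive ERROR_CHAR tokens that none of the parser rules mention.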
Here's what I want to be able to tell about an example input: =€€€+SUM(1,2)
- The formula is not valid
- All the tokens after €€€ are valid and there are rules for them: +SUM(1,2)
My parser only knows that a single € is invalid; it doesn't know about a sequence of ERROR_CHAR tokens such as €€€, so the whole input is treated as wrong and all subsequent tokens are caught by the error listener. I assume this is because my parser rules never say that ERROR_CHAR can appear anywhere in the input.
I don't want to skip those tokens, because I'd like to highlight the position of the error, and I am already skipping whitespace.
Do you have any idea how I could handle this?