How can all keywords in a grammar be set to accept upper and lower case in an existing grammar?

Question

I have a grammar that works, except the keywords must be upper case. Is there a way to shotgun all the keywords such that lower case equivalents will not be rejected? If not, how do I affect each of them individually?

https://gist.github.com/sharwell/9424666 ∆ https://stackoverflow.com/questions/1844562/how-to-match-a-string-but-case-insensitively ∆ both links come from a googke search for `antlr case insensitive keywords` (2nd and 3rd hit, respectively) — rici, Dec 14 '17 at 18:25
gist.github.com/sharwell/9424666 uses deprecated code not compatible with ANTLR4 runtime 4.7. — nicomp, Dec 14 '17 at 19:19
if we're lucky [@Sam Harwell](https://stackoverflow.com/users/138304/sam-harwell) will come by and fix it :) — rici, Dec 14 '17 at 20:48

Mike Lischke · Answer 1 · 2017-12-17T19:03:02.913

2

I don't recommend input streams that convert case to make the keyword recognition case-insensitive. Such a stream will convert everything, strings, comments etc. even though that is a total waste of CPU cycles. A better approach is to tell explicitly in your grammar that you want (only) certain keywords to be case sensitive. The grammar is trivial:

fragment A: [aA];
fragment B: [bB];
...
fragment Z: [zZ];

KEYWORD1: K E Y W O R D '1';
...

The ATN for these rules is only marginally more complex (using 2 intervals instead of one for each letter, which is (in total) faster than a case conversion):

and as example the letter S:

Each node is a step the ATN simulator has to walk to parse a rule. Edge labels are symbols to match to allow this transition (with ɛ being the epsilon transition, i.e. an unconditional step without input consumption).

edited Dec 17 '17 at 19:03

answered Dec 15 '17 at 08:27

Mike Lischke

48,925
16
119
181

`using 2 intervals instead of one for each letter, which is much faster than a case conversion`. Did you make performance benchmarks for approving? – Ivan Kochurkin Dec 16 '17 at 13:44
No, because I think it's obvious. The actual interval test and a case lookup are probably not very different. But that's not all. Then you have to add all the case conversions such a special stream does and that can sum up. However, for small grammars we are probably talking about milliseconds here, still. Now when you consider also the obscurity to make a grammar work by requiring it to have a special input stream and showing no indication that it is case-insensitive, makes enough points for me to believe that the stream approach is the worse idea. Why using the inferior solution? – Mike Lischke Dec 16 '17 at 15:43
Ok, I'm going to make performance tests by myself in order to make a final conclusion. – Ivan Kochurkin Dec 17 '17 at 12:09
Keep in mind it's not only about performance. I hoped to make that clear with my comment. – Mike Lischke Dec 17 '17 at 12:53
Maybe later we'll revert back fragment rules in SQL grammars (T-SQL, PL/SQL, MySql). – Ivan Kochurkin Dec 17 '17 at 13:12

How can all keywords in a grammar be set to accept upper and lower case in an existing grammar?

1 Answers1