How to detect beginning of line, or: "The name 'getCharPositionInLine' does not exist in the current context"

Question

I'm trying to create a Beginning-Of-Line token:

lexer grammar ScriptLexer;

BOL : {getCharPositionInLine() == 0;}; // Beginning Of Line token

But the above emits the error

The name 'getCharPositionInLine' does not exist in the current context

As it creates this code:

private void BOL_action(RuleContext _localctx, int actionIndex) {
    switch (actionIndex) {
    case 0: getCharPositionInLine() == 0; break;
    }
}

Where the getCharPositionInLine() method doesn't exist...

Maybe try `GetCharPositionInLine()` (PascalCase as recommended by various C# code guidelines) — knittl, Aug 09 '15 at 12:09
@knittl, tried that. No method with a name that is even similar to that... — Tar, Aug 09 '15 at 12:22
Have a look at the lexer class: https://github.com/antlr/antlr4-csharp/blob/master/runtime/CSharp/Antlr4.Runtime/Lexer.cs There is a `charPositionInLine` in there, but I'm not really familiar with C# to post an answer (hence this comment). — Bart Kiers, Aug 09 '15 at 13:53
@knittl C# has properties in the language, so you won't see many getter functions in C# code :-) The solution here is to use the `Column` property, so `fragment BOL : { Column == 0 } ;` (or `== 1`, dunno) should probably work (I don't think it makes sense to have an empty lexer rule, hence the `fragment`). — Lucas Trzesniewski, Aug 09 '15 at 19:49
@LucasTrzesniewski - that was it. Please post an answer so I can accept it — Tar, Aug 10 '15 at 12:41
If anybody is looking for Typescript property it's `this.charPositionInLine === 0;` where `this` refers to Lexer superclass. — K.Novichikhin, Oct 21 '20 at 17:18

GRosenberg · Accepted Answer · 2015-08-12T23:02:00.197

The simplest approach is to treat each end-of-line (EOL) sequence as the BOL token for the line that follows it.

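BC  : '/*' .*? '*/' -> channel(HIDDEN) ; // block comment; swallows any EOLs inside it
LC  : '//' ~[\r\n]* -> channel(HIDDEN) ; // line comment; stops before the EOL
HWS : [ \t]+        -> channel(HIDDEN) ; // horizontal whitespace; '+' so the rule cannot match the empty string
BOL : [\r\n\f]+ ;                        // one or more line terminators, emitted as a single BOL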

A rule like the block comment consumes any EOLs inside the comment, so those never produce a stray BOL. A rule like the line comment stops before the EOL, so a proper BOL is emitted for the line immediately following.

A potential problem is that no BOL will be emitted for the beginning of the input. The simplest way to handle this is to prefix the input text with a line terminator before feeding it to the lexer.
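If prefixing the input is inconvenient, a predicate-based variant may also work. This is only a sketch built on the `Column` property suggested in the comments: it assumes `Column` is zero-based in the C# runtime (as `getCharPositionInLine()` is in Java), and the `LINE_LABEL`, `NL` and `WS` rules are hypothetical names that are not part of the original grammar. Note the `?` after the braces: without it the braces form an action rather than a semantic predicate, which seems to be why the code in the question ended up in `BOL_action`.

```antlr
lexer grammar ScriptLexer;

// Hypothetical token that may only start in the first column of a line.
// Assumption: Column is the 0-based character position within the current line.
// The trailing '?' turns the braces into a semantic predicate instead of an action.
LINE_LABEL : { Column == 0 }? [a-zA-Z_] [a-zA-Z0-9_]* ':' ;

// Line terminators and horizontal whitespace are hidden from the parser here.
NL : [\r\n\f]+ -> channel(HIDDEN) ;
WS : [ \t]+    -> channel(HIDDEN) ;
```

Because the predicate should also hold at the very first character of the input, this variant would not need the leading line terminator. The trade-off is that the predicate has to be repeated on every rule that must start at the beginning of a line.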

edited Aug 12 '15 at 23:02

answered Aug 10 '15 at 00:57

Excellent answer, it helped me with a similar question (I got here via https://stackoverflow.com/q/32870858/1112244). I will add that if you don't route `BOL` to a hidden channel, you will have to include it in your parser everywhere you expect to encounter those characters. In my case, I use a separate lexer and parser, and I defined in my lexer the token that had to appear at the beginning of the line (it is a line label). My parser rules are not EOL-delimited otherwise, so I routed `BOL` to a hidden channel in order to avoid adding it as a parser rule. – Peter Nov 07 '17 at 04:06
