I'm just starting with ANTLR4. I want to recognize the format below and perform an action depending on the token that was read. This is what I'm trying to produce:
IDENTIFIER: Test1 ([a-zA-Z0-9]{1,10})
{insert 'Test1' in personId column}
CODE: F0101F
- FULL_NAME: FIRST_NAME ([A-Z]+) LAST_NAME ([A-Z]+)
{insert FIRST_NAME.value in firstName column and insert LAST_NAME.value in lastName column}
- ADRESS: DIGIT+ STREET_NAME ([A-Z]+)
{insert STREET_NAME.value in streetName column}
- OTHER_INFORMATION: ([A-Z]+)
{insert OTHER_INFORMATION.value in other column}
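To make the intended actions concrete, here is a rough plain-Java sketch of the mapping from each label to its target column(s). It is independent of ANTLR. The class `ColumnMapper`, the method `toColumns`, and the use of the labels ID, CODE, FULL_NAME, STREET and OTHER (taken from my grammar rather than from the format above) are assumptions made only for illustration. The format above names no column for CODE, so the sketch stores nothing for it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnMapper {

    // Maps one labelled entry to the column(s) it should fill.
    // 'label' is the keyword without the trailing colon; 'values' are the
    // whitespace-separated values that followed it.
    public static Map<String, String> toColumns(String label, List<String> values) {
        Map<String, String> columns = new LinkedHashMap<>();
        switch (label) {
            case "ID":
                columns.put("personId", values.get(0));
                break;
            case "FULL_NAME":
                columns.put("firstName", values.get(0));
                columns.put("lastName", values.get(1));
                break;
            case "STREET":
                // The leading house number is recognized but not stored.
                columns.put("streetName", values.get(values.size() - 1));
                break;
            case "OTHER":
                columns.put("other", values.get(0));
                break;
            case "CODE":
                // No target column is specified for CODE.
                break;
            default:
                throw new IllegalArgumentException("Unknown label: " + label);
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(toColumns("ID", List.of("Test1")));
        System.out.println(toColumns("FULL_NAME", List.of("JOHN", "DOE")));
    }
}
```

In the grammar, I would expect each branch of this switch to become an action attached to the corresponding rule.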
What I did:
prod
    : read_information+
    ;

read_information
    : {getCurrentToken().getType() == ID}? idElement
    | {getCurrentToken().getType() == CODE}? codeElement
    | {getCurrentToken().getType() == FULLNAME}? fullNameElement
    | {getCurrentToken().getType() == STREET}? streetElement
    | {getCurrentToken().getType() == OTHER}? otherElement
    ;

codeElement
    : CODE {getCurrentToken().getText().matches("[A-F0-9]{6}")}? codeInformation
    | {/*throw someException*/}
    ;

codeInformation
    : HEXCODE
    ;

HEXCODE
    : [a-fA-F0-9]+
    ;

CODE
    : 'CODE:'
    ;

otherElement
    : OTHER otherInformation
    ;

otherInformation
    : STR
    ;

OTHER
    : 'OTHER:'
    ;

streetElement
    : STREET streetInformation
    ;

STREET
    : 'STREET:'
    ;

streetInformation
    : STR
    ;

STR
    : [a-zA-Z0-9]+
    ;

WORD
    : [a-zA-Z]+
    ;

fullNameElement
    : FULLNAME firstNameInformation lastNameInformation
    ;

FULLNAME
    : 'FULL_NAME:'
    ;

firstNameInformation
    : WORD
    ;

lastNameInformation
    : WORD
    ;

idElement
    : ID idInformation
    ;

ID
    : 'ID:'
    ;

idInformation
    : {getCurrentToken().getText().length() <= 10}? STR
    ;
I'm not sure if this is the right approach, since I have problems reading the WORD token. Since all the tokens have basically the same format, I'm trying to find a way to keep track of the preceding token or context to resolve the ambiguity, and to check the format at the same time (for example, throw an exception if the value is longer than 10 characters).
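What I mean by keeping track of the preceding token can be sketched in plain Java, outside ANTLR: remember the last keyword seen and use it to pick the expected format for the value that follows. The class `ContextValidator`, its method `accept`, and the exact patterns (taken from my grammar and predicates) are assumptions for illustration only. None of this is ANTLR API, and how to express it properly in the grammar is exactly what I'm unsure about.

```java
import java.util.Map;
import java.util.regex.Pattern;

class ContextValidator {

    // Expected value format for each keyword (assumed from my grammar).
    private static final Map<String, Pattern> FORMATS = Map.of(
            "ID:", Pattern.compile("[a-zA-Z0-9]{1,10}"),
            "CODE:", Pattern.compile("[A-F0-9]{6}"),
            "FULL_NAME:", Pattern.compile("[a-zA-Z]+"),
            "STREET:", Pattern.compile("[a-zA-Z0-9]+"),
            "OTHER:", Pattern.compile("[a-zA-Z0-9]+"));

    // The most recently seen keyword, i.e. the context for the next value.
    private String lastKeyword;

    // Feeds one whitespace-separated token. Returns true if it was a keyword,
    // false if it was a value matching the format of the last keyword.
    public boolean accept(String token) {
        if (FORMATS.containsKey(token)) {
            lastKeyword = token;
            return true;
        }
        if (lastKeyword == null) {
            throw new IllegalStateException("Value before any keyword: " + token);
        }
        if (!FORMATS.get(lastKeyword).matcher(token).matches()) {
            throw new IllegalArgumentException(
                    "'" + token + "' is not valid after " + lastKeyword);
        }
        return false;
    }

    public static void main(String[] args) {
        ContextValidator v = new ContextValidator();
        for (String t : "ID: Test1 CODE: F0101F FULL_NAME: JOHN DOE".split("\\s+")) {
            v.accept(t);
        }
        System.out.println("input accepted");
        try {
            v.accept("ID:");
            v.accept("MoreThanTenChars"); // longer than 10 characters
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Here the same text (for example `JOHN`) is accepted or rejected depending only on the keyword before it, which is the behaviour I would like to get from the lexer or parser.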