How to use tokenization to do lexical analysis in Java of a program in a file

Asked Feb 11 '22 at 15:35

Active Feb 11 '22 at 15:56

Viewed 167 times

I need to perform lexical analysis on a very simple program stored in a file as part of a project. It was suggested that I use tokenization to split the program into its lexical elements. I've never used this technique and don't know how to implement it. A token can be:

- a keyword such as IF, WHILE, ADD, SUB, SET, TRUE, FALSE, etc.
- a parenthesis (open or closed)
- a number
- a variable

An example of an input program is:

(BLOCK (SET n 10) (SET sum 0) (WHILE (GT n 0) (BLOCK (SET sum (ADD sum n)) (SET n (SUB n 1)) (PRINT sum))))

How do I use tokenization to recognize and divide these program elements?

edited Feb 11 '22 at 15:56

asked Feb 11 '22 at 15:35

Haterproof

See if this helps: https://stackoverflow.com/questions/43067869/lexical-analyser-in-java – pringi Feb 11 '22 at 15:38
[Tokenization](https://en.wikipedia.org/wiki/Lexical_analysis#Tokenization) – David Conrad Feb 11 '22 at 16:27
For this case it looks like a simple FSM that tracks whether it is in a run of alphanumeric characters would suffice. Emit a token for each paren and each such run. – David Conrad Feb 11 '22 at 16:29
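
The FSM idea in the comment above could be sketched roughly as follows. This is only an illustration under several assumptions, not a definitive implementation: the names `SimpleLexer`, `Token`, `Kind`, and `tokenize` are hypothetical, the keyword set contains only the words that appear in the question and its example program, numbers are assumed to be unsigned runs of digits, and any other alphanumeric run is treated as a variable. It uses records, so it assumes Java 16 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical class name; a sketch of the "track alphanumeric runs" FSM.
final class SimpleLexer {
    enum Kind { LPAREN, RPAREN, KEYWORD, NUMBER, VARIABLE }

    record Token(Kind kind, String text) {}

    // Assumption: only the keywords named in the question or its example program.
    private static final Set<String> KEYWORDS = Set.of(
            "BLOCK", "SET", "WHILE", "IF", "ADD", "SUB", "GT", "PRINT", "TRUE", "FALSE");

    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        StringBuilder run = new StringBuilder(); // non-empty means "inside a run"
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                run.append(c); // stay in (or enter) the alphanumeric-run state
                continue;
            }
            flush(run, tokens); // any other character ends the current run
            if (c == '(') {
                tokens.add(new Token(Kind.LPAREN, "("));
            } else if (c == ')') {
                tokens.add(new Token(Kind.RPAREN, ")"));
            } else if (!Character.isWhitespace(c)) {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
        }
        flush(run, tokens); // the input may end in the middle of a run
        return tokens;
    }

    // Emits the pending run (if any) as one token and resets the buffer.
    private static void flush(StringBuilder run, List<Token> tokens) {
        if (run.length() == 0) {
            return;
        }
        String text = run.toString();
        run.setLength(0);
        Kind kind;
        if (KEYWORDS.contains(text)) {
            kind = Kind.KEYWORD;
        } else if (text.chars().allMatch(Character::isDigit)) {
            kind = Kind.NUMBER;
        } else {
            kind = Kind.VARIABLE; // assumption: everything else is a variable name
        }
        tokens.add(new Token(kind, text));
    }

    public static void main(String[] args) throws IOException {
        // Reads the program from the file given as the first argument,
        // or falls back to a fragment of the question's example.
        String source = args.length > 0
                ? Files.readString(Path.of(args[0]))
                : "(SET sum (ADD sum n))";
        tokenize(source).forEach(System.out::println);
    }
}
```

Running it with a file path as the first argument prints one token per line. Classifying each run only when it ends keeps the state machine down to a single buffer; a stricter lexer would probably also reject runs such as `1abc`, which this sketch simply labels as variables.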

0 Answers