I have a simple language consisting of statements like:
size(50*50)
start(10, 20, -x)
forward(15)
stop
This is an example of a turtle-drawing language, and the above is a sample source file. I need to tokenize it properly. Statements and expressions are separated by newlines, so I set up my Scanner to use delimiters such as newlines. I expect next("start") to consume the string "start", and a following next("(") to consume the opening parenthesis. However, it does something other than what I expect. Has the Scanner already split the input into tokens based on the delimiter, or do I need to approach this differently? For me, "size", "(", "50", "*", "50" and ")" on the first line would be separate tokens, but the Scanner does not appear to treat them that way. How can I tokenize the above with as little code as possible? I am writing an interpreter and would rather not spend time on a hand-written tokenizer right now; I would just like Scanner to work with me here.
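To make the expected token stream concrete, here is a minimal sketch of the output I am after. It uses java.util.regex directly rather than Scanner, and the class name TurtleTokens, the tokenize helper and the token pattern are all hypothetical choices made for illustration, not something the language definition prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TurtleTokens {
    // Assumed token shapes: an identifier, an unsigned integer, or any
    // single non-whitespace character (parentheses, commas, operators).
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_]\\w*|\\d+|\\S");

    // Splits one line of source into tokens; whitespace is skipped
    // because no alternative in TOKEN can match it.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String source = "size(50*50)\nstart(10, 20, -x)\nforward(15)\nstop";
        // \R matches any line break (Java 8+), so each statement is
        // tokenized on its own.
        for (String line : source.split("\\R")) {
            System.out.println(tokenize(line));
        }
    }
}
```

With this sketch the first line yields the six tokens listed above, and "-x" on the second line comes out as two tokens, "-" and "x".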
My useDelimiter call is as follows:
Scanner s ///...
s.useDelimiter(Pattern.compile("[\\s]&&[^\\r\\n]"));
The first next() call returns the entire file contents. Without the useDelimiter call above, it returns the entire first line.
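A minimal, self-contained sketch that should reproduce both observations is below. It assumes the source is held in a string rather than read from a file; the class name DelimiterRepro and its helper methods are hypothetical and exist only for this reproduction.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

public class DelimiterRepro {
    // The sample program from the top of the question.
    static final String SOURCE =
            "size(50*50)\nstart(10, 20, -x)\nforward(15)\nstop";

    // First token when the delimiter from the question is installed.
    static String firstTokenWithCustomDelimiter() {
        try (Scanner s = new Scanner(SOURCE)) {
            s.useDelimiter(Pattern.compile("[\\s]&&[^\\r\\n]"));
            return s.next();
        }
    }

    // First token with Scanner's default (whitespace) delimiter.
    static String firstTokenWithDefaultDelimiter() {
        try (Scanner s = new Scanner(SOURCE)) {
            return s.next();
        }
    }

    public static void main(String[] args) {
        System.out.println("custom:  [" + firstTokenWithCustomDelimiter() + "]");
        System.out.println("default: [" + firstTokenWithDefaultDelimiter() + "]");
    }
}
```

Here the custom-delimiter call returns the whole of SOURCE, while the default call returns only size(50*50), which matches what I see with the real file.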