I'm new to Java and need to write a lexer for arithmetic expressions so that I can evaluate them later. I found some useful code to start from, but there is one problem when it analyzes the string. For input like 5-9.3, it produces:
NUMBER 5
NUMBER -9.3
instead of
NUMBER 5
ADDSUBSTR -
NUMBER 9.3
I don't know how to fix it and would appreciate your help. I want to solve this in the lexer because it would be harder to deal with during the calculation. Sorry for my awful formatting.
    import java.util.ArrayList;
    import java.util.Scanner;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Lexer {

        public static enum TokenType {
            // Each token type carries the regular expression that recognizes it.
            NUMBER("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?"),
            ADDSUBSTR("[+-]"),   // no '|' inside a character class, it would match a literal pipe
            DIVMULT("[*/]"),
            WHITESPACE("[ \t\f\r\n]+");

            public final String pattern;

            private TokenType(String pattern) {
                this.pattern = pattern;
            }
        }

        public static class Token {
            public TokenType token;
            public String data;

            public Token(TokenType token, String data) {
                this.token = token;
                this.data = data;
            }

            @Override
            public String toString() {
                return String.format("%s %s", token.name(), data);
            }
        }

        public static ArrayList<Token> lexer(String s) {
            ArrayList<Token> tokens = new ArrayList<>();

            // Build one alternation of named groups: (?<NUMBER>...)|(?<ADDSUBSTR>...)|...
            StringBuilder tokenBuffer = new StringBuilder();
            for (TokenType tokenType : TokenType.values())
                tokenBuffer.append(String.format("|(?<%s>%s)", tokenType.name(), tokenType.pattern));
            // substring(1) drops the leading '|'
            Pattern tokenPatterns = Pattern.compile(tokenBuffer.substring(1));

            Matcher matcher = tokenPatterns.matcher(s);
            while (matcher.find()) {
                if (matcher.group(TokenType.NUMBER.name()) != null) {
                    tokens.add(new Token(TokenType.NUMBER, matcher.group(TokenType.NUMBER.name())));
                } else if (matcher.group(TokenType.ADDSUBSTR.name()) != null) {
                    tokens.add(new Token(TokenType.ADDSUBSTR, matcher.group(TokenType.ADDSUBSTR.name())));
                } else if (matcher.group(TokenType.DIVMULT.name()) != null) {
                    tokens.add(new Token(TokenType.DIVMULT, matcher.group(TokenType.DIVMULT.name())));
                }
                // WHITESPACE matches are skipped on purpose.
            }
            return tokens;
        }

        public static void main(String[] args) {
            try (Scanner in = new Scanner(System.in)) {
                String input = in.nextLine();
                ArrayList<Token> tokens = lexer(input);
                for (Token token : tokens)
                    System.out.println(token);
            }
        }
    }
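
One possible direction, as a minimal sketch rather than a definitive fix: the NUMBER pattern above starts with an optional sign (`[-+]?`), so in 5-9.3 the regex can consume -9.3 as a single number. The variant below assumes that a sign should always be lexed as its own ADDSUBSTR token and that unary minus (as in -3 or 2 * -3) is handled later, during the calculation. The class name `UnsignedNumberLexer`, the method name `lex`, and the unsigned NUMBER pattern are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of the Lexer above: NUMBER never includes a leading sign.
class UnsignedNumberLexer {

    enum TokenType {
        // Assumption: no leading [-+]? here, so a sign is always an ADDSUBSTR token.
        // The sign inside the exponent (e.g. 1e-5) is still part of the number.
        NUMBER("[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?"),
        ADDSUBSTR("[+-]"),
        DIVMULT("[*/]"),
        WHITESPACE("[ \\t\\f\\r\\n]+");

        final String pattern;

        TokenType(String pattern) {
            this.pattern = pattern;
        }
    }

    static class Token {
        final TokenType type;
        final String data;

        Token(TokenType type, String data) {
            this.type = type;
            this.data = data;
        }

        @Override
        public String toString() {
            return type.name() + " " + data;
        }
    }

    static List<Token> lex(String s) {
        // Join all token patterns into one alternation of named groups.
        StringBuilder alternatives = new StringBuilder();
        for (TokenType type : TokenType.values()) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(String.format("(?<%s>%s)", type.name(), type.pattern));
        }
        Matcher matcher = Pattern.compile(alternatives.toString()).matcher(s);

        List<Token> tokens = new ArrayList<>();
        while (matcher.find()) {
            // Exactly one named group is non-null for each match.
            for (TokenType type : TokenType.values()) {
                String text = matcher.group(type.name());
                if (text != null) {
                    if (type != TokenType.WHITESPACE) {
                        tokens.add(new Token(type, text));
                    }
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token token : lex("5-9.3")) {
            System.out.println(token);
        }
    }
}
```

With this pattern, 5-9.3 comes out as NUMBER 5, ADDSUBSTR -, NUMBER 9.3. The trade-off is that an input such as -3+2 now starts with an ADDSUBSTR token, so the calculation step has to recognize a sign at the start of the expression, or directly after another operator, as unary.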