I'm working on lexical analysis in Java and want to break a given string into tokens, discarding whitespace. I use the regex below to match tokens such as identifiers, numbers, and the most common operators and separators:
"[a-zA-Z0-9_]+|[\\[\\](){}.;,!<>+^%]"
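However, multi-character operators like ++, --, ==, <=, >=, ^=, *= and += are difficult to handle, because the character class only ever matches one character at a time. How can I improve my regex so that these are matched as single tokens? Many thanks.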
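
For illustration, here is a minimal sketch of one direction that might work, assuming the pattern is applied with `java.util.regex`. Alternation is tried left to right, so the two-character operators are listed before the single-character class. The class name `TokenizerSketch` and the method `tokenize` are hypothetical, and adding `=`, `*` and `-` to the single-character class is my own assumption, since the original pattern does not include them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class TokenizerSketch {
    // Alternatives are tried left to right, so the two-character
    // operators must appear before the single-character class.
    private static final Pattern TOKEN = Pattern.compile(
        "[a-zA-Z0-9_]+"                         // identifiers and numbers
        + "|\\+\\+|--|==|<=|>=|\\^=|\\*=|\\+="  // two-character operators
        + "|[\\[\\](){}.;,!<>+^%=*-]");         // single characters (=, *, - added as an assumption)

    // Returns the tokens in order; characters that match no alternative
    // (such as spaces) are skipped by Matcher.find().
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [a, ++, <=, b1, ;, x, *=, 2]
        System.out.println(tokenize("a++ <= b1 ; x *= 2"));
    }
}
```

Note that `find()` silently skips anything the pattern does not recognise, so a real lexer would probably want to detect and report unmatched characters instead.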