To me, it looks like you are simply searching for all matches of each regex and coloring every match you find. When two regexes match the same part of the text, the color of the earlier match is overwritten by the color of the later one.
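Here is a minimal Python sketch of that overwriting effect. The two rules and the names `RULES` and `naive_highlight` are hypothetical and only stand in for whatever your highlighter does. They are not taken from your code.

```python
import re

# Hypothetical highlighting rules, applied one after another.
RULES = [
    ("string", re.compile(r'"[^"\n]*"')),
    ("keyword", re.compile(r"\b(?:if|else|return)\b")),
]

def naive_highlight(text):
    """Return one tag per character; later rules overwrite earlier ones."""
    tags = [None] * len(text)
    for name, pattern in RULES:
        # Every rule scans the whole text, regardless of earlier matches.
        for m in pattern.finditer(text):
            for i in range(m.start(), m.end()):
                tags[i] = name
    return tags

tags = naive_highlight('x = "if"')
```

Here the `if` inside the string literal ends up tagged as a keyword, because the keyword rule runs after the string rule and knows nothing about the earlier match.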
To solve this issue, you have to use a proper lexer that translates the input text into a stream of tokens. Then you can walk over the token stream and color each token that needs to be colored.
This prevents the current issue, where one part of the input text is matched by multiple regexes and thus colored multiple times. It prevents it because each character of the input text is associated with exactly one token in the token stream.
A lexer that uses the first-longest-match algorithm works like this: It tries every regex at the beginning of the input text and keeps only the matches that start there. It chooses the regex that has the longest match. If several regexes share the longest match, it chooses the one that is listed first. The lexer then creates the first token of the token stream. The token consists of the token type (which is given by the regex), the start position of the match and the end position of the match. Next, the lexer searches for the next token by doing the same again. This time, however, it searches for matches that start at the end position of the previous token. The lexer repeats this until the complete input text has been transformed into a token stream, or until it encounters invalid input, that is, a position where no regex matches.
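The steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the rule list `TOKEN_RULES`, the `Token` type and the `tokenize` function are made-up names, and the regexes describe a toy language rather than yours.

```python
import re
from typing import Iterator, NamedTuple

class Token(NamedTuple):
    type: str
    start: int
    end: int  # exclusive end position

# Hypothetical token rules; the order only matters for breaking ties.
TOKEN_RULES = [
    ("keyword", re.compile(r"if|else|return")),
    ("identifier", re.compile(r"[A-Za-z_]\w*")),
    ("number", re.compile(r"\d+")),
    ("string", re.compile(r'"[^"\n]*"')),
    ("whitespace", re.compile(r"\s+")),
    ("operator", re.compile(r"[=+\-*/<>!]+")),
]

def tokenize(text: str) -> Iterator[Token]:
    pos = 0
    while pos < len(text):
        best_type, best_end = None, pos
        for token_type, pattern in TOKEN_RULES:
            # match() only accepts matches that start exactly at pos.
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when lengths are equal.
            if m and m.end() > best_end:
                best_type, best_end = token_type, m.end()
        if best_type is None:
            raise ValueError(f"invalid input at position {pos}")
        yield Token(best_type, pos, best_end)
        pos = best_end  # the next token starts where this one ended

tokens = list(tokenize('if x = "if"'))
```

With these rules, the leading `if` becomes a keyword token, while the `if` inside the quotes is swallowed by the single string token, so it can only ever receive the string color. An input such as `iffy` becomes one identifier token, because the identifier match is longer than the keyword match.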
The important part here is that the end position of token n and the start position of token n + 1 are the same. Thus, tokens never overlap, and every character gets exactly one color.