I use colorama to add ANSI codes to text. I need to split the ANSI color codes from the text so that the text can be printed in columns. The following expression separates a single escape sequence from the text, but not two consecutive ones.
import re

# adapted from https://stackoverflow.com/questions/2186919
split_ANSI_escape_sequences = re.compile(r"""
(?P<col>
\x1b # literal ESC
\[ # literal [
[;\d]* # zero or more digits or semicolons
[A-Za-z] # a letter
)*
(?P<text>.*)
""", re.VERBOSE).fullmatch
def split_ANSI(s):
    return split_ANSI_escape_sequences(s).groupdict()
This is the result:
>>> split_ANSI('\x1b[31m\x1b[1mtext')
{'col': '\x1b[1m', 'text': 'text'}
The text is split off correctly, but only the last escape sequence is kept, so the color code (\x1b[31m) is lost. I'm expecting
{'col': '\x1b[31m\x1b[1m', 'text': 'text'}
as the result.
How can I get all the potential escape sequences in the first group?
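For reference, here is a minimal sketch of one possible direction, not a confirmed fix. It assumes the cause is that a repeated capturing group only retains its last iteration, so the repetition is moved into an inner non-capturing group and the named group wraps the whole run. The names split_ansi_sketch and split_ansi_all are hypothetical, and the sketch assumes the input contains no newlines (since . does not match them, fullmatch would return None).

```python
import re

# Hypothetical variant of the pattern above: the named group "col" now
# wraps the repetition instead of being the repeated group itself.
split_ansi_sketch = re.compile(r"""
    (?P<col>
        (?:
            \x1b      # literal ESC
            \[        # literal [
            [;\d]*    # zero or more digits or semicolons
            [A-Za-z]  # a letter
        )*            # repeat the inner, non-capturing group
    )
    (?P<text>.*)      # everything after the leading escape sequences
""", re.VERBOSE).fullmatch


def split_ansi_all(s):
    # Assumes s has no newlines; otherwise fullmatch returns None.
    return split_ansi_sketch(s).groupdict()
```

With this variant, split_ansi_all('\x1b[31m\x1b[1mtext') should give both sequences in col, and a string without any escape sequences should give an empty col.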