This question originates from the answer at https://stackoverflow.com/a/53750697/856090.
We receive an "input" string.
The input string is split into several "commands" at each `+`, that is, by the regexp `\s+\+\s+`. However, a quoted `+` (`\+`) must be ignored when splitting.
Every command is then split into several "arguments" by whitespace characters, but quoted whitespace (`\ `) does not count as a separator and instead becomes part of an argument.
A quoted `\` (that is, `\\`) becomes a regular `\` character and does not itself take part in quoting.
My solution is to process the input string char-by-char with special behavior for `\`, `+`, and whitespace characters. This is slow and inelegant. I am asking for an alternative solution (for example, one using regexps).
I am using Python 3.
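The char-by-char approach described above could look roughly like the following sketch. The function name `parse_commands` and the edge-case handling are assumptions made for illustration: a `+` is treated as a command separator only when it stands alone as an unquoted, whitespace-delimited token (which differs slightly from `\s+\+\s+` at the very start or end of the input), and a lone trailing backslash is kept literally.

```python
def parse_commands(text):
    """Split text into commands (on a bare '+') and arguments (on whitespace)."""
    commands = [[]]       # each command is a list of arguments
    chars = []            # characters of the argument being built
    in_arg = False        # True once the current argument has started
    escaped_seen = False  # True if the current argument contains a quoted char

    def finish_argument():
        nonlocal chars, in_arg, escaped_seen
        if in_arg:
            arg = "".join(chars)
            if arg == "+" and not escaped_seen:
                commands.append([])        # a bare "+" starts a new command
            else:
                commands[-1].append(arg)
        chars, in_arg, escaped_seen = [], False, False

    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A backslash quotes the next character, whatever it is.
            chars.append(text[i + 1])
            in_arg = escaped_seen = True
            i += 2
            continue
        if ch.isspace():
            finish_argument()              # unquoted whitespace ends an argument
        else:
            chars.append(ch)               # includes a lone trailing backslash
            in_arg = True
        i += 1
    finish_argument()
    return commands
```

With this sketch, `parse_commands(r"a \+ b + c\ d")` returns `[['a', '+', 'b'], ['c d']]`.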
For example,
filter1 + \
chain -t http://www.w3.org/1999/xhtml -n error + \
transformation filter2 --arg x=y
transformation filter3
becomes
[['filter1'],
['chain', '-t', 'http://www.w3.org/1999/xhtml', '-n', 'error'],
['transformation', 'filter2', '--arg', 'x=y']]
and
a \+ b + c\ d
becomes
[['a', '+', 'b'], ['c d']]