Regex.Replace(myJSON, "(\"(?:[^\"\\\\]|\\\\.)*\")|\\s+", "$1")
should do it. It preserves double-quoted strings, including any white space inside them, and discards all other white space. All JSON keywords (`false`, `true`, `null`) have to be separated by commas or other punctuation, so only white space inside strings needs to be preserved.
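As a minimal sketch of how the call above might be wrapped for reuse, assuming C# and `System.Text.RegularExpressions` (the names `JsonMinifier` and `Minify` are hypothetical and only used for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class JsonMinifier
{
    // Alternative 1 (captured as group 1): a complete double-quoted string,
    // including backslash escape pairs such as \" or \\.
    // Alternative 2 (not captured): a run of white space outside any string.
    private static readonly Regex StringOrSpace =
        new Regex("(\"(?:[^\"\\\\]|\\\\.)*\")|\\s+");

    // Hypothetical helper: replaces every match with the contents of group 1,
    // which keeps strings unchanged and drops white space between tokens.
    public static string Minify(string json)
    {
        return StringOrSpace.Replace(json, "$1");
    }
}
```

For example, `JsonMinifier.Minify("{ \"a b\" : [ 1, true ] }")` returns `{"a b":[1,true]}`: the space inside `"a b"` survives, and every space between tokens is removed.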
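The first option, `(\"(?:[^\"\\\\]|\\\\.)*\")`, matches a double-quoted string. The `(...)` mean that the matched text is captured and available in the replacement as `$1`. The `[^\"\\\\]` matches any character except a double quote or the escape character `\`, and `\\\\.` matches a backslash together with the character it escapes, so an escaped quote inside a string does not end the match early.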
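Since matching occurs left-to-right, the second option, `\s+` (written `\\s+` in the string literal), will not match space inside a string: by the time the regex reaches that space, the first option has already consumed the whole string that contains it.
So we match whole strings, and spaces outside strings. In the former case, `$1` is the string token, and in the latter case `$1` is the empty string because group 1 was not used.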
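To make the two cases visible, here is a small sketch, again assuming C#, that labels each match by whether group 1 took part in it. The class `MatchTrace` and the method `Describe` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchTrace
{
    // Hypothetical helper: lists every match of the pattern and reports
    // which alternative produced it.
    public static string Describe(string json)
    {
        var pattern = new Regex("(\"(?:[^\"\\\\]|\\\\.)*\")|\\s+");
        var parts = new List<string>();
        foreach (Match m in pattern.Matches(json))
        {
            // Groups[1].Success is true only when the string alternative matched;
            // for a white-space match, group 1 did not participate and $1 is empty.
            parts.Add(m.Groups[1].Success ? "string:" + m.Value : "space");
        }
        return string.Join(" | ", parts);
    }
}
```

Called on the JSON text `{"a \" b": 1}`, `Describe` reports one string match covering `"a \" b"` (escaped quote and inner spaces included) followed by one white-space match for the space after the colon.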
This works as intended because:
- the only tokens in JSON that can contain spaces are double-quoted strings. There are no single-quoted strings or comments in JSON.
- the JSON grammar requires single-character punctuation between all multi-character tokens, so removing space will not merge tokens. In JavaScript, this could be problematic because space is required to break tokens; `var x=0` is different from `varx=0`, and `x - -(y)` is different from `x--(y)`.