There are a lot of similar "splitting spaces and quotes" Q&As on SO, most of them with regex solutions. In fact, your code can be found in at least one of them (thanks for that, try-catch-finally).
While a few of these solutions exclude the quotes, only one that I could find works if there is no space delimiter following the closing quote, and none of them both exclude quotes and allow for missing spaces.
It is also not just a simple matter of adapting any of the regexes. If you change the regex to use capturing groups, a simple match method is no longer enough, because a global match returns only the full matches and not the groups. (The usual technique around this is to use exec in a loop.) If you don't use capturing groups, you need to do a string manipulation afterwards to remove the quotes.
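For reference, the exec-in-a-loop technique mentioned above might look roughly like this. It is only a sketch: the names input, pattern and tokens are placeholders, and it assumes the same simple double-quote regex used in the rest of this answer.

```javascript
const input = 'this "is a"test string';
const pattern = /"([^"]*)"|(\S+)/g;
const tokens = [];
let found;

// With the g flag, each exec call resumes where the previous match ended
// and returns null once there are no more matches.
while ((found = pattern.exec(input)) !== null) {
    // Group 1 is set for a quoted term, group 2 for a bare word.
    tokens.push(found[1] !== undefined ? found[1] : found[2]);
}

console.log(tokens); // [ 'this', 'is a', 'test', 'string' ]
```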
The neatest solution is to use map on the array result from the match.
Using the slice string manipulation method:
var str = 'this "is a"test string';
var result = (str.match(/"[^"]*"|\S+/g) || []).map(m => m.slice(0, 1) === '"' ? m.slice(1, -1) : m);
console.log(result);
Using capturing groups:
var str = 'this "is a"test string';
var regex = /"([^"]*)"|(\S+)/g;
var result = (str.match(regex) || []).map(m => m.replace(regex, '$1$2'));
console.log(result);
The capturing group solution is the more general one, easily expandable to allow for different quotes, for example.
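As a rough illustration of that extensibility, the sketch below adds a third alternative for single-quoted terms. The names mixed, mixedRegex and mixedResult are placeholders, and it assumes no bare word starts with an apostrophe (an apostrophe inside a word, as in "don't", is still fine because \S+ starts matching at the first letter).

```javascript
const mixed = 'this "is a"test \'single quoted\' string';
// Group 1: double-quoted content, group 2: single-quoted content, group 3: bare word.
const mixedRegex = /"([^"]*)"|'([^']*)'|(\S+)/g;

// Groups that did not take part in a match are replaced by an empty string,
// so '$1$2$3' yields whichever group actually matched.
const mixedResult = (mixed.match(mixedRegex) || []).map(m => m.replace(mixedRegex, '$1$2$3'));

console.log(mixedResult); // [ 'this', 'is a', 'test', 'single quoted', 'string' ]
```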
Note that the regex used in both solutions above is very simple and only works for double quotes, and no escaped quotes in the sub-strings. (It works fine with nested single quotes and apostrophes, though.)
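If escaped quotes do matter, one possible extension (not part of the solutions above, and assuming backslash is the escape character) is to let the quoted alternative accept backslash-escaped characters and to unescape them afterwards. The names escaped, escapedRegex and escapedResult are placeholders, and matchAll needs an ES2020 environment.

```javascript
// The actual string is: say "a \"quoted\" word" end
const escaped = 'say "a \\"quoted\\" word" end';
// Inside the quotes, accept either a backslash-escaped character
// or any character that is neither a quote nor a backslash.
const escapedRegex = /"((?:\\.|[^"\\])*)"|(\S+)/g;

const escapedResult = Array.from(escaped.matchAll(escapedRegex), m =>
    // Unescape quoted terms only; bare words are kept as they are.
    m[1] !== undefined ? m[1].replace(/\\(.)/g, '$1') : m[2]
);

console.log(escapedResult); // [ 'say', 'a "quoted" word', 'end' ]
```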
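Explanation for the regex "([^"]*)"|(\S+):
"([^"]*)" matches an opening double quote, any run of characters that are not double quotes (captured as group 1), and the closing double quote.
| separates the two alternatives, which are tried from left to right.
(\S+) matches one or more non-whitespace characters (captured as group 2).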
Note that the order of the two groups is critical. If \S+ is used first, it will match the opening quote together with just the first following word.
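A quick way to see the effect of the ordering is to run both variants against the same string. This is just a demonstration using the non-capturing form of the regex; sample, quotedFirst and wordFirst are placeholder names.

```javascript
const sample = 'this "is a"test string';

// Quoted alternative first: the whole quoted term is matched as one unit.
const quotedFirst = sample.match(/"[^"]*"|\S+/g);
// \S+ first: it wins at the opening quote and stops at the next space.
const wordFirst = sample.match(/\S+|"[^"]*"/g);

console.log(quotedFirst); // [ 'this', '"is a"', 'test', 'string' ]
console.log(wordFirst);   // [ 'this', '"is', 'a"test', 'string' ]
```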
As for that state machine code you were attempting to use, it is very restrictive and only works for precisely one space between terms, and breaks if there are any apostrophes used anywhere (because it also allows the sub-strings to be single quoted).
It can be fixed to work for your specific example by pushing an empty string when an end quote is detected. To also allow for a single space after a closing quote, there needs to be a check for an existing empty string before pushing a new one:
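var str = 'this "is a"test string';
var result = str.match(/\\?.|^$/g).reduce((p, c) => {
    if (c === '"' || c === "'") {
        // Toggle the in-quote flag; start a new term when a quote closes.
        if (!(p.quote ^= 1)) { p.a.push(''); } // <- modified
    } else if (!p.quote && c === ' ') {
        // Start a new term on a space, unless an empty one is already pending
        // (otherwise the space would be appended to the next term).
        if (p.a[p.a.length - 1] !== '') { p.a.push(''); } // <- modified
    } else {
        // Append the character, dropping the backslash of an escaped one.
        p.a[p.a.length - 1] += c.replace(/\\(.)/, "$1");
    }
    return p;
}, {a: ['']}).a;
console.log(result);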