Currently, I can split a string like this:

"1 2 3".split(' ') // [ "1", "2", "3" ]
"1 2 3 'word'".split(' ') // [ "1", "2", "3", "'word'" ]

Is there a way to avoid splitting on spaces that fall inside a quoted substring?

For example:

"1 2 3 'word one'".split(' ') // want output of [ "1", "2", "3", "'word one'" ]
"1 2 3 \"word one\"".split(' ') // want output of [ "1", "2", "3", "\"word one\"" ]

I want output of [ "1", "2", "3", "'word one'" ] instead of [ "1", "2", "3", "'word", "one'" ] (i.e. I want to ignore spaces if they are in strings).

Kwoppy

3 Answers


One approach is to use match with a regex that treats each quoted substring, spaces included, as a single token:

var s = "1 2 3 \"word one\" one \"two\" 'hello world'";

console.log(s.match(/'[^']+'|"[^"]+"|\w+/g));

Edit: See CertainPerformance's answer for a better regex.

slider
    This will match too much if the string contains more quote characters later, e.g. `1 "2" "3"`. `\w` also assumes word characters, which may well not be the case. – CertainPerformance Nov 29 '18 at 00:55

To correctly match strings that contain more than one quoted substring, repeat the . lazily with .+? rather than greedily with .+ when matching the quoted part. Otherwise, strings such as

1 "2" "3"

won't match properly. Also, unless you can count on every unquoted token containing only word characters, it is probably better to use \S+ instead of \w+ (\S matches anything but whitespace characters):

var s = `1 "2" "3" foo'bar`
console.log(s.match(/'.+?'|".+?"|\S+/g));
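To make the greedy versus lazy difference concrete, here is a minimal sketch that runs both variants on the `1 "2" "3"` input from above. The variable names are only illustrative, and single quotes are left out to keep the comparison short.

```javascript
// Input with two separate quoted tokens.
const input = '1 "2" "3"';

// Greedy: ".+" runs on to the LAST quote, merging both quoted tokens.
const greedy = input.match(/".+"|\S+/g);

// Lazy: ".+?" stops at the first closing quote it can reach.
const lazy = input.match(/".+?"|\S+/g);

console.log(greedy); // [ '1', '"2" "3"' ]
console.log(lazy);   // [ '1', '"2"', '"3"' ]
```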

Or, to be slightly more efficient, use negated character classes instead of lazy repetition:

var s = `1 "2" "3" foo'bar`
console.log(s.match(/'[^']+'|"[^"]+"|\S+/g));
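If this is needed in more than one place, the regex can be wrapped in a small helper. The sketch below is one way to do it; the name `splitOutsideQuotes` is hypothetical. It also covers the case where `match` returns `null` because the input contains no tokens at all.

```javascript
// Hypothetical helper: split on whitespace, but keep quoted substrings whole.
function splitOutsideQuotes(text) {
  // Quoted alternatives come first so they win over the generic \S+ branch.
  const tokens = text.match(/'[^']+'|"[^"]+"|\S+/g);
  // With the g flag, match() returns null (not an empty array) on no match.
  return tokens === null ? [] : tokens;
}

console.log(splitOutsideQuotes("1 2 3 'word one'"));   // [ '1', '2', '3', "'word one'" ]
console.log(splitOutsideQuotes('1 2 3 "word one"'));   // [ '1', '2', '3', '"word one"' ]
console.log(splitOutsideQuotes('   '));                // []
```

Note that, unlike `split(' ')`, this drops empty tokens produced by consecutive spaces.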
CertainPerformance

Walk through the string character by character and keep a Boolean flag that records whether you are currently inside quotes.

if (string[i] === ' ' && !insideQuotes) // split here
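The flag idea above could be fleshed out roughly as follows. This is only a sketch under a few assumptions that the answer does not state: the function name `splitWithQuoteFlag` is made up, the flag stores which quote character is open (so a `'` inside `"..."` does not close it), runs of spaces are collapsed, and an unterminated quote keeps the rest of the string as one token.

```javascript
// Hypothetical scanner: split on spaces, except while inside quotes.
function splitWithQuoteFlag(text) {
  const tokens = [];
  let current = '';
  let quoteChar = null; // the open quote character, or null when outside quotes

  for (const ch of text) {
    if (quoteChar !== null) {
      // Inside quotes: keep everything, including spaces.
      current += ch;
      if (ch === quoteChar) quoteChar = null; // matching quote closes it
    } else if (ch === '"' || ch === "'") {
      quoteChar = ch; // opening quote
      current += ch;
    } else if (ch === ' ') {
      // Space outside quotes: finish the current token, if any.
      if (current !== '') tokens.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current !== '') tokens.push(current); // flush the last token
  return tokens;
}

console.log(splitWithQuoteFlag("1 2 3 'word one'")); // [ '1', '2', '3', "'word one'" ]
```

Unlike the regex answers, this version treats a quote in the middle of a word as opening a quoted section, so the two approaches can differ on inputs such as `foo'bar baz`.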