
I need help splitting a string in JavaScript by spaces (" "), ignoring spaces inside quoted expressions.

I have this string:

var str = 'Time:"Last 7 Days" Time:"Last 30 Days"';

I would expect my string to be split into 2 parts:

['Time:"Last 7 Days"', 'Time:"Last 30 Days"']

but my code splits it into 4:

['Time:', '"Last 7 Days"', 'Time:', '"Last 30 Days"']

This is my code:

str.match(/(".*?"|[^"\s]+)(?=\s*|\s*$)/g);

Thanks!

Andreas Louv
Elad Kolberg
  • While the linked question is _related_, it is _not_ a duplicate: _This_ question explicitly wants unquoted strings that directly adjoin double-quoted strings (e.g., `foo:"bar none"`) to be recognized as a _single_ token (and also doesn't mention the need to handle escaped double-quotes.) – mklement0 Oct 09 '15 at 15:30

3 Answers

const s = 'Time:"Last 7 Days" Time:"Last 30 Days"'
s.match(/(?:[^\s"]+|"[^"]*")+/g)

// -> ['Time:"Last 7 Days"', 'Time:"Last 30 Days"']

Explained:

(?:         # non-capturing group
  [^\s"]+   # anything that's not a space or a double-quote
  |         #   or…
  "         # opening double-quote
    [^"]*   # …followed by zero or more characters that are not a double-quote
  "         # …closing double-quote
)+          # each match is one or more of the things described in the group

Turns out, to fix your original expression, you just need to add a + on the group:

str.match(/(".*?"|[^"\s]+)+(?=\s*|\s*$)/g)
//                        ^ here.
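
As a minimal sketch, the pattern can be wrapped in a small helper. The function name `splitOutsideQuotes` is hypothetical and not part of the answer; the only addition is a guard for the no-match case, since `match` with the `g` flag returns `null` rather than an empty array when nothing matches.

```javascript
// Hypothetical helper wrapping the pattern from the answer above.
function splitOutsideQuotes(input) {
  // Each token is one or more runs of either non-space, non-quote
  // characters or a complete double-quoted section.
  const tokens = input.match(/(?:[^\s"]+|"[^"]*")+/g);
  // String.prototype.match returns null (not []) when there is no match.
  return tokens === null ? [] : tokens;
}

console.log(splitOutsideQuotes('Time:"Last 7 Days" Time:"Last 30 Days"'));
// -> [ 'Time:"Last 7 Days"', 'Time:"Last 30 Days"' ]
console.log(splitOutsideQuotes('   '));
// -> []
```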
kch
  • This would be a good answer if you explained the regular expression. – T.J. Crowder Apr 28 '13 at 09:58
  • Just getting it out there first. – kch Apr 28 '13 at 10:00
  • Is there a way to exclude the double quotes by adjusting the regex? I.e., output `['Time:Last 7 Days', 'Time:Last 30 Days']` – Awalias Apr 30 '13 at 14:20
  • @Awalias not with just using the `match` method. But if you `replace` afterwards, you can use capture groups etc and leave the quotes out. Probably better for you to post a new question. – kch May 02 '13 at 09:47
  • Is there a way to handle spaces after the colon too? E.g., `var str = 'Time: "Last 7 Days" Time: "Last 30 Days"';` => `['Time: "Last 7 Days"', 'Time: "Last 30 Days"']` – rdp May 13 '16 at 05:01
  • Modified version of kch's pattern that works with single or double quotes: `s.match(/(?:[^\s"']+|['"][^'"]*["'])+/g)` – jobrad Mar 24 '17 at 17:16
  • What if the quotes are escaped, without using negative lookaheads (due to compatibility)? – user10398534 Sep 14 '20 at 22:19
  • Not sure if this is just a false positive or not, but this snippet was flagged as "Inefficient regular expression: This part of the regular expression may cause exponential backtracking on strings containing many repetitions of '!'." by CodeQL on GitHub. Issue about it is here: https://github.com/github/codeql/issues/5964 – Glenn 'devalias' Grant May 28 '21 at 03:54
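
One way to read the suggestion in the comments about using `replace` after `match` is sketched below. It assumes that every double quote in a token is a delimiter and can simply be dropped (no escaped quotes); the helper name `splitAndStripQuotes` is made up for illustration.

```javascript
// Hypothetical helper: tokenize with the answer's pattern, then drop the quotes.
function splitAndStripQuotes(input) {
  // `|| []` covers the case where match() returns null (no tokens at all).
  const tokens = input.match(/(?:[^\s"]+|"[^"]*")+/g) || [];
  // Remove every double quote from each token (assumes none are escaped).
  return tokens.map(token => token.replace(/"/g, ''));
}

console.log(splitAndStripQuotes('Time:"Last 7 Days" Time:"Last 30 Days"'));
// -> [ 'Time:Last 7 Days', 'Time:Last 30 Days' ]
```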

ES6 solution supporting:

  • Splitting by space, except inside quotes
  • Removing quotes, except backslash-escaped quotes
  • Turning each escaped quote into a plain quote

Code:

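// Break the string into single characters, keeping a backslash
// together with the character it escapes.
str.match(/\\?.|^$/g).reduce((p, c) => {
    if (c === '"') {
        p.quote ^= 1;                 // toggle the "inside quotes" flag
    } else if (!p.quote && c === ' ') {
        p.a.push('');                 // unquoted space: start a new token
    } else {
        p.a[p.a.length - 1] += c.replace(/\\(.)/, "$1"); // unescape, then append
    }
    return p;
}, {a: ['']}).a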

Output:

[ 'Time:Last 7 Days', 'Time:Last 30 Days' ]
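
To illustrate the escaped-quote handling, here is a sketch that wraps the same reducer in a function. The name `splitAndUnquote` and the sample inputs are assumptions for the example, not part of the answer.

```javascript
// Hypothetical wrapper around the reducer from the answer above.
function splitAndUnquote(str) {
  return str.match(/\\?.|^$/g).reduce((p, c) => {
    if (c === '"') {
      p.quote ^= 1;                   // toggle the "inside quotes" flag
    } else if (!p.quote && c === ' ') {
      p.a.push('');                   // unquoted space: start a new token
    } else {
      p.a[p.a.length - 1] += c.replace(/\\(.)/, '$1'); // unescape, then append
    }
    return p;
  }, { a: [''] }).a;
}

// In source code, '\\"' is a backslash followed by a double quote.
console.log(splitAndUnquote('say:"he said \\"hi\\"" done'));
// -> [ 'say:he said "hi"', 'done' ]

// Consecutive unquoted spaces yield empty tokens.
console.log(splitAndUnquote('a  b'));
// -> [ 'a', '', 'b' ]
```

If the empty tokens are unwanted, appending `.filter(Boolean)` to the result removes them.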
Tsuneo Yoshioka

This works for me:

var myString = 'foo bar "sdkgyu sdkjbh zkdjv" baz "qux quux" skduy "zsk"';
console.log(myString.split(/([^\s"]+|"[^"]*")+/g));

Output: Array ["", "foo", " ", "bar", " ", ""sdkgyu sdkjbh zkdjv"", " ", "baz", " ", ""qux quux"", " ", "skduy", " ", ""zsk"", ""]
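
The output above still contains the empty strings and the separating spaces. A possible follow-up step, sketched here with a hypothetical helper named `splitAndFilter`, is to filter those out. The second call shows a limitation: with `split`, a repeated capture group keeps only its last repetition, so a token such as `Time:"Last 7 Days"` from the question loses its unquoted prefix.

```javascript
const pattern = /([^\s"]+|"[^"]*")+/g;

// Hypothetical helper: split, then discard empty and whitespace-only entries.
function splitAndFilter(input) {
  return input.split(pattern).filter(part => part.trim() !== '');
}

console.log(splitAndFilter('foo bar "sdkgyu sdkjbh zkdjv" baz'));
// -> [ 'foo', 'bar', '"sdkgyu sdkjbh zkdjv"', 'baz' ]

// Caveat: only the last repetition of the capture group is kept,
// so the 'Time:' prefix is lost for the string in the question.
console.log(splitAndFilter('Time:"Last 7 Days" Time:"Last 30 Days"'));
// -> [ '"Last 7 Days"', '"Last 30 Days"' ]
```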

pady