3

I want to split a string, any string, into an array by spaces, preferably with the split() method. However, I wish to ignore spaces in quotation marks.

Take, for example:

'word "words in double quotes"'

It should become an array with:

[
  'word',
  'words in double quotes'
]

I looked at similar answers to this, and they usually gave an array with:

[
  'word',
  '"words in double quotes"'
]

and that isn't what I'm looking for. I don't want the quotation marks added into the array element.

What regular expression could I use?

kerabite
  • It's impossible with `split` alone, as `split` can't remove the final quotation mark without, well, introducing another split. Try e.g. `const [match, first, second] = string.match(/^(.*) "(.*)"$/)` – le_m Jan 08 '18 at 00:12
  • maybe `'word "words in double quotes"'.split(/"(.*?)"|\s+/g).filter(Boolean)`, but depends on how quotes within the quotes are escaped – Slai Jan 08 '18 at 01:08
  • Does this answer your question? [Javascript split string on space or on quotes to array](https://stackoverflow.com/questions/2817646/javascript-split-string-on-space-or-on-quotes-to-array) – ggorlen Oct 15 '20 at 17:48

3 Answers

2

I don't think String.prototype.split alone can do what you want, because it will most likely leave empty strings in the resulting array, and that is just for the example string you gave. If you need a general solution to your problem, I believe split won't work at all.

If your goal is to produce the same result irrespective of the actual string, I'd suggest you use a combination of String.prototype.match, Array.prototype.map and String.prototype.replace as shown:

Code:

const
  /* The string. */
  string = 'apples bananas "apples and bananas" pears "apples and bananas and pears"',

  /* The regular expression: a quoted section, or a run of non-whitespace. */
  regex = /"[^"]+"|[^\s]+/g,

  /* 'match' returns null when nothing matches, so fall back to an empty array.
     Then use 'map' and 'replace' to discard the surrounding quotation marks. */
  result = (string.match(regex) || []).map(e => e.replace(/^"(.+)"$/, "$1"));

console.log(result);

Explanation of the regex used:

  • "[^"]+": Match a quotation mark, then one or more characters that are not quotation marks, then the closing quotation mark.
  • |: Alternation (logical OR); the left-hand alternative is tried first.
  • [^\s]+: Match one or more non-whitespace characters (equivalent to \S+).
  • g: The global flag, which makes match return all occurrences instead of only the first.
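
If the quotation marks should be dropped without a separate replace pass, one option is to put a capture group inside the quoted alternative and read it back for each match. The following is only a minimal sketch, assuming an environment that supports String.prototype.matchAll (ES2020); the function name splitOutsideQuotes is made up for illustration and is not part of the answer above.

```javascript
// Hypothetical helper, shown only to illustrate the capture-group idea.
function splitOutsideQuotes(input) {
  // Group 1 holds the text between quotation marks; it is undefined
  // when the second alternative (a bare run of non-whitespace) matched.
  const regex = /"([^"]+)"|[^\s]+/g;
  // m[0] is the whole match, m[1] is the quoted content, if any.
  return Array.from(input.matchAll(regex), m => (m[1] !== undefined ? m[1] : m[0]));
}

console.log(splitOutsideQuotes('word "words in double quotes"'));
// → ['word', 'words in double quotes']
console.log(splitOutsideQuotes('command arg1 "arg 2" arg3'));
// → ['command', 'arg1', 'arg 2', 'arg3']
```

Like the original pattern, this sketch does not handle escaped quotes inside a quoted section, and an empty pair of quotes is not returned as an empty element.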
Angel Politis
  • Sorry, I meant for the `word "words with double quotes"` string to act as an example. The regular expression should work with any string, such as `command arg1 "arg 2" arg3` or `apples bananas "apples and bananas"` I fixed the wording in the post to clear up any confusion. – kerabite Jan 08 '18 at 03:23
  • Check out my answer once more @insertcodehere. I have updated it to match any given string. I hope this helps – Angel Politis Jan 08 '18 at 07:19
  • Thanks for all your help! @Angel Politis – kerabite Jan 08 '18 at 18:41
  • Also, while this is an adequate answer, would it be possible to remove the `map()` method and just somehow implement it into the regular expression? – kerabite Jan 08 '18 at 18:44
0

I hope this is what you're looking for:

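var words = 'word "words in double quotes" more text "stuff in quotes"';
// Match either a quoted section or a run of characters that are neither quotes nor spaces.
// 'match' returns null when nothing matches, hence the fallback to an empty array.
var wordArray = words.match(/"([^"]+)"|[^" ]+/g) || [];
for(var i=0,l=wordArray.length; i<l; i++){
  // Strip a leading and/or trailing quotation mark from each match.
  wordArray[i] = wordArray[i].replace(/^"|"$/g, '');
}
console.log(wordArray);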
StackSlave
  • not sure why this was downvoted, it works fine. I also did a jsbench and this one is faster too. this should be the accepted answer, not the one by Angel Politis – moeiscool Oct 05 '20 at 23:04
0
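  1. Split the initial string by ".
  2. Split each odd-numbered item in the result of #1 (the 1st, 3rd, and so on, which lie outside the quotes) by space; keep the even-numbered items, which are the quoted contents, as they are.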
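
The two steps above can be sketched without any regular expression. This is only an illustration under a few assumptions: the quotes are always balanced, there are no escaped quotes, and "odd" is counted from 1, so the parts outside quotes sit at array indices 0, 2, 4, and so on. The name splitByQuotes is hypothetical.

```javascript
// Hypothetical helper illustrating the two steps; assumes balanced, unescaped quotes.
function splitByQuotes(input) {
  const result = [];
  // Step 1: split on the quotation mark. Indices 0, 2, 4, ... are outside
  // quotes; indices 1, 3, 5, ... are the contents of a quoted section.
  input.split('"').forEach((part, index) => {
    if (index % 2 === 0) {
      // Step 2: split the unquoted part on spaces, dropping empty pieces.
      result.push(...part.split(' ').filter(piece => piece !== ''));
    } else {
      // Quoted content is kept whole, spaces included.
      result.push(part);
    }
  });
  return result;
}

console.log(splitByQuotes('word "words in double quotes"'));
// → ['word', 'words in double quotes']
```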

Using a regexp dramatically hurts the readability and maintainability of your code, especially when you are trying to work around existing limitations (say, the lack of lookbehind).

skyboyer