Example string:

"Foo","Bar, baz","Lorem","Ipsum"

Here we have 4 values in quotes separated by commas.

When I do this:

str.split(',').forEach(…

then that will also split the value "Bar, baz", which I don't want. Is it possible to ignore commas inside quotes with a regular expression?

Šime Vidas
  • Are your quotes correctly balanced? Can there be escaped quotes within quotes? (Don't you really need a CSV parser?) – Tim Pietzcker May 10 '14 at 14:32
  • Of course it is possible with a regular expression. – Kai May 10 '14 at 14:32
  • @TimPietzcker Hm, I could go with a CSV parser, if I can load it via ` – Šime Vidas May 10 '14 at 14:35
  • Do you actually need the quotes in the result? From your example, it seems like commas are only present when separating quoted phrases or when separating words within the quoted phrases, so you should be able to do `str.slice(1,-1).split('","')` if it's consistent that way. If there can be spaces around the commas you're splitting on, then you can use a simpler regex `.split(/"\s*,\s*"/)`. And if you need the quotes, then `.map(function(item) { return '"' + item + '"'; })` – cookie monster May 10 '14 at 15:20
  • @cookiemonster Heh, that's a good idea :) – Šime Vidas May 10 '14 at 16:05
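
The suggestion from the comments can be sketched roughly as follows. It assumes that every field is wrapped in double quotes and that no field contains an escaped quote; the function name `splitQuotedFields` is hypothetical and used only for illustration.

```javascript
// Hypothetical helper: splits a line in which ALL fields are double-quoted
// and separated by commas, with optional whitespace around each comma.
function splitQuotedFields(str) {
  return str
    .trim()
    .slice(1, -1)          // drop the very first and very last quote
    .split(/"\s*,\s*"/);   // split on each quote-comma-quote separator
}

const fields = splitQuotedFields('"Foo","Bar, baz","Lorem","Ipsum"');
console.log(fields); // [ 'Foo', 'Bar, baz', 'Lorem', 'Ipsum' ]

// Re-add the quotes only if they are needed in the result.
const quoted = fields.map((item) => '"' + item + '"');
console.log(quoted); // [ '"Foo"', '"Bar, baz"', '"Lorem"', '"Ipsum"' ]
```

This avoids a lookahead entirely, but it breaks as soon as one field is unquoted, so it only fits input that is consistently quoted.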

1 Answer

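One way is to use a positive lookahead assertion that only matches a comma when it is followed by an even number of quote characters, which means the comma sits outside any quoted value: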

var str = '"Foo","Bar, baz","Lorem","Ipsum"',
    res = str.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/);

console.log(res);  // [ '"Foo"', '"Bar, baz"', '"Lorem"', '"Ipsum"' ]

Regular expression:

,               ','
(?=             look ahead to see if there is:
(?:             group, but do not capture (0 or more times):
(?:             group, but do not capture (2 times):
 [^"]*          any character except: '"' (0 or more times)
 "              '"'
){2}            end of grouping
)*              end of grouping
 [^"]*          any character except: '"' (0 or more times)
$               the end of the string
)               end of look-ahead
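
To make the breakdown above concrete, here is a small sketch that counts the quotes following each comma and compares that count with what the lookahead decides. The helper name `commaReport` is hypothetical, and the sketch assumes balanced double quotes.

```javascript
const splitter = /,(?=(?:(?:[^"]*"){2})*[^"]*$)/;

// Hypothetical helper: for every comma, report how many quotes follow it
// and whether the lookahead regex would split at that position.
function commaReport(str) {
  // Sticky copy of the regex so it can be tested at one exact index.
  const sticky = new RegExp(splitter.source, 'y');
  const report = [];
  for (let i = 0; i < str.length; i++) {
    if (str[i] !== ',') continue;
    const quotesAhead = (str.slice(i + 1).match(/"/g) || []).length;
    sticky.lastIndex = i;
    report.push({ index: i, quotesAhead, split: sticky.test(str) });
  }
  return report;
}

const report = commaReport('"Foo","Bar, baz","Lorem","Ipsum"');
for (const { index, quotesAhead, split } of report) {
  console.log(`index ${index}: ${quotesAhead} quotes ahead -> ${split ? 'split' : 'keep'}`);
}
// index 5: 6 quotes ahead -> split
// index 10: 5 quotes ahead -> keep
// index 16: 4 quotes ahead -> split
// index 24: 2 quotes ahead -> split
```

Only the comma inside "Bar, baz" has an odd number of quotes after it, so it is the only one the regex leaves alone.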

Or use a negative lookahead, which refuses to split on a comma that is followed by an odd number of quotes:

var str = '"Foo","Bar, baz","Lorem","Ipsum"',
    res = str.split(/,(?![^"]*"(?:(?:[^"]*"){2})*[^"]*$)/);

console.log(res); // [ '"Foo"', '"Bar, baz"', '"Lorem"', '"Ipsum"' ]
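
As a quick sanity check, the sketch below runs both patterns on an input that mixes quoted and unquoted fields and then strips the surrounding quotes. The input string is a made-up example, and the sketch assumes balanced quotes with no backslash-escaped quotes inside a field.

```javascript
const positive = /,(?=(?:(?:[^"]*"){2})*[^"]*$)/;
const negative = /,(?![^"]*"(?:(?:[^"]*"){2})*[^"]*$)/;

// Hypothetical input mixing unquoted and quoted fields.
const input = 'a,"b, c",d';

const viaPositive = input.split(positive);
const viaNegative = input.split(negative);
console.log(viaPositive); // [ 'a', '"b, c"', 'd' ]
console.log(viaNegative); // [ 'a', '"b, c"', 'd' ]

// Drop one leading and one trailing quote from each field, if present.
const unquoted = viaPositive.map((field) => field.replace(/^"|"$/g, ''));
console.log(unquoted); // [ 'a', 'b, c', 'd' ]
```

Both lookaheads scan to the end of the string for every comma, so for large or irregular files a dedicated CSV parser, as raised in the question's comments, is likely the safer choice.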
thefourtheye
hwnd
  • Here's my attempt in allowing single quotes too! `str.split(/,(?=(?:(?:[^'"]*(?:'|")){2})*[^'"]*$)/)` Let me know if there are bugs. Need to have it correct too! – JohnnyQ Feb 11 '17 at 11:51
  • @hwnd, you sir are a regex wizard, this should be marked as the right answer. – Dan Ochiana May 17 '18 at 14:05
  • Fixed some issues (such as not including the quotes) in this answer: https://stackoverflow.com/a/57121244/2771889 – thisismydesign Jul 20 '19 at 02:12