2

I'm trying to split a string that contains consecutive commas as well as a comma wrapped in quotation marks but can't quite get the result that I want. Here is an example of a string I have:

var str = '10,Apple,"Sweet Gala apple, from Australia",,,,,,in stock,3.99';

where the third element has a comma inside quotation marks, followed by multiple commas.

I want to split the string by commas under two conditions: 1) don't split on a comma that's wrapped inside quotation marks; 2) each gap between consecutive commas should come out as an empty string.

When I use the regex below:

str.match(/(".*?"|[^,]+)/g)

The result is the array below, which meets the first condition but produces no empty strings for the consecutive commas:

["10", "Apple", '"Sweet Gala apple, from Australia"', "in stock", "3.99"]

I want it to look like:

["10", "Apple", '"Sweet Gala apple, from Australia"', "", "", "", "", "", "in stock", "3.99"]

What do I need to do to meet the above two conditions?

Drive2blue

2 Answers

5

The main problem here is that you want zero-length matches only under certain conditions, but a pattern that permits empty matches will produce one wherever it can, including right where the previous match ended. For example, in `10,` it will match `10` and then also match the empty string between `10` and `,`. A plain global match alone can't differentiate between that and the `,,,,,,` situation.
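To make the extra zero-length matches concrete, here is a small sketch on a shortened, made-up input (`sample`, `matched`, and `fields` are illustrative names, not from the question):

```javascript
// Hypothetical shortened input: two consecutive commas,
// so exactly one empty field is wanted between "10" and "a".
const sample = '10,,a';

// Allowing empty matches with * yields an empty match at every comma
// and one more at the end of the string.
const matched = sample.match(/[^,]*/g);
console.log(matched); // ["10", "", "", "a", ""]

// Splitting on the comma yields only the one empty field that is wanted.
const fields = sample.split(',');
console.log(fields); // ["10", "", "a"]
```

The `match` result has an empty string for each comma plus a trailing one, so it cannot tell "between two fields" apart from "an actually empty field".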

I'd use split rather than match: split on a comma, with a negative lookahead for non-`"` characters followed by `",`, to ensure that the matched comma is not inside a quoted section:

var str = '10,Apple,"Sweet Gala apple, from Australia",,,,,,in stock,3.99';
const result = str.split(/,(?![^"]*",)/);
console.log(result);

If a quoted field may come at the very end of the string, then at the end of the negative lookahead, alternate the `,` with `$`:

var str = '10,Apple,"Sweet Gala apple, from Australia",,,,,,in stock,3.99,"foo, bar"';
const result = str.split(/,(?![^"]*"(?:,|$))/);
console.log(result);
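As a rough sketch of how the second pattern could be wrapped for reuse: `splitCsvLine` is a hypothetical helper name, and stripping the wrapping quotes is an extra assumption that the question did not ask for (drop the `.map` step to keep them).

```javascript
// splitCsvLine is a hypothetical helper, not part of any library.
function splitCsvLine(line) {
  return line
    // Split on commas unless they are followed by non-quote characters,
    // a closing quote, and then a comma or the end of the string
    // (i.e. unless the comma sits inside a quoted field).
    .split(/,(?![^"]*"(?:,|$))/)
    // Assumption: the wrapping quotes are not wanted in the output.
    .map(field => field.replace(/^"(.*)"$/, '$1'));
}

const line = '10,Apple,"Sweet Gala apple, from Australia",,,,,,in stock,3.99,"foo, bar"';
const parsed = splitCsvLine(line);
console.log(parsed);
console.log(parsed.length); // 11
```

This inherits the limits of the regex: escaped quotes are not handled, and a quoted field that consists only of a comma (`","`) is split incorrectly, so a dedicated CSV parser is likely the safer choice for arbitrary input.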
CertainPerformance
0

All I had to do was change your regex to use * instead of + like so:

str.match(/(".*?"|[^,]*)/g)

weexpectedTHIS
  • This results in `["10", "", "Apple", "", ""Sweet Gala apple, from Australia"", "", "", "", "", "", "", "in stock", "", "3.99", ""]`, which is not the desired output - the problem is, like I said in my answer, the engine won't be picky about matching zero-length matches when it can. – CertainPerformance Aug 09 '19 at 21:31