5

I have a string like the following:

var str="A,B,C,E,'F,G,bb',H,'I9,I8',J,K"

I'd like to split the string on commas. However, when something is inside single quotation marks, I need the split to ignore the commas inside the quotes and drop the quotation marks, as follows:

 A
 B
 C
 E
 F,G,bb
 H
 I9,I8
 J
 K
ROMANIA_engineer
Augustian Joseph
  • 1
    possible duplicate of [Javascript code to parse CSV data](http://stackoverflow.com/questions/1293147/javascript-code-to-parse-csv-data) – stema May 16 '12 at 11:25

3 Answers

12
> str.match(/('[^']+'|[^,]+)/g)
["A", "B", "C", "E", "'F,G,bb'", "H", "'I9,I8'", "J", "K"]

Though this is what you requested, you may not have accounted for corner cases such as:

  • 'bob\'s' is a string where ' is escaped
  • a,',c
  • a,,b
  • a,b,
  • ,a,b
  • a,b,'
  • ',a,b
  • ',a,b,c,'

Some of the above are handled correctly by this; others are not. I highly recommend that people use a library that has thought this through, to avoid things such as security vulnerabilities or subtle bugs, now or in the future (if you expand your code, or if other people use it).
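
To make a few of these concrete, here is a small sketch that runs the same pattern against three of the inputs listed above. The variable names are made up for illustration; only the pattern comes from this answer.

```javascript
// Pattern from the answer: a quoted run, or a run of non-commas.
const fieldPattern = /('[^']+'|[^,]+)/g;

// An empty field between two commas cannot match either alternative
// (both require at least one character), so it is silently dropped.
const emptyMiddle = "a,,b".match(fieldPattern);
console.log(emptyMiddle); // ["a", "b"], not ["a", "", "b"]

// A trailing empty field is dropped for the same reason.
const emptyTail = "a,b,".match(fieldPattern);
console.log(emptyTail); // ["a", "b"]

// An unmatched quote falls through to the non-comma alternative
// and comes back as a field of its own.
const strayQuote = "a,',c".match(fieldPattern);
console.log(strayQuote); // ["a", "'", "c"]
```

Whether any of these results count as "correct" depends on what your data is supposed to mean, which is part of why a tested CSV library is the safer choice.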


Explanation of the RegEx:

  • ('[^']+'|[^,]+) - means match either '[^']+' or [^,]+
  • '[^']+' means quote...one-or-more non-quotes...quote.
  • [^,]+ means one-or-more non-commas

Note: by consuming the quoted string before the unquoted string, we make the parsing of the unquoted string case easier.
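
Note that the matched fields still carry their quotation marks, while the question asks for them without. A minimal sketch of one way to strip them follows; `splitOutsideQuotes` is a hypothetical helper name, not part of the original answer, and it inherits every corner-case limitation listed above.

```javascript
// Hypothetical helper: split on commas outside single quotes, then
// drop the surrounding quotes from fields that are fully quoted.
function splitOutsideQuotes(text) {
  // match() returns null when nothing matches, so fall back to [].
  const fields = text.match(/('[^']+'|[^,]+)/g) || [];
  return fields.map(function (field) {
    const quoted = field.length >= 2 &&
      field.charAt(0) === "'" &&
      field.charAt(field.length - 1) === "'";
    return quoted ? field.slice(1, -1) : field;
  });
}

console.log(splitOutsideQuotes("A,B,C,E,'F,G,bb',H,'I9,I8',J,K"));
// ["A", "B", "C", "E", "F,G,bb", "H", "I9,I8", "J", "K"]
```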

mplungjan
ninjagecko
  • Thank you, it works for me. The cases you mention do not affect me. – Augustian Joseph May 16 '12 at 11:51
  • @gdoron: `('[^']+'|[^,]+)` - means "match either `'[^']+'` or `[^,]+`". `'[^']+'` means "quote...one-or-more non-quotes...quote". `[^,]+` means "one-or-more non-commas". – ninjagecko May 16 '12 at 11:57
  • @gdoron: Also by consuming the quoted string before the unquoted string, we make parsing the unquoted string case easier. – ninjagecko May 16 '12 at 12:04
6

Here is my version, which works with single and double quotes and allows multiple quoted strings with embedded commas. It returns more results than there are fields, some of them empty, so you have to filter those out. Not rigorously tested. Please excuse the overuse of '\'.

var sample='this=that, \
sometext with quoted ",", \
for example, \
another \'with some, quoted text, and more\',\
last,\
but "" "," "asdf,asdf" not "fff\',\'  fff" the least';

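var it=sample.match(/([^\"\',]*((\'[^\']*\')*||(\"[^\"]*\")*))+/gm);
for (var x=0;x<it.length;x++) {
    // String.prototype.trim() stands in for jQuery's $.trim(), so the snippet runs without jQuery
    var txt=it[x].trim();
    // skip the empty matches mentioned above
    if(txt.length)
        console.log(">"+txt+'<');
}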
aflin
0

Use this

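            var input="A,B,C,E,'F,G,bb',H,'I9,I8',J,K";
            // The pattern below matches a comma only when it is followed by an even number of
            // single quotes, i.e. when it is not between ''. So 'I9,I8' stays a single string
            // and is not split on its comma. The group is non-capturing because split() would
            // otherwise insert captured text into the result.
            var pattern = /,(?=(?:[^']*'[^']*')*[^']*$)/;
            // output is an array of the fields; quoted fields keep their quotation marks
            var output = input.split(pattern);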
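
To see why the lookahead works, here is a short JavaScript sketch of the same idea. It assumes quotes are always balanced and never escaped; `separator` and `parts` are illustrative names only.

```javascript
// A comma is a separator only if the rest of the string contains an even
// number of single quotes, i.e. the comma is not inside a quoted section.
// (?: ... ) is non-capturing: split() would otherwise insert the text of
// captured groups into the result array.
const separator = /,(?=(?:[^']*'[^']*')*[^']*$)/;

const parts = "A,B,C,E,'F,G,bb',H,'I9,I8',J,K".split(separator);
console.log(parts);
// ["A", "B", "C", "E", "'F,G,bb'", "H", "'I9,I8'", "J", "K"]

// Unlike a match()-based approach, empty fields are kept.
console.log("a,,b".split(separator)); // ["a", "", "b"]
```

The quoted fields still include their quotation marks, so they need to be stripped afterwards if you want the bare text. The lookahead also rescans the rest of the string for every comma, which may matter on very long inputs.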
Akash
  • 1
    While this code snippet may solve the question, including an explanation [really helps](//meta.stackexchange.com/q/114762) to improve the quality of your post. Remember that you are answering the question for readers in the future, not just the person asking now! Please [edit] your answer to add explanation, and give an indication of what limitations and assumptions apply. – Toby Speight Nov 11 '16 at 13:32