2

I'm building a generic parser that decodes a string into an Array using a specified delimiter.

  • For this question, I'll use a comma as the delimiter.

This is my current regex:

var reg = /(\,|\r?\n|\r|^)(?:\"([^\"]*(?:\"\"[^\"]*)*)\"|([^"\,\r\n]*))/gi

It works fine for most cases like:

'a,b,c,d'.match(reg);

returns ["a", ",b", ",c", ",d"] (having the commas with the values is not a problem)

When I have empty values, it also works, for example:

'a,,c,'.match(reg);

returns ["a", ",", ",c", ","] (this is also fine)

The problem is when I have a blank value at the first position:

',b,c,d'.match(reg);

returns [",b", ",c", ",d"], but I was expecting something like: ["", ",b", ",c", ",d"]

Any ideas?

Guilherme Lopes

3 Answers

2

If you want to split by , then the regex is very simple: /,/g.

You can then pass this pattern into the split function.

It will also work with multi-character delimiters, e.g. foo.

You can then do something like this:

var pattern = /,/g;
var el = document.getElementById('out');

el.insertAdjacentHTML('beforeend', '<p>Trying with ,</p>');

output('a,b,c,d');
output(',b,c,d');
output(',,,d');
output('a,,c,');

el.insertAdjacentHTML('beforeend', '<p>Trying with foo</p>');
pattern = /foo/g; // reassign the existing variable instead of redeclaring it

output('afoobfoocfood');
output('foobfoocfood');
output('foofoofood');
output('afoofoocfoo');

function output(input) {
  var item = '<p>' + input + ' gives: ';
  var arr = input.split(pattern); 
  item += '<pre>' + JSON.stringify(arr) + '</pre></p>';
  el.insertAdjacentHTML('beforeend', item);
}
<div id="out"></div>
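The demo above hard-codes /,/g and /foo/g. For a delimiter supplied at run time, a minimal sketch could build the pattern from the string, assuming that any regex metacharacters in the delimiter should be matched literally. The helper name splitBy is hypothetical and not part of the answer above.

```javascript
// Hypothetical helper: split `input` on an arbitrary delimiter string.
function splitBy(input, delimiter) {
  // Escape regex metacharacters so delimiters such as "|" or "." are
  // treated as literal text rather than as regex syntax.
  var escaped = delimiter.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return input.split(new RegExp(escaped, 'g'));
}

console.log(JSON.stringify(splitBy(',b,c,d', ',')));        // ["","b","c","d"]
console.log(JSON.stringify(splitBy('afoofoocfoo', 'foo'))); // ["a","","c",""]
console.log(JSON.stringify(splitBy('a|b||', '|')));         // ["a","b","",""]
```

Since split also accepts a plain string separator, input.split(delimiter) gives the same result for these inputs. Neither form treats quoted values specially.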
Robin Mackenzie
1

How about something simpler like this regex:

[^\,]*\,(?!$)|[^\,]|\,

The regex above will catch anything between commas, including special characters. You can build on it to make it match specific types of characters.

Here is a working JS example:

var reg = /[^\,]*\,(?!$)|[^\,]|\,/gi;
var s = ',,b,c,d'.match(reg);  
document.write(s[0], '<br>' , s[1] , '<br>' , s[2] , '<br>' , s[3], '<br>' , s[4]);
Ibrahim
  • This is a complicated scenario, for example, if I try `',,c,d'` with this regex it will only show one of the empty fields. :/ – Guilherme Lopes Dec 10 '16 at 08:38
  • Thanks for the help Ibrahim, but that had problems when I changed the delimiter to something else and added quotes to the string. That's why the initial RegEx was complex. I ended up using the solution from here: http://stackoverflow.com/questions/1293147/javascript-code-to-parse-csv-data – Guilherme Lopes Dec 12 '16 at 20:30
1

Thanks to everyone who posted an answer, but I ended up going with the solution provided here:

Javascript code to parse CSV data

That solution also had the problem with an empty value at the first position, but handling it in JS inside the while loop was easier than fixing the regex.
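A minimal sketch of that kind of workaround, assuming the regex from the question with a comma delimiter. This is not the code from the linked answer, and parseLine is a hypothetical name. The idea: when the input starts with a comma, the `,` alternative in group 1 consumes it before `^` is tried, so the empty first field never gets its own match and has to be added by hand.

```javascript
// Regex from the question (redundant escapes and the i flag dropped).
var reg = /(,|\r?\n|\r|^)(?:"([^"]*(?:""[^"]*)*)"|([^",\r\n]*))/g;

// Hypothetical helper: returns a flat array of values. Line breaks are
// treated like delimiters here; rows are not separated.
function parseLine(input) {
  var values = [];
  var m;
  reg.lastIndex = 0; // the regex is global, so reset its state first

  // Workaround: a leading comma means the first field is empty.
  if (input.charAt(0) === ',') {
    values.push('');
  }

  while ((m = reg.exec(input)) !== null) {
    // exec does not advance past a zero-length match on its own.
    if (m[0].length === 0) {
      reg.lastIndex++;
    }
    // Group 2 is a quoted value (unescape doubled quotes);
    // group 3 is an unquoted value.
    values.push(m[2] !== undefined ? m[2].replace(/""/g, '"') : m[3]);
  }
  return values;
}

console.log(JSON.stringify(parseLine(',b,c,d'))); // ["","b","c","d"]
console.log(JSON.stringify(parseLine('a,,c,')));  // ["a","","c",""]
```

Using exec in a loop instead of match also exposes the capture groups, so the delimiter no longer ends up inside the returned values.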

Community
Guilherme Lopes