
I have a CSV line and I need to count the number of columns in the line.

Some of the column values contain commas (in that case the value is surrounded by quotation marks).

I need a regex that matches only the commas that are not enclosed in quotation marks.

For example:

a,b,c

will match 2 commas

and the line:

a,"b,c",d,"e,f" 

will match 3 commas

Thanks,

Nadav.

Golash

1 Answer


I doubt that a complex regular expression would be better than a simple loop:

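// Counts the separators that are not inside a quoted value
private static int CountCommas(String source, Char separator = ',') {
  int result = 0;
  // True while the loop is inside a quoted value
  Boolean inQuotation = false;

  foreach (Char c in source)
    if (c == '"')
      inQuotation = !inQuotation;   // entering or leaving a quoted value
    else if ((c == separator) && !inQuotation)
      result += 1;                  // separator outside quotes: count it

  return result;
}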

Test

  // 3
  Console.Write(CountCommas("a, \"b,c\", d, \"e,f\""));
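For comparison with the loop, the kind of regex the question asks for might look like the sketch below. The pattern is a commonly used lookahead, not something taken from this answer: a comma is matched only when an even number of quotation marks follows it up to the end of the line. It assumes the quotes in the line are balanced, and because every comma triggers a scan to the end of the line, the cost grows roughly quadratically on long lines. The names RegexCommaCounter and CountColumns are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class RegexCommaCounter
{
    // Matches a comma only if an even number of quotation marks follows it
    // up to the end of the line, i.e. the comma is outside any quoted value.
    // Assumption: quotation marks in the input are balanced.
    private static readonly Regex UnquotedComma =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    // Number of commas that are not inside a quoted value.
    public static int CountCommas(string line)
    {
        return UnquotedComma.Matches(line).Count;
    }

    // The column count is the number of separating commas plus one.
    public static int CountColumns(string line)
    {
        return CountCommas(line) + 1;
    }
}
```

With the two lines from the question, RegexCommaCounter.CountCommas("a,b,c") gives 2 and RegexCommaCounter.CountCommas("a,\"b,c\",d,\"e,f\"") gives 3, so the second line has 4 columns.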
Dmitry Bychenko
  • Dmitry, thanks for your answer. You are probably right; currently I have a simple loop in my code. I want to see what people can offer (also as part of a learning process) – Golash Feb 29 '16 at 08:09
  • @NadavWolfin: You do not want a regex for this. There is a direct duplicate of your question that lists some regexps that work with some strings, but once you have longer strings, your app might freeze, or will be very slow. Use a CSV parser, or Dmitry's suggestion. – Wiktor Stribiżew Feb 29 '16 at 08:17
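The CSV parser route mentioned in the comment above could be sketched with TextFieldParser from the Microsoft.VisualBasic.FileIO namespace, which ships with .NET. This is only one possible parser, chosen here as an assumption; on .NET Framework it needs a reference to the Microsoft.VisualBasic assembly. The class name CsvColumnCounter and the method CountColumns are hypothetical.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvColumnCounter
{
    // Parses a single CSV line and returns the number of fields in it.
    public static int CountColumns(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat commas inside quotation marks as part of the value.
            parser.HasFieldsEnclosedInQuotes = true;

            // ReadFields returns null when there is nothing left to read.
            string[] fields = parser.ReadFields();
            return fields == null ? 0 : fields.Length;
        }
    }
}
```

Here the parser returns the fields themselves, so the column count is read directly from the array length rather than derived from the number of commas.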