Yes, I know regex is not the "right" answer, but it is what the question asked for and I like a good regex challenge.
NOTE: Though the solution below can likely be adapted for other regex engines, using it as-is will require that your regex engine treats multiple named capture groups using the same name as one single capture group. (.NET does this by default)
When multiple lines/records of a CSV file/stream (matching RFC standard 4180) are passed to the regular expression below it will return a match for each non-empty line/record. Each match will contain a capture group named Value
that contains the captured values in that line/record (and potentially an OpenValue
capture group if there was an open quote at the end of the line/record).
Here's the commented pattern (test it on Regexstorm.net):
(?<=\r|\n|^)(?!\r|\n|$) // Records start at the beginning of line (line must not be empty)
(?: // Group for each value and a following comma or end of line (EOL) - required for quantifier (+?)
(?: // Group for matching one of the value formats before a comma or EOL
"(?<Value>(?:[^"]|"")*)"| // Quoted value -or-
(?<Value>(?!")[^,\r\n]+)| // Unquoted value -or-
"(?<OpenValue>(?:[^"]|"")*)(?=\r|\n|$)| // Open ended quoted value -or-
(?<Value>) // Empty value before comma (before EOL is excluded by "+?" quantifier later)
)
(?:,|(?=\r|\n|$)) // The value format matched must be followed by a comma or EOL
)+? // Quantifier to match one or more values (non-greedy/as few as possible to prevent infinite empty values)
(?:(?<=,)(?<Value>))? // If the group of values above ended in a comma then add an empty value to the group of matched values
(?:\r\n|\r|\n|$) // Records end at EOL
Here's the raw pattern without all the comments or whitespace.
(?<=\r|\n|^)(?!\r|\n|$)(?:(?:"(?<Value>(?:[^"]|"")*)"|(?<Value>(?!")[^,\r\n]+)|"(?<OpenValue>(?:[^"]|"")*)(?=\r|\n|$)|(?<Value>))(?:,|(?=\r|\n|$)))+?(?:(?<=,)(?<Value>))?(?:\r\n|\r|\n|$)
[Here is a visualization from Debuggex.com][3] (capture groups named for clarity):
![Debuggex.com visualization][4]
Examples on how to use the regex pattern can be found on my answer to a similar question here, or on C# pad here, or here.