
Assume you have a comma-delimited CSV file whose fields are enclosed in quotes, and one of the quoted fields contains a comma. How do you count the number of fields in that record?

Here is the example:

$ echo '"abc","123",,"abc,123"' | awk -F ',' '{print NF}'

5

The result is 5, but it should be 4, because awk also splits on the comma inside the last quoted field. Using the example above, how do I modify the command so that it counts the fields correctly?

sietse85
Gil
  • While this is possible to solve as stated, it is also a massive headache and embeds a maintenance nightmare into your project. Why not just use a field separator value that is not embedded in the data? The `|` is often a good choice as it is visible and generally not in user input. Also, this Q has been asked before here (and answered, I'm almost sure), so if you really need that solution, try searching around. Good luck. – shellter Oct 13 '18 at 21:22
  • CSV is in that class of formats (with XML, JSON, YAML) that are, as you have discovered, surprisingly tricky to parse -- just wait until you encounter quotes embedded in quoted fields. Whenever you need to parse CSV, use a CSV parser: [csvkit](https://csvkit.readthedocs.io/en/1.0.3/) is a suite of CSV tools. Ruby and Python ship with CSV libraries. Perl's Text::CSV is very good at handling the edgiest edge cases, including newlines embedded in quoted fields. – glenn jackman Oct 13 '18 at 22:04
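
As a rough illustration of the "use a CSV parser" advice, here is a minimal sketch using Python's standard `csv` module rather than awk. The helper name `count_fields` is made up for this example, and it assumes the record fits on one line and uses ordinary double-quote quoting with straight quotes.

```python
import csv
import io


def count_fields(line: str) -> int:
    """Count the fields in one CSV record (hypothetical helper).

    Assumes the whole record is on a single line and that fields are
    quoted with straight double quotes, as in the question's example.
    """
    # csv.reader understands quoting, so a comma inside "..." does not
    # split the field the way awk -F ',' does.
    reader = csv.reader(io.StringIO(line))
    row = next(reader, [])  # an empty input yields no row at all
    return len(row)


record = '"abc","123",,"abc,123"'
print(count_fields(record))  # prints 4
```

The empty third field still counts as a field, and the quoted comma in the last field is kept as data, which is why the count is 4 instead of 5.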
