I have almost no experience using awk. I would like to output all records in a csv that have a certain number of columns, but I need to define a column not by delimiter, but by pattern. The command:
"c:\Program Files\Git\usr\bin\awk" -F, 'NF==13' TestInput2.csv
works: it returns all records in the CSV with exactly 13 columns. However, this only works if no column value contains a comma. It fails for quoted columns, where "a,b","c" should count as two columns, not three.
I know there is FPAT, which defines a field by pattern, like FPAT='([^,]+)|("[^"]+")'
I need to combine FPAT and NF so they work like -F and NF above, but I cannot figure out how to write a one-line awk command that does this. It would be something like:
"c:\Program Files\Git\usr\bin\awk" FPAT='([^,]+)|("[^"]+")' 'NF==13' TestInput2.csv
However, that does not work.
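One likely reason the command above fails: without -v, the first non-option argument (here the FPAT=... string) is taken as the awk program itself, and 'NF==13' is then treated as an input file name. Below is a minimal sketch of the -v form, assuming the awk in question is GNU awk (FPAT is a gawk extension) and a POSIX shell such as Git Bash. The file name sample.csv and its three rows are made up for illustration, and the count is 5 rather than 13 to match them.

```shell
# Hypothetical sample data, written here so the example is self-contained.
cat > sample.csv <<'EOF'
"aa","bb","cc,dd","ee","ff,gg"
"aa","bb","cc","dd","ee"
"aa","bb","cc","dd"
EOF

# -v assigns FPAT before the program runs, so 'NF==5' remains the program text.
# FPAT is gawk-only; other awks treat it as an ordinary variable and split on whitespace.
awk -v FPAT='([^,]+)|("[^"]+")' 'NF==5' sample.csv
```

With gawk this should print the first two rows only. Under cmd.exe the ^, | and " characters are treated specially, so the quoting would need to change there. Note also that this pattern does not count empty fields such as the middle one in a,,b.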
Per request in a comment:
Input
"aa","bb","cc,dd","ee","ff,gg""
"aa","bb","cc","dd","ee"
"aa","bb","cc","dd"
Desired output:
"aa","bb","cc,dd","ee","ff,gg""
"aa","bb","cc","dd","ee"
The output is the first two rows because they both have 5 columns. The third row is not returned because it only has 4 columns.
If I had rows with 1, 2, 3, 4, 6, 7, ... columns, they would also not be returned. Notice that commas inside quotes are not counted as column separators, which is why row 1 is considered to have five columns.
I hope that helps describe the problem more accurately.
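The counting rule described above can also be sketched without FPAT, for awks that lack it. This is an illustration under stated assumptions, not a full CSV parser: it assumes one record per line and quoted values with no embedded newlines. The file name sample.csv and the variable names want, line and n are hypothetical.

```shell
# Hypothetical sample data modeled on the rows above.
cat > sample.csv <<'EOF'
"aa","bb","cc,dd","ee","ff,gg"
"aa","bb","cc","dd","ee"
"aa","bb","cc","dd"
EOF

awk -v want=5 '{
    line = $0
    gsub(/"[^"]*"/, "", line)      # drop quoted values, including any commas inside them
    n = gsub(/,/, ",", line) + 1   # gsub returns how many commas remain; columns = commas + 1
    if (n == want) print           # $0 was never modified, so the original record is printed
}' sample.csv
```

Only the commas outside quotes are counted, so the first row counts as 5 columns and the third as 4. Changing want to 13 would give the original filter.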