The code provided reads a CSV file and prints the count of every string it finds, in descending order. However, I would like to know how to specify which fields are included in the count. For example,
./example-awk.awk 1,2 file.csv
would read strings from fields 1 and 2 only and print their counts.
#!/bin/awk -f
BEGIN {
    FIELDS = ARGV[1];
    delete ARGV[1];
    FS = ", *"
}
{
    for(i = 1; i <= NF; i++)
        if(FNR != 1)
            data[++data_index] = $i
}
END {
    produce_numbers(data)
    PROCINFO["sorted_in"] = "@val_num_desc"
    for(i in freq)
        printf "%s\t%d\n", i, freq[i]
}
function produce_numbers(sortedarray)
{
    n = asort(sortedarray)
    for(i = 1 ; i <= n; i++)
    {
        freq[sortedarray[i]]++
    }
    return
}
This is the code I am currently working with. ARGV[1] will be the list of specified fields. I am unsure how to store this value and then use it to select the fields to count.
For example, running
./example-awk.awk 1,2 simple.csv
with simple.csv containing
A,B,C,A
B,D,C,A
C,D,A,B
D,C,A,A
Should result in
D 3
C 2
B 2
A 1
Because it only counts the strings in fields 1 and 2.
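One possible way to store and use the field list, offered as a minimal sketch rather than a definitive answer: split ARGV[1] on commas into an array in BEGIN, then loop over that array instead of over every field. The names wanted, nwanted, k, f and s are made up for illustration. The sketch assumes awk is gawk (PROCINFO["sorted_in"] is a gawk extension) and, unlike the script above, it does not skip the first line, since the expected output above only adds up if all four lines of simple.csv are counted.

```awk
#!/bin/awk -f
# Sketch: count only the strings found in the fields named by the first
# argument, e.g.  ./example-awk.awk 1,2 simple.csv
BEGIN {
    # ARGV[1] holds the comma-separated field list, e.g. "1,2".
    # split() stores the pieces in wanted[1] .. wanted[nwanted].
    nwanted = split(ARGV[1], wanted, ",")
    # Remove the list from ARGV so awk does not try to open "1,2" as a file.
    delete ARGV[1]
    FS = ", *"
}
{
    # Assumption: the input has no header row, so no line is skipped.
    for (k = 1; k <= nwanted; k++) {
        f = wanted[k] + 0          # force a numeric field index
        if (f >= 1 && f <= NF)     # ignore fields this line does not have
            freq[$f]++
    }
}
END {
    # gawk only: traverse freq by value, highest count first.
    PROCINFO["sorted_in"] = "@val_num_desc"
    for (s in freq)
        printf "%s\t%d\n", s, freq[s]
}
```

Counting directly into freq makes the intermediate data array and the asort() pass unnecessary, because the final ordering comes from the sorted traversal in END. If the real file does have a header row, the FNR != 1 test from the original script can be put back around the loop.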