I'm trying to write a loop in Bash that prints the sum of every column in a file. These columns are separated by tabs. What I have so far is this:

cols() {
  count=$(grep -c $'\t' $1)
  for n in $(seq 1 $count) ;do
    cat $FILE | awk '{sum+=$1} END{print "sum=",sum}'
  done
}

But this only prints out the sum of the first column. How can I do this for every column?

Jonathan Leffler
123
  • As an aside: Pasting your code at [shellcheck.net](http://shellcheck.net) will give you tips on improving your code. – mklement0 Jan 19 '17 at 22:30
  • Possible duplicate of [how to sum each column in a file using bash](http://stackoverflow.com/questions/14956264/how-to-sum-each-column-in-a-file-using-bash) – Benjamin W. Jan 19 '17 at 22:31

2 Answers

Your approach is more work than needed: you count the columns, then cat the file and call awk once per column, while awk alone can do all of it in a single pass:

awk -F'\t' '{for (i=1; i<=NF; i++) sum[i]+=$i; if (NF>n) n=NF} END {for (i=1; i<=n; i++) print i, sum[i]}' file

This takes advantage of NF, which holds the number of fields in the current line. (Your count=$(grep -c $'\t' $1) was meant to get this, but grep -c counts the lines that contain a tab, not the number of columns.) The loop then adds each field to the matching element of the array, so sum[i] holds the running total for column i. Finally, the END block loops through the array and prints each total.
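
To make this concrete, here is a minimal sketch that runs the same idea on a small made-up file. The helper name sum_columns, the temporary file, and its contents are assumptions for illustration only. This variant tracks the widest row so that the totals are printed in column order.

```shell
# Hypothetical sample data: two rows, three tab-separated columns.
sample=$(mktemp)
printf '1\t2\t3\n4\t5\t6\n' > "$sample"

# sum_columns FILE: illustrative helper, not part of the original answer.
sum_columns() {
  awk -F'\t' '
    {
      for (i = 1; i <= NF; i++) sum[i] += $i   # add each field to its column total
      if (NF > n) n = NF                       # remember the widest row seen so far
    }
    END { for (i = 1; i <= n; i++) print i, sum[i] }
  ' "$1"
}

sum_columns "$sample"
# 1 5
# 2 7
# 3 9
```

Each output line is the column number followed by that column's total.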

Why isn't your approach summing a given column? Because when you say:

for n in $(seq 1 $count) ;do
    cat $FILE | awk '{sum+=$1} END{print "sum=",sum}'
done

You are always using $1 as the element to sum. Instead, you should pass the value $n to awk by using something like:

awk -v col="$n" '{sum+=$col} END{print "sum=",sum}' "$FILE" # no need to cat $FILE
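
Putting the pieces together, a corrected version of the loop might look like the sketch below. It assumes the file is non-empty and that every row has as many columns as the first one. It also takes the column count from NF on the first line instead of grep -c, and reads the file name from the function argument rather than $FILE. The temporary file and its contents are made up for the demonstration.

```shell
# cols FILE: print the sum of every tab-separated column, one awk call per column.
cols() {
  local file=$1 count n
  # Column count taken from the first line (assumes all rows have the same width).
  count=$(awk -F'\t' 'NR == 1 { print NF; exit }' "$file")
  for n in $(seq 1 "$count"); do
    # Pass the shell variable n into awk as col, then sum field number col.
    awk -F'\t' -v col="$n" '{ sum += $col } END { print "sum=", sum }' "$file"
  done
}

# Hypothetical sample data: two rows, three tab-separated columns.
data=$(mktemp)
printf '1\t2\t3\n4\t5\t6\n' > "$data"
cols "$data"
# sum= 5
# sum= 7
# sum= 9
```

This still reads the file once per column, so the single-pass awk command remains the cheaper option for large files.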
fedorqui

If you want a solution that uses only Bash builtins, this would work:

declare -i i l
declare -ai la sa=()
while IFS=$'\t' read -ra la; do
    for ((l=${#la[@]}, i=0; i<l; sa[i]+=la[i], ++i)); do :; done
done < file
(IFS=$'\t'; echo "${sa[*]}")

This should perform decently, though noticeably slower than something like awk. Note that Bash arithmetic is integer-only, so it works only when every field is an integer.

Jeffrey Cash