2

Please help me on this question.
How to perform this below operation (arithmetic calculations) using Shell Scripting.

sample1.csv

Assets,3,4,5
Loans,4,5,6
reported_activity,7,8,10

sample2.csv

credit,6,7,9
debit,13,5,8
affilate,9,4,5

take all first column values from sample1.csv

3
4
7

& take first column values from sample2.csv file

6
13
9

then calculate using formula :

Assets + ( loans + affiliate *-1) + reported_activity + (credit-debit)
                         

     3 +( 4*-1)+7 (6-13)
                              =3-4+7-7
                              = -1
                      

similarly, need to perform arithmetic calculations for col2 & col3 of both the .csv files.

Simon Doppler
  • 1,918
  • 8
  • 26
blackhole
  • 214
  • 1
  • 11
  • Your first column values look more like the second column... – Shawn Jun 15 '21 at 09:44
  • Anyways, start with reading http://mywiki.wooledge.org/BashFAQ/001 and http://mywiki.wooledge.org/ArithmeticExpression – Shawn Jun 15 '21 at 09:46

2 Answers2

2

You can transpose the csv files and process them with awk. You can modify the formula in the print statement accordingly.

paste -d" " <(rs -c',' -C -T < sample1) <(rs -c',' -C -T < sample2) | awk 'NR>1 {print $1+$2-$6+$3+$4-$5}'

Alternate approach without using paste command

cat sample1 sample2 | rs -c, -C -T | awk 'NR>1 {print $1+$2-$6+$3+$4-$5}'

Paste command will join both the files and you'll have a final file with all the columns.


Since rs is available only in BSD systems, below snippet uses awk to transpose the input. Borrowing the transpose code from this SO answer

BEGIN { FS=OFS="," }
{
    for (rowNr=1;rowNr<=NF;rowNr++) {
        cell[rowNr,NR] = $rowNr
    }
    maxRows = (NF > maxRows ? NF : maxRows)
    maxCols = NR
}
END {
    for (rowNr=1;rowNr<=maxRows;rowNr++) {
        for (colNr=1;colNr<=maxCols;colNr++) {
            printf "%s%s", cell[rowNr,colNr], (colNr < maxCols ? OFS : ORS)
        }
    }
}

Save the above file as transpose.awk and run as below

awk -f transpose.awk sample1.csv sample2.csv | awk -F, 'NR>1 {print $1+$2-$6+$3+$4-$5}'
BarathVutukuri
  • 1,265
  • 11
  • 23
1

Adding to what Shawn said, I think your calculation is wrong.

Substituting the second column values in this part of the formula (loans + affiliate *-1) we have (4 + 9 * -1) which is -5 (or, if you don't care about the order of operations, it could be -13, but never -4

That said, this code does the trick:

cat sample*.csv > samples.csv

IFS=',' read -r -a Assets <<< "$(grep Assets samples.csv)"
IFS=',' read -r -a Loans <<< "$(grep Loans samples.csv)"
IFS=',' read -r -a reported_activity <<< "$(grep reported_activity samples.csv)"
IFS=',' read -r -a credit <<< "$(grep credit samples.csv)"
IFS=',' read -r -a debit <<< "$(grep debit samples.csv)"
IFS=',' read -r -a affilate <<< "$(grep affilate samples.csv)"

for ((i = 1 ; i < ${#Assets[@]} ; i++)); do
        echo -n "Column $i: "
        echo "$(( ${Assets[i]} + ${Loans[i]} - ${affilate[i]} + ${reported_activity[i]} + ${credit[i]} - ${debit[i]} ))"
done

rm -f samples.csv
$ bash samples.sh
Column 1: -2
Column 2: 15
Column 3: 17
  • the sample1 and sample2 csv files contain floating numbers in real scenario While executing the above script, I am getting error : syntax error: invalid arithmetic operator (error token is ".387) – blackhole Jun 15 '21 at 16:28
  • 1
    @blackhole bash doesn't do floating point math, just integer. See http://mywiki.wooledge.org/BashFAQ/022 – Shawn Jun 15 '21 at 18:40