I was reading this extremely useful question: How to use shell variables in an awk script
But I can't get it working as I want to in my script. A Part of the core of the script is the following:
#!/bin/bash
# VARS #
XLSX=$1
SHEET=$2
P_PATH="/tmp/test"
EXTRACTCSV="$P_PATH/extract.csv"
TEMPCSV="$P_PATH/out_column.csv"
COLS=4
T_COLS=$(($COLS +1))
COUNTER=2010
# START #
[ -d $P_PATH ] || mkdir $P_PATH
rm -rf $P_PATH/*
ssconvert --export-type=Gnumeric_stf:stf_assistant -O 'sheet='$SHEET' separator=; format=automatic eol=unix' ${XLSX} ${EXTRACTCSV} >/dev/null 2>&1
until [ ${COUNTER} -eq 2018 ]; do
echo "year: $COUNTER"
col1=$((${COLS} - 3))
col2=$((${COLS} - 2))
col3=$((${COLS} - 1))
col4=${COLS}
echo "col1: $col1 - col2: $col2 - col3: $col3 - col4: $col4"
echo "tail -n +3 ${EXTRACTCSV} | awk -F';' -v var1="$col1" -v var2="$col2" -v var3="$col3" -v var4="$col4" 'BEGIN {print $var1";"$var2";"$var3";"$var4}' | sed 's/\./\,/g'" #> ${TEMPCSV}"."${COUNTER} # Skip first 2 rows and working on the current column only
COLS=$(($COLS + $T_COLS))
let COUNTER+=1
done
Part of EXTRACTCSV:
"DAY START";"DATA1";"PERCENTAGE";"DATA2";;"DAY START";"DATA1";"PERCENTAGE";"DATA2";;"DAY START";"DATA1";"PERCENTAGE";"DATA2";;"DAY START";"DATA1";"PERCENTAGE";"DATA2";;"DAY START";"DATA1";"PERCENTAGE";"DATA2";;"DAY START";"DATA1";"PERCENTAGE";"DATA2";;"DAY START";"DATA1";"PERCENTAGE";"DATA2";;"DAY START";"DATA1";"PERCENTAGE";"DATA2"
2010/01/01;76.1838;0.9497;73.4787;;2011-01-01;400.6334;0.7429;539.2905;;2012/01/01;478.4495;0.8154;586.7632;;2013-01-01;514.3688;0.7406;116.4914;;2014-01-01;602.3154;0.7031;856.6758;;2015-01-01;709.1598;0.7457;950.9946;;2016-01-01;696.2204;0.7006;993.735;;2017/01/01;666.3247;0.6373;1045.5864
2010/01/02;75.9292;0.9465;73.4787;;2011-01-02;398.9055;0.7397;539.2903;;2012-01-02;478.3616;0.8153;586.7631;;2013-01-02;511.6213;0.7367;116.4996;;2014-01-02;601.0048;0.7016;856.6742;;2015-01-02;704.6024;0.7409;950.9953;;2016-01-02;692.7325;0.6971;993.8134;;2017/01/02;659.8746;0.6311;1045.5821
2010/01/03;75.5417;0.9417;73.4787;;2011-01-03;396.1734;0.7346;539.29;;2012-01-03;476.7519;0.8125;586.7638;;2013-01-03;508.9712;0.7329;116.4981;;2014-01-03;599.4631;0.6998;856.6302;;2015-01-03;700.7211;0.7368;950.9943;;2016-01-03;688.9282;0.6932;993.9072;;2017/01/03;652.2328;0.6238;1045.5821
2010/01/04;74.9889;0.9348;73.4787;;2011-01-04;392.8654;0.7285;539.2906;;2012-01-04;475.0166;0.7396;586.7636;;2013-01-04;506.9215;0.7299;116.4955;;2014-01-04;598.3154;0.6985;856.5889;;2015-01-04;696.8216;0.7327;950.9946;;2016-01-04;683.3585;0.6875;993.9379;;2017/01/04;644.9272;0.6168;1045.6002
2010/01/05;74.3354;0.9267;73.4787;;2011-01-05;389.7434;0.7227;539.2906;;2012-01-05;473.383;0.7368;586.7636;;2013-01-05;505.1373;0.7273;116.4953;;2014-01-05;597.1538;0.6971;856.6002;;2015-01-05;691.2058;0.7268;950.9937;;2016-01-05;677.7324;0.6819;993.9349;;2017/01/05;636.1736;0.6084;1045.6002
2010/01/06;73.5865;0.9173;73.4787;;2011-01-06;388.1988;0.7177;540.9071;;2012-01-06;471.874;0.7342;586.7635;;2013-01-06;503.1365;0.7245;116.4973;;2014-01-06;595.5167;0.6952;856.57;;2015-01-06;685.5321;0.7209;950.9938;;2016-01-06;672.7798;0.6769;993.9423;;2017/01/06;627.0485;0.5997;1045.6002
2010/01/07;72.7931;0.9075;73.4787;;2011-01-07;386.287;0.7141;540.9069;;2012-01-07;470.588;0.732;586.7626;;2013-01-07;500.267;0.7202;116.5772;;2014-01-07;593.5777;0.6929;856.6254;;2015-01-07;679.5659;0.7146;950.9971;;2016-01-07;667.4732;0.6716;993.9342;;2017/01/07;619.7181;0.5927;1045.6002
2010/01/08;71.9871;0.8974;73.4787;;2011-01-08;385.0015;0.7118;540.9071;;2012-01-08;469.3661;0.7999;586.7626;;2013-01-08;497.3401;0.716;116.5887;;2014-01-08;591.4693;0.6904;856.6439;;2015-01-08;674.2712;0.709;950.9948;;2016-01-08;662.5751;0.6666;993.9427;;2017/01/08;612.5235;0.5858;1045.6002
With this command:
tail -n +3 ${EXTRACTCSV} | awk -F';' -v var1="$col1" -v var2="$col2" -v var3="$col3" -v var4="$col4" 'BEGIN {print $var1";"$var2";"$var3";"$var4}' | sed 's/\./\,/g'" #> ${TEMPCSV}"."${COUNTER}
I wish to extract the first 4 columns from EXTRACTCSV
file, for 8 times.
First time for the year 2010, the second for the 2011 and so on until today.
Expected output
:
year 2010 cicle #1
col1 col2 col3 col4
2010/01/01;76.1838;0.9497;73.4787
2010/01/02;75.9292;0.9465;73.4787
2010/01/03;75.5417;0.9417;73.4787
2010/01/04;74.9889;0.9348;73.4787
2010/01/05;74.3354;0.9267;73.4787
2010/01/06;73.5865;0.9173;73.4787
2010/01/07;72.7931;0.9075;73.4787
2010/01/08;71.9871;0.8974;73.4787
year 2011 cicle #2
col6 col7 col8 col9
2011-01-01;400.6334;0.7429;539.2905
2011-01-02;398.9055;0.7397;539.2903
2011-01-03;396.1734;0.7346;539.29
2011-01-04;392.8654;0.7285;539.2906
2011-01-05;389.7434;0.7227;539.2906
2011-01-06;388.1988;0.7177;540.9071
2011-01-07;386.287;0.7141;540.9069
2011-01-08;385.0015;0.7118;540.9071
...And so on for the other years. Can someone tell me what is wrong with how I am using awk?
EDIT: Update for Inian
user@local:~/excel$ ./test2.sh in.xlsx YR_2010_2017
year: 2010
col1: 1 - col2: 2 - col3: 3 - col4: 4
tail -n +3 /tmp/test/extract.csv | awk -v var1=1 -v var2=2 -v var3=3 -v var4=4 'BEGIN{FS=OFS=
./test2.sh: line 28: }{print ,,,}' | sed 's/\./\,/g': No such file or directory
year: 2011
col1: 6 - col2: 7 - col3: 8 - col4: 9
tail -n +3 /tmp/test/extract.csv | awk -v var1=6 -v var2=7 -v var3=8 -v var4=9 'BEGIN{FS=OFS=
./test2.sh: line 28: }{print ,,,}' | sed 's/\./\,/g': No such file or directory
year: 2012
The other rows are the same..