I have two CSV files, with ; (semicolon) as a separator, that I need to merge based on three columns on each file using AWK. The key columns are not consecutive. Idea is to get two columns from file B and print them after all the other columns from file A.
File A (keys are in A1, A3, and A5):
A1;A2;A3;A4;A5
K1;D1;K2;D2;K3
K4;D3;K5;D4;K6
K7;D5;K8;D6;K9
K1;D7;K2;D8;K3
File B (keys in B1, B2, B4):
B1;B2;B3;B4;B5
K1;K2;D9;K3;D0
K4;K5;DA;K6;DB
KA;KB;DC;KC;DD
Would produce:
A1;A2;A3;A4;A5;;
K1;D1;K2;D2;K3;D9;D0
K4;D3;K5;D4;K6;DA;DB
K7;D5;K8;D6;K9;;
K1;D7;K2;D8;K3;D9;D0
I have found several examples here in SO (for example How to merge two files based on the first three columns using awk and How to merge two files using AWK?) and elsewhere but I haven't been able to convert them to my needs, as they haven't been documented so well that an AWK n00b like myself would really understand how they work.
Closest I've gotten is:
awk -F \; -v OFS=\; 'FNR==NR{c[$1]=$3 FS $5;next}{ print $0, c[$1]}' B A
But it still leaves out one semicolon--or a column--from output lines 1 and 4:
A1;A2;A3;A4;A5;
K1;D1;K2;D2;K3;D9;D0
K4;D3;K5;D4;K6;DA;DB
K7;D5;K8;D6;K9;
K1;D7;K2;D8;K3;D9;D0
An how do I state which columns I want to use for comparing? Apparently now it's only using first column for comparing.