Objective
I am trying to fill out $9(booking ref), $10 (client) in file1.csv with information pulled from $2 (booking ref) and $3 (client) of file2.csv using "CAMPAIGN ID" ($5 in file1.csv and $1 in file2.csv). So where I have a match between the two files based on "CAMPAIGN ID" I want to print the columns of file2.csv into the matching rows of file1.csv
File1.csv
INVOICE,CLIENT,PLATFORM,CAMPAIGN NAME,CAMPAIGN ID,IMPS,TFS,PRICE,Booking Ref,client
BOB-UK,clientname1,platform_1,campaign1,20572431,5383594,0.05,2692.18,,
BOB-UK,clientname2,platform_1,campaign2,20589101,4932821,0.05,2463.641,,
BOB-UK,clientname1,platform_1,campaign3,23030494,4795549,0.05,2394.777,,
BOB-UK,clientname1,platform_1,campaign4,22973424,5844194,0.05,2925.21,,
BOB-UK,clientname1,platform_1,campaign5,21489000,4251031,0.05,2122.552,,
BOB-UK,clientname1,platform_1,campaign6,23150347,3123945,0.05,1561.197,,
BOB-UK,clientname3,platform_1,campaign7,23194965,2503875,0.05,1254.194,,
BOB-UK,clientname3,platform_1,campaign8,20578983,1522448,0.05,765.1224,,
BOB-UK,clientname3,platform_1,campaign9,22243554,920166,0.05,463.0083,,
BOB-UK,clientname1,platform_1,campaign10,20572149,118865,0.05,52.94325,,
BOB-UK,clientname2,platform_1,campaign11,23077785,28077,0.05,14.40385,,
BOB-UK,clientname2,platform_1,campaign12,21811100,5439,0.05,5.27195,,
File2.csv
CAMPAIGN ID,Booking Ref,client
20572431,ref1,1
21489000,ref2,1
23030494,ref3,1
22973424,ref4,1
23150347,ref5,1
20572149,ref6,1
20578983,ref7,2
22243554,ref8,2
20589101,ref9,3
23077785,ref10,3
21811100,ref11,3
23194965,ref12,3
Desired Output
INVOICE,CLIENT,PLATFORM,CAMPAIGN NAME,CAMPAIGN ID,IMPS,TFS,PRICE,Booking Ref,client
BOB-UK,clientname1,platform_1,campaign1,20572431,5383594,0.05,2692.18,ref1,1
BOB-UK,clientname2,platform_1,campaign2,20589101,4932821,0.05,2463.641,ref9,3
BOB-UK,clientname1,platform_1,campaign3,23030494,4795549,0.05,2394.777,ref3,1
BOB-UK,clientname1,platform_1,campaign4,22973424,5844194,0.05,2925.21,ref4,1
BOB-UK,clientname1,platform_1,campaign5,21489000,4251031,0.05,2122.552,ref2,1
BOB-UK,clientname1,platform_1,campaign6,23150347,3123945,0.05,1561.197,ref5,1
BOB-UK,clientname3,platform_1,campaign7,23194965,2503875,0.05,1254.194,ref12,3
BOB-UK,clientname3,platform_1,campaign8,20578983,1522448,0.05,765.1224,ref7,2
BOB-UK,clientname3,platform_1,campaign9,22243554,920166,0.05,463.0083,ref8,2
BOB-UK,clientname1,platform_1,campaign10,20572149,118865,0.05,52.94325,ref6,1
BOB-UK,clientname2,platform_1,campaign11,23077785,28077,0.05,14.40385,ref10,3
BOB-UK,clientname2,platform_1,campaign12,21811100,5439,0.05,5.27195,ref11,3
What I've tried
From the research I've done on line this appears to be possible using awk
and join
(How to merge two files using AWK? to get me the closest out of what I found online).
I've tried various awk
codes I've found online and I can't seem to get it to achieve my goal. below is the code I've been trying to massage and work that gets me the closes. At current the code is set up to try and populate just the booking ref
as I presume I can just rinse repeat it for the client
column. With this code I was able to get it to populate the booking ref
but it required me to move CAMPAIGN ID
to $1 and all it did was replace the values.
NOTE: The order for file1.csv won't sync with file2.csv. All rows may be in a different order as shown in this example.
current code
awk -F"," -v OFS=',' 'BEGIN { while (getline < "fil2.csv") { f[$1] = $2; } } {print $0, f[$1] }' file1.csv
Can someone confirm where I'm going wrong with this code as I've tried altering the columns in this - and the file - without success? Maybe it's just how I'm understanding the code itself.