
I have 2 files. I need to sort them, compare them, and for every pair whose rates differ, print the values from both file 1 and file 2.

file1:

pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000

file2:

pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000

Output should be:

pair,inputbid,inputask,outputbid,outtputask
AED/MYR,3.918000,3.918000,4.918000,4.918000

Among the pairs present in both files, the only difference is AED/MYR, which has different bid/ask rates. How can I print the differing values from file 1 and file 2?

I tried the following command:

nawk -F, 'NR==FNR{a[$1]=$4;a[$2]=$5;next} !($4 in a) || !($5 in a) {print $1 FS a[$1] FS a[$2] FS $4 FS $5}' file1 file2

It produced the following output:

pair,bid,ask,bid,ask
AUD/CAD,3.918000,3.918000,3.918000,3.918000
AUD/CHF,3.918000,3.918000,3.918000,3.918000
AUD/CNH,3.918000,3.918000,3.918000,3.918000
AUD/CNY,3.918000,3.918000,3.918000,3.918000
AED/MYR,3.918000,3.918000,4.918000,4.918000

I am still not able to get only the differences.

James Z
    Welcome to SO. Users here are encouraged to show in the question what they have already tried to solve their problem, so please add your attempt to the question and let us know. – RavinderSingh13 Sep 11 '20 at 06:45
    Hi Ravinder, I have updated the question. Please check. – Siddy_21111990 Sep 11 '20 at 07:09

2 Answers


Could you please try the following, written and tested in GNU awk with the shown samples.

awk -v header="pair,inputbid,inputask,outputbid,outtputask" '
BEGIN{
  FS=OFS=","
}
FNR==NR{
  arr[$1]=$0
  next
}
($1 in arr) && arr[$1]!=$0{
  val=$1
  $1=""
  sub(/^,/,"")
  if(!found){
    print header
    found=1
  }
  print arr[val],$0
}'  Input_file1  Input_file2

Explanation: a detailed, line-by-line explanation of the above.

awk -v header="pair,inputbid,inputask,outputbid,outtputask" '  ##Starting awk program from here and setting this to header value here.
BEGIN{                                                         ##Starting BEGIN section of this program from here.
  FS=OFS=","                                                   ##Setting field separator and output field separator as comma here.
}
FNR==NR{                                                       ##Checking condition FNR==NR which will be TRUE when Input_file1 is being read.
  arr[$1]=$0                                                   ##Creating arr with index $1 and keep value as current line.
  next                                                         ##next will skip all further statements from here.
}
($1 in arr) && arr[$1]!=$0{                                    ##Checking condition if first field is present in arr and its value NOT equal to $0
  val=$1                                                       ##Saving the first field (the pair) in val.
  $1=""                                                        ##Nullifying first field here.
  sub(/^,/,"")                                                 ##Substitute starting , with NULL here.
  if(!found){                                                  ##Checking if found is NULL then do following.
    print header                                               ##Printing header here only once.
    found=1                                                    ##Setting found here.
  }
  print arr[val],$0                                            ##Printing arr with index of val and current line here.
}' Input_file1  Input_file2                                    ##Mentioning Input_files here.
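
As a variation on the same idea, here is a minimal sketch that stores bid and ask separately instead of rewriting $0. It recreates the sample files from the question in a temporary directory (the `tmpdir`, `bid`, `ask`, `found` and `result` names are my own, not from the question), and it assumes each pair appears at most once per file. The values are compared as strings, so `3.9180` and `3.918000` would count as different.

```shell
# Recreate the sample files from the question in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
pair,bid,ask
AED/MYR,3.918000,3.918000
AED/SGD,3.918000,3.918000
AUD/CAD,3.918000,3.918000
EOF
cat > "$tmpdir/file2" <<'EOF'
pair,bid,ask
AUD/CAD,3.918000,3.918000
AUD/CNY,3.918000,3.918000
AED/MYR,4.918000,4.918000
EOF

result=$(awk -F, -v OFS=, '
FNR==NR { bid[$1]=$2; ask[$1]=$3; next }   # first file: remember bid and ask per pair
FNR==1  { next }                           # second file: skip its header line
# pair exists in both files and the "bid,ask" strings differ
($1 in bid) && (bid[$1] "," ask[$1]) != ($2 "," $3) {
  if (!found++) print "pair,inputbid,inputask,outputbid,outtputask"
  print $1, bid[$1], ask[$1], $2, $3
}' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With the sample data this should print the header followed by the single AED/MYR line. No sorting is needed here, because the lookup goes through the array rather than relying on line order.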
RavinderSingh13

With bash process substitution, sort both files, join them with join, and then select the differing rows with awk:

# print header
printf "%s\n" "pair,inputbid,inputask,outputbid,outtputask"
# remove the header line from both files, then sort them on the first field only
# then join them on the first field and output the five fields of interest
join -t, -11 -21 -o1.1,1.2,1.3,2.2,2.3 <(tail -n +2 file1 | sort -t, -k1,1) <(tail -n +2 file2 | sort -t, -k1,1) |
# output only those lines whose bid or ask columns differ
awk -F, '$2 != $4 || $3 != $5'
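
If process substitution is not available (for example in plain sh), the same pipeline can be sketched with temporary sorted files. This is only an illustration under a few assumptions of mine: the sample data from the question is recreated in a scratch directory, the `tmpdir`, `s1`, `s2` and `joined` names are hypothetical, and `LC_ALL=C` is set so that sort and join agree on the collation order.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'pair,bid,ask' \
  'AED/MYR,3.918000,3.918000' \
  'AED/SGD,3.918000,3.918000' \
  'AUD/CAD,3.918000,3.918000' > "$tmpdir/file1"
printf '%s\n' 'pair,bid,ask' \
  'AUD/CAD,3.918000,3.918000' \
  'AUD/CNY,3.918000,3.918000' \
  'AED/MYR,4.918000,4.918000' > "$tmpdir/file2"

# Drop the header line and sort each file on the first field only.
tail -n +2 "$tmpdir/file1" | LC_ALL=C sort -t, -k1,1 > "$tmpdir/s1"
tail -n +2 "$tmpdir/file2" | LC_ALL=C sort -t, -k1,1 > "$tmpdir/s2"

# Join on the pair, then keep only rows whose bid or ask differs.
joined=$(LC_ALL=C join -t, -1 1 -2 1 -o 1.1,1.2,1.3,2.2,2.3 "$tmpdir/s1" "$tmpdir/s2" |
  awk -F, '$2 != $4 || $3 != $5')

printf '%s\n' 'pair,inputbid,inputask,outputbid,outtputask'
printf '%s\n' "$joined"
rm -rf "$tmpdir"
```

Note that awk compares the fields numerically here because they look like numbers, so rates that differ only in trailing zeros are treated as equal.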
KamilCuk