I have two files. I want to partially match an ID number in the first file against an ID in the second file, and then extract the whole matching line.
File1:
smt_hsa_3150 932
smt_hsa_28592 682
smt_hsa_5184 657
smt_hsa_430 648
smt_hsa_14100 648
smt_hsa_96 648
File2:
chr11 5933549 5933577 29 + hsa_smt_028592
chr11 45693060 45693086 27 - hsa_smt_000059
chr11 45699803 45699832 30 - hsa_smt_000087
chr2 131291172 131291197 26 - hsa_smt_000096
I need to match smt_hsa_28592 (or just 28592) with hsa_smt_028592 (or 028592), and then write to a new file the matching line from the second file plus the number from the 2nd column of the 1st file.
output:
chr11 5933549 5933577 29 + hsa_smt_028592 682
chr2 131291172 131291197 26 - hsa_smt_000096 648
As I'm new to awk/sed programming, I first tried to change the name in the first column of the 1st file from smt_hsa_3150 to hsa_smt_3150, but when I run
awk '{gsub("smt","hsa")}1'
the name becomes hsa_hsa_3150, and I cannot use the same code to change only the second "hsa". The second problem is how to match hsa_smt_028592 with smt_hsa_28592, or smt_hsa_96 with hsa_smt_000096, given that the prefix parts are swapped and the number is zero-padded in the second file.
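To make the goal concrete, here is a minimal sketch of the kind of matching described above. It is not a definitive solution. It assumes the two files are literally named File1 and File2, that fields are whitespace-separated, that the ID is always the last field of File2, and that the number after the final underscore is what should be compared. The output name output.txt is also just a placeholder. Adding 0 to the number forces a numeric value, so 028592 and 28592 end up as the same key without renaming anything.

```shell
# Sample data copied from the question (file names are assumptions).
cat > File1 <<'EOF'
smt_hsa_3150 932
smt_hsa_28592 682
smt_hsa_5184 657
smt_hsa_430 648
smt_hsa_14100 648
smt_hsa_96 648
EOF

cat > File2 <<'EOF'
chr11 5933549 5933577 29 + hsa_smt_028592
chr11 45693060 45693086 27 - hsa_smt_000059
chr11 45699803 45699832 30 - hsa_smt_000087
chr2 131291172 131291197 26 - hsa_smt_000096
EOF

awk '
NR == FNR {                       # first file on the command line (File1)
    n = split($1, parts, "_")     # smt_hsa_28592 -> parts[n] is "28592"
    count[parts[n] + 0] = $2      # key on the numeric value of the ID
    next
}
{                                 # second file (File2)
    n = split($NF, parts, "_")    # hsa_smt_028592 -> parts[n] is "028592"
    id = parts[n] + 0             # adding 0 drops the leading zeros
    if (id in count)
        print $0, count[id]       # File2 line plus the File1 value
}
' File1 File2 > output.txt
```

With the sample data this should leave the two expected lines in output.txt, in the order they appear in File2. If only the prefix swap is wanted, anchoring the pattern on the whole prefix avoids touching the second "hsa", for example awk '{sub(/^smt_hsa_/, "hsa_smt_")}1' File1.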