Comparing files and replacing data in a shell script

Question

I want to match column 1 of file.txt against column 2 of test.txt, and replace column 2 of test.txt with the corresponding columns 1, 2 and 3 from file.txt. Can someone suggest the best approach for this using a bash script?

Below are the two files.

test.txt

pcs_err 102 0 1580917083
too_long 103 0 1580917083
emc_out 103 0 1580917083
too_long 104 0 1580917083
emc_out 104 0 158091708
link_failt 104 0 1580917083
loss_sig 104 0 1580917083

file.txt

102 1 10 0efd40 N16 Online F-Port 52:4a 
103 1 11 0e5e00 N16 Online F-Port 20:01
104 2 0 0e2200 N16 Online F-Port 20:01
105 2 1 0e5700 N16 Online F-Port 20:01

The desired output would look something like this:

pcs_err.1.10.102 0 1580917083
too_long.1.11.103 0 1580917083
emc_out.1.11.103 0 1580917083
too_long.2.0.104 0 1580917083
emc_out.2.0.104 0 158091708
link_failt.2.0.104 0 1580917083
loss_sig.2.0.104 0 1580917083

Please post the output you want to achieve. Use `join` with `-o`. – KamilCuk, Feb 05 '20 at 17:45
Welcome to SO. On SO we encourage users to show the efforts they have made to solve their own problems, so please add yours to your question. Also, your sample expected output does not make clear what you want to get. – RavinderSingh13, Feb 05 '20 at 17:46
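
For reference, the `join -o` approach suggested in the comment above might look roughly like the sketch below. It assumes both files are already sorted on the join field (they are in the sample data). The scratch directory and the trailing `sed` step that turns the first three spaces into dots are illustrative additions, not part of the original suggestion.

```shell
#!/bin/sh
# Demo only: write part of the sample data from the question to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/test.txt" <<'EOF'
pcs_err 102 0 1580917083
too_long 103 0 1580917083
emc_out 103 0 1580917083
EOF
cat > "$dir/file.txt" <<'EOF'
102 1 10 0efd40 N16 Online F-Port 52:4a
103 1 11 0e5e00 N16 Online F-Port 20:01
104 2 0 0e2200 N16 Online F-Port 20:01
EOF

# Join field 2 of test.txt with field 1 of file.txt (both must be sorted on it).
# -o selects the output fields: the name, file.txt columns 2 and 3, then the
# remaining test.txt columns. sed then replaces the first three spaces with dots.
join_out=$(join -1 2 -2 1 -o 1.1,2.2,2.3,1.2,1.3,1.4 \
             "$dir/test.txt" "$dir/file.txt" |
           sed -e 's/ /./' -e 's/ /./' -e 's/ /./')

printf '%s\n' "$join_out"
rm -rf "$dir"
```

Lines in file.txt with no partner in test.txt (104 here) are dropped, which is `join`'s default behaviour.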

ghoti · Answer 1 · 2020-02-05T18:13:18.857

So here's a quick one-liner...

awk 'NR==FNR{a[$1]=$2"."$3"."$1;next} {$1=$1"."a[$2];$2=""} 1' file.txt test.txt

Or for easier reading (and commenting)...

NR==FNR {             #  For the first file specified...
  a[$1]=$2"."$3"."$1  #  store an array with the first field as a key.
  next
}
{                     #  For the second+ file specified...
  $1=$1"."a[$2]       #  append contents of the array to the first field,
  $2=""               #  empty the now-redundant second field,
}
1                     #  and print.

This has a slight spacing issue: emptying $2 leaves the delimiters on either side of it, so the output has two spaces after the first field. If this is a problem, you could get around it by replacing the second block and the trailing `1` with a slightly more complicated printf:

NR==FNR {
  a[$1]=$2"."$3"."$1
  next
}
{
  printf "%s.%s %s %s%s", $1, a[$2], $3, $4, ORS
}

The benefit here is that you get finer-grained control over your output format. The risk is that if your input format changes, your code may be less resilient. YMMV.
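
To see the difference between the two variants, here is a small self-contained sketch that runs both against a few rows of the sample data from the question. The scratch directory and the variable names `assign_out` and `printf_out` are only there for the demo.

```shell
#!/bin/sh
# Demo only: write part of the sample data from the question to a scratch directory.
dir=$(mktemp -d)
cat > "$dir/test.txt" <<'EOF'
pcs_err 102 0 1580917083
too_long 103 0 1580917083
emc_out 103 0 1580917083
too_long 104 0 1580917083
EOF
cat > "$dir/file.txt" <<'EOF'
102 1 10 0efd40 N16 Online F-Port 52:4a
103 1 11 0e5e00 N16 Online F-Port 20:01
104 2 0 0e2200 N16 Online F-Port 20:01
105 2 1 0e5700 N16 Online F-Port 20:01
EOF

# Variant 1: modify the fields and print the rebuilt record.
# Emptying $2 keeps its separators, so two spaces follow the first field.
assign_out=$(awk 'NR==FNR{a[$1]=$2"."$3"."$1;next} {$1=$1"."a[$2];$2=""} 1' \
               "$dir/file.txt" "$dir/test.txt")

# Variant 2: spell out the output format with printf, so spacing is explicit.
printf_out=$(awk 'NR==FNR{a[$1]=$2"."$3"."$1;next}
                  {printf "%s.%s %s %s%s", $1, a[$2], $3, $4, ORS}' \
               "$dir/file.txt" "$dir/test.txt")

printf '%s\n' "$assign_out" | head -n 1   # pcs_err.1.10.102  0 1580917083
printf '%s\n' "$printf_out" | head -n 1   # pcs_err.1.10.102 0 1580917083
rm -rf "$dir"
```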

edited Feb 05 '20 at 18:13

answered Feb 05 '20 at 18:01

ghoti


Thanks, this works like a charm. I was wondering whether I can use an if statement in awk to remove entries that have no match in the array, instead of just leaving them blank. I was going to use sed/awk with an if statement to clean up if that's not possible; I was just curious. – sri Feb 06 '20 at 03:00
@sri .. absolutely, you can use an `if`. You *almost never* need to pipe through multiple text processing tools, if one of them is awk. Note that you could also use a condition on the command statement that depends on the array, such as `$2 in a {`. You can `man awk` for more details on syntax. And next time, remember to include such examples and an explanation of their handling in the sample input in your question (along with an attempt at solving this yourself). – ghoti Feb 06 '20 at 04:21
Thanks. I was able to get the desired result by using another awk statement in the next step of the script. Below is what I used; it also helped me get rid of the extra space from the print in the awk statement: `awk '$1 ~/[0-9]$/ { print $1,$2,$3 }' new.txt` – sri Feb 06 '20 at 05:31
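
The `$2 in a` condition mentioned in the comments can be sketched as follows, so that unmatched rows are dropped in the same awk pass. The `bad_port 999` row is a hypothetical addition to the sample data, included only to show a row with no match in file.txt.

```shell
#!/bin/sh
# Demo only: sample rows from the question plus one hypothetical unmatched row.
dir=$(mktemp -d)
cat > "$dir/test.txt" <<'EOF'
pcs_err 102 0 1580917083
bad_port 999 0 1580917083
too_long 103 0 1580917083
EOF
cat > "$dir/file.txt" <<'EOF'
102 1 10 0efd40 N16 Online F-Port 52:4a
103 1 11 0e5e00 N16 Online F-Port 20:01
EOF

# The pattern "$2 in a" is true only when the second field of the current
# test.txt row is a key of the array built from file.txt; other rows are skipped.
filtered=$(awk 'NR==FNR {a[$1]=$2"."$3"."$1; next}
                $2 in a {print $1"."a[$2], $3, $4}' \
             "$dir/file.txt" "$dir/test.txt")

printf '%s\n' "$filtered"
rm -rf "$dir"
```

Because `print` is given exactly the fields to emit, this also avoids the double space left behind by emptying $2.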
