Bash - replace string in tab-separated file using lookup table

Question

I have a main.txt file in which I want to replace the Sector_ID in column 2 with a Sector_name, if a mapping exists in a lookup table. Both main.txt and lookup.txt are tab-separated.

main.txt:

Serving_Sector  Target_Sector   HO_Attempts HO_Successful_Attempts
1112928 1112929 2   2
1112928 1112930 0   0
1112929 1112928 3   3

lookup.txt:

Sector_id  Sector_name
1112929 SectorTEST

Any clue on how to solve this using bash? In some cases the Sector_id might not be in the lookup table. In such cases the original value in main.txt should be kept.

Proposed script (by @Dielna Reboot):

#!/bin/bash

#put only ids in variable
ids="$(cat hostats.txt | awk '{print $2}' | grep -v Sector)"

for sector_name in $ids; do

#match id condition
grep "$sector_name" lookup.txt >/dev/null 2>&1 && {
 #save sector name
 sector_id="$(grep "$sector_name" lookup.txt | awk '{print $2}')"
 # replace via sed in-place
  sed -i "s/$sector_name/$sector_id/g" hostats.txt
} || true

done

The result is this ("->" illustrates a tab):

Serving_Sector->Target_Sector->HO_Attempts_HO->Successful_Attempts
1112928->SectorTEST
->2->2
1112928->1112930->0->0
SectorTEST
->1112928->3->3

For some reason a newline is appended, and any match also updates column 1 (Serving_Sector), which is not desired in this case.

The desired result is this ("->" illustrates a tab):

Serving_Sector->Target_Sector->HO_Attempts->HO_Successful_Attempts
1112928->SectorTEST->2->2
1112928->1112930->0->0
1112929->1112928->3->3

Dielna Reboot · Answer 1 · 2022-11-25T07:28:57.903

Yo, tested it and have a solution for you, gl & hf

#!/bin/bash

# Collect the IDs from column 2, skipping the header line
ids="$(cat main.txt | awk '{print $2}' | grep -v Sector)"

for sector_id in $ids; do
  # Proceed only if the ID appears in the lookup table
  grep "$sector_id" lookup.txt >/dev/null 2>&1 && {
    # Look up the sector name for this ID
    sector_name="$(grep "$sector_id" lookup.txt | awk '{print $2}')"
    # Replace the ID with the name in place
    sed -i "s/$sector_id/$sector_name/g" main.txt
  } || true
done

I tried it on your sample input files; after execution, main.txt looks like this:

Serving Sector  Target Sector   HO Attempts HO Successful Attempts
1002080 Sector_B 8   8
1002080 Sector_C 0   0
1002080 Sector_D 2   2
1002080 2104-2975   5   5
1002080 Sector_F 2   2
1002080 1012237 10  10
1002080 1012281 0   0

edited Nov 25 '22 at 07:28

answered Nov 25 '22 at 07:21

Dielna Reboot

Repeatedly running `sed -i` on the same file can be very inefficient. You'll want to combine all the changes into a single `sed` script and manipulate the file only once. But then `sed` is probably the wrong tool for this; the standard Awk solution is less code, more obvious, and more elegant. – tripleee Nov 25 '22 at 07:55
Yes, that is right, but unfortunately I have not mastered awk to that extent yet. – Dielna Reboot Nov 25 '22 at 11:39
Instead of `#!/bin/bash`, `#!/usr/bin/env bash` or `#!/bin/env bash` is a more reliable way. – Dielna Reboot Nov 25 '22 at 11:41
@DielnaReboot, thank you for your suggestion. I edited the header line of main.txt to use "_" instead of spaces to make it more readable. I also corrected the lookup.txt header line, which had id and name mixed up, and slightly modified your script to match these header changes. I also updated the main description. Please have a look; for some reason I am not getting the correct result. Any clue why? – SHR Nov 29 '22 at 08:02
`sed -i "s/$sector_id/$sector_name/g" main.txt` replaces the ID with the name without checking which column it is in; it simply replaces that number wherever it occurs. So instead of running `sed` on `main.txt`, run it on a temporary file or a variable that holds only the second column, and then replace the whole second column via `awk`. – Dielna Reboot Nov 30 '22 at 13:08
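
The two points raised in the comments (touch the file once instead of running `sed -i` per ID, and replace only in column 2) can be sketched with a single awk pass. This is a minimal sketch, not a definitive solution: it assumes both files are tab-separated with one header line, recreates the question's sample data in a temporary directory, and uses `main.new` as a hypothetical output name. Stripping a trailing carriage return is a defensive assumption in case the files have Windows line endings; whether that explains the stray newline in the question is not confirmed.

```shell
#!/usr/bin/env bash
set -eu

# Recreate the sample files from the question in a scratch directory
# (assumption: mktemp -d is available, as on most Linux/BSD systems).
workdir="$(mktemp -d)"
cd "$workdir"

printf 'Serving_Sector\tTarget_Sector\tHO_Attempts\tHO_Successful_Attempts\n' > main.txt
printf '1112928\t1112929\t2\t2\n' >> main.txt
printf '1112928\t1112930\t0\t0\n' >> main.txt
printf '1112929\t1112928\t3\t3\n' >> main.txt

printf 'Sector_id\tSector_name\n' > lookup.txt
printf '1112929\tSectorTEST\n' >> lookup.txt

# Pass 1 (lookup.txt, where NR == FNR): build the id -> name map.
# Pass 2 (main.txt): rewrite column 2 only when it is an exact key in the map;
# every other field, including column 1, is left untouched.
awk 'BEGIN { FS = OFS = "\t" }
     { sub(/\r$/, "") }                    # drop a trailing CR, if any
     NR == FNR { name[$1] = $2; next }     # first file: remember the mapping
     ($2 in name) { $2 = name[$2] }        # second file: exact match on column 2
     { print }' lookup.txt main.txt > main.new

cat main.new
```

Because the lookup is an exact match on the whole field, an ID that merely contains another ID as a substring is not rewritten, and IDs missing from lookup.txt keep their original value. Once the output looks right, it could replace the original with `mv main.new main.txt`.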
