bash - Find names in file1 according to IDs in file2

Question

Sorry if this is a duplicate, but I searched and couldn't find exactly the same question. I have two files:

File1:
Aaron ID00456
Brad ID00123
Cassie ID00789
Doug ID12345
Ethan ID05555

File2:
ID12345
ID00123
ID00456

Keeping the order of IDs in File2, I'd like to have output File3 as:
Doug ID12345
Brad ID00123
Aaron ID00456

Welcome to Stack Overflow. Please read the [About] and [Ask] pages soon. Which platform are you working on? Linux or something else? Are there any restrictions on which tools to use? Is Awk allowed? Perl? How big are the files going to be? As big as shown, or multiple hundreds of lines in File2 and many thousands in File1, or bigger than that? How crucial is the output order? Could there be entries in File2 that don't match anything in File1? What should happen then? — Jonathan Leffler, Dec 01 '18 at 00:27
What have you tried? Even using a simple `while read` loop with grep will give you results. [How can I read a file line by line](https://mywiki.wooledge.org/BashFAQ/001). [Examples using grep](http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_04_02.html). Probably the fastest and best way is to use awk. — KamilCuk, Dec 01 '18 at 00:30
https://stackoverflow.com/questions/42239179/fastest-way-to-find-lines-of-a-file-from-another-larger-file-in-bash — George Vasiliou, Dec 01 '18 at 00:42

score 1 · Answer 1 · answered Dec 01 '18 at 00:57

Try this script (it assumes File1.txt and File2.txt are in the directory you run it from).

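#!/bin/bash
# For each ID in File2.txt (outer loop), scan File1.txt for the matching line.
while read -r ID2
do
  while read -r NAME ID1
  do
    if [ "$ID1" = "$ID2" ]
    then
      echo "$NAME $ID1"
    fi
  done < File1.txt
done < File2.txt > File3.txt   # written once, so reruns don't append duplicates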

The script writes File3.txt to the same directory, with the content:

Doug ID12345
Brad ID00123
Aaron ID00456
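
A comment on the question suggested combining a `while read` loop with grep. A minimal sketch of that idea follows; it keeps the order of the ID file but lets grep do the scan instead of an inner shell loop. The function name `find_by_ids` is hypothetical, and the sketch assumes IDs are plain alphanumerics (no regex metacharacters) that sit at the end of each line, as in the sample data.

```shell
# Sketch: for each ID in the ID file, let grep find the matching line in the names file.
# find_by_ids is a hypothetical helper name, not part of the answer above.
# Assumes IDs contain no regex metacharacters and are the last field on each line.
find_by_ids() {
  names_file=$1
  ids_file=$2
  while IFS= read -r id; do
    # Skip blank lines so an empty ID cannot match unrelated lines.
    [ -n "$id" ] || continue
    # " ID$" anchors the match to the last field; IDs with no match print nothing.
    grep -- " ${id}\$" "$names_file" || true
  done < "$ids_file"
}
```

It could be called as `find_by_ids File1.txt File2.txt > File3.txt`. It still reads the names file once per ID, so it is likely to be slow on large inputs.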

score 1 · Answer 2 · answered Dec 01 '18 at 02:28

awk to the rescue!

$ awk 'NR==FNR {a[$2]=$1; next} 
               {print a[$1],$1}' file1 file2

Doug ID12345
Brad ID00123
Aaron ID00456
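
One of the comments on the question asks what should happen when an ID in file2 has no match in file1. With the one-liner above, such an ID would be printed with an empty name in front of it. A possible variant, assuming unmatched IDs should simply be skipped, adds an `in` test before printing. The wrapper name `lookup_existing` is hypothetical and only there to make the sketch reusable.

```shell
# Sketch: same lookup as the awk one-liner, but IDs missing from the first file are skipped.
# lookup_existing is a hypothetical wrapper name; $1 is the names file, $2 the ID file.
lookup_existing() {
  awk 'NR==FNR { a[$2] = $1; next }   # first file: remember the name for each ID
       $1 in a { print a[$1], $1 }    # second file: print only IDs that were stored
      ' "$1" "$2"
}
```

Usage would be `lookup_existing file1 file2 > file3`. The `$1 in a` test also avoids creating empty array entries for unknown IDs, which a bare `a[$1]` reference would do.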

karakfa


Thanks a lot. This worked for me even without the newline. As I have about 100,000 records (lines) to process, this is much faster than the while loop or grep. – yunshi11 Dec 01 '18 at 03:52
Step 1: Read the 1st field (names) of file1 into array a[] with the 2nd field (IDs) being its index in the array. Step 2: Print the array elements according to the index from the 1st field (IDs), followed by the 1st field (IDs) themselves. Am I understanding this right? Didn't know you could use a string as an index number in an array. – yunshi11 Dec 01 '18 at 03:57
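
The two steps described in the comment above match how awk arrays behave: they are associative, so any string can serve as a subscript. A small stand-alone sketch, using IDs from the sample data and a hypothetical function name `demo_string_keys`:

```shell
# Sketch: awk arrays are associative, so a string such as "ID12345" works as a subscript.
# demo_string_keys is a hypothetical name used only for this illustration.
demo_string_keys() {
  awk 'BEGIN {
    name["ID00123"] = "Brad"     # store a value under a string key
    name["ID12345"] = "Doug"
    print name["ID12345"]        # look the value up by the same string
    print ("ID00456" in name)    # membership test; this key was never stored
  }'
}
```

Calling `demo_string_keys` prints `Doug` and then `0`, since `in` evaluates to 0 for a key that was never stored.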
