0

I have two text files with several lines each. I want to delete every line in file1 that does not contain one of the strings listed in file2. Example:

file1

2345678  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345679  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345680  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345681  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345682  sdfsdfsdfsf 10.00 dirfkdkfsdf XP

file2

2345678
2345679

I need to end up with this in file1

2345678  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345679  sdfsdfsdfsf 10.00 dirfkdkfsdf XP

I have to do this in a bash script, using sed, awk, or whatever works. I have tried the following, but neither attempt works.

This prints all records in file1:

awk 'NR==FNR{a[$0];next} !($0 in a)' file2 file1

This prints only file2:

awk 'NR!=FNR{a[$0];next} !($0 in a)' file2 file1
Bjarki Heiðar

2 Answers

2

If the files are already sorted by the key, this is the standard solution:

$ join file1 file2

2345678 sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345679 sdfsdfsdfsf 10.00 dirfkdkfsdf XP

It can't get simpler than this.
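
When the inputs are not already sorted, `join` needs sorted copies first. A minimal sketch, assuming sample data like the question's but deliberately out of order; the scratch directory and the `*.sorted` and `joined` file names are illustrative only:

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data in the question's layout, deliberately unsorted.
cat > file1 <<'EOF'
2345680  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345679  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345678  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
EOF
printf '%s\n' 2345679 2345678 > file2

# join expects both inputs sorted on the join field (field 1 by default).
sort -k1,1 file1 > file1.sorted
sort -k1,1 file2 > file2.sorted

# Lines whose first field appears in both files; join separates the
# output fields with a single space.
join file1.sorted file2.sorted > joined
cat joined
```

Note that the result comes out in sorted order rather than in the original order of file1, which may or may not matter for your use case.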

If you want an awk solution, this does it:

$ awk 'NR==FNR{a[$1];next} $1 in a' file2 file1

2345678  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345679  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
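
Since the question asks for the result to end up in file1 itself, the awk filter can be wrapped in a write-then-rename step. A minimal sketch, assuming the sample data from the question; the scratch directory and the name `file1.tmp` are illustrative only:

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
cat > file1 <<'EOF'
2345678  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345679  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345680  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345681  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345682  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
EOF
printf '%s\n' 2345678 2345679 > file2

# First pass (file2): remember every key. Second pass (file1): print only
# lines whose first field is a remembered key. Replace file1 only if awk
# succeeded, so a failure leaves the original untouched.
awk 'NR==FNR{a[$1];next} $1 in a' file2 file1 > file1.tmp && mv file1.tmp file1

cat file1
```

Writing to a temporary file first matters because redirecting straight to `file1` would truncate it before awk reads it.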
karakfa
  • Hi, awk works great, but what if I wanted to match the text from file2 against the third column of file1 instead of the first? – Pedro Caldeira Apr 01 '16 at 18:09
  • That would be a different question with different input/output. For the join version you have to specify `-1 3` (with file1 sorted on that field), and for `awk` change the `$1` in `$1 in a` to `$3`. – karakfa Apr 01 '16 at 18:19
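
For the column-3 case raised in the comments, here is a minimal awk sketch. It assumes file2 still holds one key per line, so only the lookup on file1 changes to `$3`. The sample data, the scratch directory and the output file name `matched` are hypothetical:

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data: the key now sits in the third column of file1.
cat > file1 <<'EOF'
aaa x 2345678 foo
bbb x 2345680 foo
ccc x 2345679 foo
EOF
printf '%s\n' 2345678 2345679 > file2

# Keys are still read from field 1 of file2; the test on file1 uses field 3.
awk 'NR==FNR{a[$1];next} $3 in a' file2 file1 > matched
cat matched
```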
0

Why awk? Use grep instead:

grep -f file2 file1
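
One caveat: `grep -f` treats each line of file2 as a regular expression and matches it anywhere in the line, so a key can also match inside a longer number or in another column. A hedged variant, assuming a grep that supports `-w` (GNU and BSD grep do), uses `-F` for fixed strings and `-w` for whole-word matches. The extra `12345678` line, the scratch directory and the output names `loose` and `strict` are made up for the demonstration:

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched (demo assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > file1 <<'EOF'
2345678  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
12345678  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
2345680  sdfsdfsdfsf 10.00 dirfkdkfsdf XP
EOF
printf '%s\n' 2345678 > file2

# Plain -f: substring match, so the 12345678 line is selected too.
grep -f file2 file1 > loose

# -F: patterns are literal strings; -w: match must be a whole word.
grep -wFf file2 file1 > strict

cat strict
```

Even with `-w`, grep does not restrict the match to the first column. If the key could legitimately appear elsewhere in the line, the awk approach, which compares one specific field, is the safer choice.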
rush