Perl script to compare two files but print in order

Question

I followed the question "perl compare two file and print the matching lines" and, using a hash, found the lines that match or don't match between two files.

But the hash rearranges the lines, and I want them in their original order. I could write multiple for loops to get the results in order, but that is not as efficient as a hash. Has anyone faced this issue before, and could you share your solution?

clt60 · Answer 1 · 2013-07-10T22:18:04.773

2

Maybe I don't fully understand the question, but

fgrep -xf file2 file1

is not enough?

or

fgrep -xf file1 file2

Yes, it is not Perl, but it is short, simple, and fast.
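As a minimal sketch of how the two commands differ, assuming two small made-up sample files (the file contents and the scratch directory are illustrative, not from the question): grep prints matches in the order they occur in the file being searched, so the last argument decides the output order. `grep -F` is used here as the equivalent spelling of `fgrep`.

```shell
# Illustrative sample data in a scratch directory (hypothetical contents).
workdir=$(mktemp -d)
printf '%s\n' cherry apple banana date > "$workdir/file1"
printf '%s\n' banana apple elderberry cherry > "$workdir/file2"

# Lines of file1 that also occur in file2, printed in file1's order.
in_file1_order=$(grep -Fxf "$workdir/file2" "$workdir/file1")

# Same comparison the other way round, printed in file2's order.
in_file2_order=$(grep -Fxf "$workdir/file1" "$workdir/file2")

printf 'file1 order:\n%s\n' "$in_file1_order"   # cherry, apple, banana
printf 'file2 order:\n%s\n' "$in_file2_order"   # banana, apple, cherry

rm -rf "$workdir"
```

Adding `-v` to either command would list the lines that do not match instead, still in the order of the searched file.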

edited Jul 10 '13 at 22:18

answered Jul 10 '13 at 21:58

clt60


1

That is pretty short and sweet. Agree this should solve the whole problem without the intermediate perl. But there is a risk of partial matches. If you add the `-x` flag you only match whole lines, which is what the OP wanted, I think. It would be interesting to have a speed comparison vs his two-step approach. – Floris Jul 10 '13 at 22:15

Floris · Accepted Answer · 2013-07-10T22:11:11.433

1

This can be done efficiently in two steps. Let's assume you have been able to find the "lines that match" but they are in the wrong order; then a simple grep can put them back in order. Assuming you have a script matchThem that takes two inputs (file1 and file2) and prints the matching lines, which are redirected to tempFile, the overall script will be:

matchThem file1 file2 > tempFile
grep -Fx -f tempFile file1

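The flags mean:

-F : treat each pattern as a fixed string rather than a regular expression (much faster than pattern matching)
-x : only match whole lines
-f tempFile : read the patterns from tempFile, one per line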
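To see what each flag changes, here is a small sketch using invented sample data (the `patterns` and `data` files are assumptions for illustration only). Without `-F`, the dot in `a.c` is a regular-expression wildcard; without `-x`, `foo` also matches inside `foobar`.

```shell
# Hypothetical pattern and data files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a.c' 'foo' > "$workdir/patterns"
printf '%s\n' 'abc' 'a.c' 'foo' 'foobar' > "$workdir/data"

# Patterns as regular expressions: matches abc, a.c, foo, foobar.
regex_matches=$(grep -f "$workdir/patterns" "$workdir/data")

# Patterns as fixed strings: matches a.c, foo, foobar.
fixed_matches=$(grep -F -f "$workdir/patterns" "$workdir/data")

# Fixed strings that must equal the whole line: matches a.c, foo.
whole_line_matches=$(grep -Fx -f "$workdir/patterns" "$workdir/data")

printf '%s\n' "$whole_line_matches"

rm -rf "$workdir"
```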

edited Jul 10 '13 at 22:11

answered Jul 10 '13 at 21:53

Floris


I could not get it working with "grep -Fx tempFile file1", but "fgrep -f tempFile file1" does the job. Thanks @Floris – Raghav Jul 10 '13 at 22:17
My files are short, so I don't see much of a difference, and only grep gives the output in order (which I need). But I found online that Perl hash tables scale better for large log files: http://stackoverflow.com/questions/11490036/fast-alternative-to-grep-f – Raghav Jul 11 '13 at 04:50
Interesting link - if you ever do run this on large files would you update us on the time difference? – Floris Jul 11 '13 at 11:53

score 1 · Answer 3 · answered Jul 11 '13 at 07:45

1

If you want a hash that keeps the insertion order, try the CPAN module Tie::IxHash.

answered Jul 11 '13 at 07:45

Slaven Rezic

