I have a script that compares two files and prints the lines that differ. I now want to change it so that, instead of printing the differing lines, it prints the matching lines. I also want it to count how many times the lines have matched, each time the script runs. Can anyone give me a suggestion? Thanks!

#!/usr/local/bin/perl
# compare
my $f1 = "/opt/test.txt";
my $f2 = "/opt/test1.txt";
my $outfile = "/opt/final_result.txt";
my %results = ();
open FILE1, "$f1" or die "Could not open file: $! \n";
while (my $line = <FILE1>) {
    $results{$line} = 1;
}
close(FILE1);
open FILE2, "$f2" or die "Could not open file: $! \n";
while (my $line = <FILE2>) {
    $results{$line}++;
}
close(FILE2);
open (OUTFILE, ">$outfile") or die "Cannot open $outfile for writing \n";
foreach my $line (keys %results) {
    print OUTFILE $line if $results{$line} == 1;
}
close OUTFILE;
eli

3 Answers

print OUTFILE $line if $results{$line} == 1;

This will print lines that occur only one time.

print OUTFILE $line if $results{$line} > 1;

One small change (== to >), and it will now print lines that occur more than one time. That should print the lines that are duplicated across the two files.

Oh, also if you want the count, simply do:

if ( $results{$line} > 1 ) {
    print OUTFILE "$results{$line}: ", $line;
}

I wrote a more concise and more flexible version here. It takes optional filenames and prints to STDOUT.

You can pass 0 in place of one of the names to fall back to the default path for that file, so you can compare one of the default files against another file. Use shell redirection to save the output to a file.

Usage:

$ script.pl file1.txt file2.txt > outfile.txt

Code:

use strict;
use warnings;
use autodie;

my $f1 = shift || "/opt/test.txt";
my $f2 = shift || "/opt/test1.txt";
my %results;
open my $file1, '<', $f1;
while (my $line = <$file1>) { $results{$line} = 1 }
open my $file2, '<', $f2;
while (my $line = <$file2>) { $results{$line}++ }
foreach my $line (sort { $results{$b} <=> $results{$a} } keys %results) {
    print "$results{$line}: ", $line if $results{$line} > 1;
}
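
One caveat with the counting approach: a line that repeats within the second file alone also ends up with a count above 1, so it would be reported as a match. A minimal sketch of one way around that, using in-memory arrays in place of the two files; `common_lines` and the sample device names are hypothetical and not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of @$second that also occur in @$first.
# Each matching line is returned once, however often it repeats.
sub common_lines {
    my ($first, $second) = @_;
    my %in_first = map { $_ => 1 } @$first;    # set of lines from the first file
    my %reported;                              # lines already returned
    return grep { $in_first{$_} && !$reported{$_}++ } @$second;
}

# Sample data standing in for the contents of the two files.
my @file1 = ("router1\n", "switch2\n");
my @file2 = ("switch2\n", "ap3\n", "ap3\n");   # "ap3" repeats but is only in file 2

print common_lines(\@file1, \@file2);
```

Here only "switch2" is printed, because "ap3" never appears in the first list. The trade-off is that this version reports which lines match but no longer carries a per-line count.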
TLP
  • Thank you so very much! My primary objective is met. My secondary objective is that the matched device should be removed from the file, so I want the counter to tell me how many times each line has matched. For example, the script runs every week, so the counter should add 1 on every run. After 4 weeks, if I see '4' next to a line, that means the device has been out there for 4 weeks; if the second line has matched 3 times, that device has been there for 3 weeks, and so on. Simply put, my objective is to know for how many weeks each device has matched. – eli Jul 28 '11 at 16:58
  • I'm not quite sure what you are asking here and how it is different from what you already have. It is generally speaking better to ask for all the objectives at once here at StackOverflow, rather than trying to puzzle together the solution piece by piece. I think what you are asking for requires a new question, preferably with some sample input/output. – TLP Jul 28 '11 at 17:05
  • The counter in your solution shows how many items matched; my objective is a counter for how long each one has matched. The counter in your solution shows me "2" even though I ran the script 10 times. I expected it to show "10", since the script ran 10 times and matched the current list each time. Sorry about the confusion, but this was my original objective; I have not added a new one. Also, English is my 3rd language, so please take that into consideration! – eli Jul 28 '11 at 17:15
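
The weekly counter described in these comments needs state that survives between runs, because %results is rebuilt from scratch every time the script starts. Below is a minimal sketch of one possible reading of the request, assuming a tab-separated state file. The helper names (`load_counts`, `update_counts`, `save_counts`), the state file name, and the sample device names are all hypothetical, and dropping a device once it stops matching is an assumption as well. It uses `//`, so it needs Perl 5.10 or later.

```perl
use strict;
use warnings;
use autodie;
use File::Temp qw(tempdir);

# Hypothetical helper: read "count<TAB>line" records from the state file.
sub load_counts {
    my ($path) = @_;
    my %counts;
    return \%counts unless -e $path;    # first run: nothing stored yet
    open my $in, '<', $path;
    while (my $record = <$in>) {
        chomp $record;
        my ($count, $line) = split /\t/, $record, 2;
        $counts{$line} = $count;
    }
    close $in;
    return \%counts;
}

# Hypothetical helper: add 1 for every line matched in this run.
# Lines that no longer match are dropped (an assumption).
sub update_counts {
    my ($old, $matches) = @_;
    my %new = map { $_ => ($old->{$_} // 0) + 1 } @$matches;
    return \%new;
}

# Hypothetical helper: write the counts back for the next run.
sub save_counts {
    my ($path, $counts) = @_;
    open my $out, '>', $path;
    print {$out} "$counts->{$_}\t$_\n" for sort keys %$counts;
    close $out;
}

# Simulate three weekly runs against a temporary state file.
my $state = tempdir(CLEANUP => 1) . "/match_counts.txt";
for my $week (1 .. 3) {
    # Stand-in for the chomped lines found in both files that week.
    my @matches = $week < 3 ? ("dev-a", "dev-b") : ("dev-a");
    save_counts($state, update_counts(load_counts($state), \@matches));
}

my $final = load_counts($state);
print "$final->{$_}: $_\n" for sort keys %$final;
```

In a real weekly run the loop would go away: the script would compute the matching lines from the two files, chomp them, and call the three helpers once against a fixed state file path.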

This isn't the cleanest way to do things... but the hard work has been done. Reverse the logic so that it prints every line unless $results{$line} == 1 (equivalently, if $results{$line} != 1).

To add the count:

print OUTFILE "Count: $results{$line} - $line" if $results{$line} != 1;

Alternatively, you could filter out the unwanted lines with a grep, avoiding the if condition entirely:

foreach my $line ( grep { $results{$_} != 1 } keys %results ) {

    print OUTFILE "Count: $results{$line} - $line";
}
Zaid
  • Thanks a lot. Your answer also fulfills my primary objective, but not the second one. I think I wasn't clear enough, sorry about that. I want the counter to tell me how many times each line has matched. For example, the script runs every week, so the counter should add 1 on every run. After 4 weeks, if I see '4' next to a line, that means the device has been out there for 4 weeks; if the second line has matched 3 times, that device has been there for 3 weeks, and so on. Simply put, my objective is to know for how many weeks each device has matched. – eli Jul 28 '11 at 17:00

Try Test::Differences. See its documentation for a code sample and an example of what the output looks like:

http://metacpan.org/pod/Test::Differences

szabgab
knb