Here is a part of my script:

foreach $i ( @contact_list ) {

    print "$i\n";

    $e = "zcat $file_list2| grep $i";
    print "$e\n";

    $f = qx($e);
    print "$f";                                       
}

$e prints properly but $f gives a blank line even when $file_list2 has a match for $i.

Can anyone tell me why?

Borodin
  • That edit summary was supposed to read "Please pay attention to the markdown when you add code to your answer". Also, welcome to Stack Overflow. – simbabque Sep 26 '16 at 11:05
  • What's in the variables? Why are you not using `zgrep`? – tripleee Sep 26 '16 at 11:05
  • If the inputs are big (as the zipped format suggests), getting all the matches in one go would seem like a better approach. – tripleee Sep 26 '16 at 11:05
  • I edited the question; sorry for posting it so untidily the first time, I'm new to this portal. I tried zgrep too, but even that doesn't seem to work. – anonymous_10 Sep 26 '16 at 11:09
  • Most likely, your problem is with `$i`. The code as presented is vulnerable to fun data interpretation problems. For example, you might have spaces or other shell meta characters in the input that would cause `grep` to misbehave. It's hard to be sure, though, unless you add some input samples to the question. – darch Sep 26 '16 at 15:38

2 Answers

0

Your question leaves us guessing about many things, but a better overall approach would seem to be opening the file just once, and processing each line in Perl itself.

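# List-form open runs zcat directly, so the file name is never parsed by a shell
open(my $zcat, '-|', 'zcat', $file_list2)
    or die "$0: could not zcat: $!\n";
LINE:
while (my $line = <$zcat>) {
    ######## FIXME: this could be optimized a great deal still
    foreach my $i (@contact_list) {
        # \Q...\E matches $i literally, so "." in an address is not a wildcard
        if ($line =~ m/\Q$i\E/) {
            print $line;
            next LINE;
        }
    }
}
close($zcat) or warn "$0: zcat exited with status $?\n";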

If you want to squeeze out more from the inner loop, compile the regexes from @contact_list into a separate array before the loop, or perhaps combine them into a single regex if all you care about is whether one of them matched. If, on the other hand, you want to print all matches for one pattern only at the end when you know what they are, collect matches into one array per search expression, then loop them and print when you have grepped the whole set of input files.
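As a rough sketch of the "single regex" and "one array per search expression" ideas above: the helper names `build_matcher` and `collect_matches` and the sample data are hypothetical, not part of the question, and the entries are assumed to be literal strings rather than regexes.

```perl
use strict;
use warnings;

# Build one alternation from literal strings. quotemeta() keeps characters
# such as "." and "+" in an address from acting as regex operators.
# Assumes at least one pattern; an empty alternation would match every line.
sub build_matcher {
    my @patterns = @_;
    my $alternation = join '|', map { quotemeta($_) } @patterns;
    return qr/($alternation)/;
}

# Return a hash mapping each pattern to the lines in which it was the first
# one to match.
sub collect_matches {
    my ($matcher, @lines) = @_;
    my %found;
    for my $line (@lines) {
        push @{ $found{$1} }, $line if $line =~ $matcher;
    }
    return %found;
}

# Hypothetical sample data standing in for @contact_list and the zcat output
my @contact_list = ('a.b@example.com', 'c+d@example.org');
my @lines = (
    "1,a.b\@example.com\n",
    "2,aXb\@example.com\n",    # must not match: "." is literal here
    "3,c+d\@example.org\n",
);

my %found = collect_matches(build_matcher(@contact_list), @lines);
print "$_: @{ $found{$_} }" for sort keys %found;
```

With 355k entries the combined pattern becomes very large, so it is worth timing this against the plain nested loop on a sample of the real data before relying on it.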

Your problem is not reproducible without information about what's in $i, but I can guess that it contains some shell metacharacter which causes it to be processed by the shell before the grep runs.
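If the shell pipeline is kept, one way to rule out that guess is to quote both values before interpolating them. This is only a sketch: `shell_quote` is a hypothetical helper written here, the sample values are made up, and it assumes a POSIX-style shell behind `qx`.

```perl
use strict;
use warnings;

# Wrap a string in single quotes for a POSIX shell. An embedded single quote
# is written as '\'' (close the quote, add an escaped quote, reopen).
sub shell_quote {
    my ($text) = @_;
    (my $quoted = $text) =~ s/'/'\\''/g;
    return "'$quoted'";
}

# Hypothetical values standing in for the asker's variables
my $file_list2 = 'contacts 2016.gz';
my $i          = 'o\'brien+test@example.com';

# grep -F treats the pattern as a fixed string; -e marks the next argument
# as the pattern even if it starts with a dash.
my $e = sprintf 'zcat %s | grep -F -e %s',
    shell_quote($file_list2), shell_quote($i);
print "$e\n";
```

The resulting `$e` can then be passed to `qx($e)` as in the question. Printing it first shows exactly what the shell will receive.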

tripleee
  • @contact_list is an array which has 355k mail ids; I need to check whether each one of those 355k mail ids is present or not in my database, which is in a zip file. Also the zip file itself has 4 million records, hence I'm trying to avoid opening it and using zcat or zgrep – anonymous_10 Sep 26 '16 at 11:19
  • Yeah, so chances are that looping a search 355k times is going to be lots faster than looping the entire input file 355k times. – tripleee Sep 26 '16 at 11:21
  • This doesn't answer the question and almost certainly doesn't solve the problem. – darch Sep 26 '16 at 15:40
0

It is usually better to use Perl's own grep than to pipe through an external grep:

my @lines = `zcat $file_list2`; # capture the output of zcat in an array
die('zcat error') if ($?);      # exit with an error if zcat failed
# chomp(@lines);                # this would remove "\n" from each line

foreach my $i ( @contact_list ) {

    print "$i\n";

    # \Q...\E matches $i literally, so "." in an address is not a wildcard
    my @ar = grep { /\Q$i\E/ } @lines;
    print @ar;
#   print join("\n",@ar)."\n";      # in case of using chomp
}

The best solution is not to call zcat at all, but to use the zlib library through IO::Zlib: http://perldoc.perl.org/IO/Zlib.html

use IO::Zlib;

# ....
# place your definitions of $file_list2 and @contact_list here.
# ...

# Passing arguments to new() opens the file; it returns undef on failure
my $fh = IO::Zlib->new($file_list2, "rb")
    or die("Cannot open $file_list2");
my @lines = <$fh>;
$fh->close;

#chomp(@lines);                    # remove "\n" from each line
foreach my $i ( @contact_list ) {

    print "$i\n";
    # \Q...\E matches $i literally, so "." in an address is not a wildcard
    my @ar = grep { /\Q$i\E/ } @lines;
    print @ar;
#   print join("\n",@ar)."\n";    # in case of using chomp
}
dgmrdr