0

I'm writing code to find the files that do not contain a string pattern. Given a list of files, I have to look into the content of each one and get the file name if the pattern "clean" does not appear inside the file. Please help.

Here is the scenario: I have a list of files, each containing numerous lines. A clean file contains the word "clean". A dirty file does not contain "clean", and nothing else in it indicates that it is dirty. So whenever "clean" is not detected in a file, I'll categorize it as a dirty file, and I would like to report its file name.

Grace

4 Answers

4

You can use a simple one-liner:

perl -0777 -nlwE 'say $ARGV if !/clean/i' *.txt

The -0777 switch slurps each file, so the regex is checked against the entire file content at once. If no match is found, we print the file name.

For perl versions lower than 5.10, which do not support -E, you can replace -E with -e and say $ARGV with print "$ARGV\n".

perl -0777 -nlwe 'print "$ARGV\n" if !/clean/i' *.txt
TLP
  • I tried the command but it doesn't seem to work. It gives the error message "Unrecognized switch: -E (-h will show valid options).". I tried removing the "E" from the command and it gives "/clean/i: Event not found." – Grace Jan 30 '13 at 05:49
  • @Grace The `-E` switch can be replaced with `-e`; it's documented in `perl -h`. The only thing it does here is enable the `say` feature, which is equivalent to `print` with a newline. So use `print "$ARGV\n"` and `-e`. However, which perl version are you using that does not have `-E`? – TLP Jan 30 '13 at 06:07
  • I'm using old Perl v5.8.5; the system admin refuses to update to the latest :(. I have tried using -e too, but it gives the error "/clean: Event not found.". But I'm sure there are files without the "clean" wording in the area I'm checking. – Grace Jan 30 '13 at 06:11
  • @Grace Sounds like a shell quoting problem; you missed a quotation mark somewhere, perhaps. – TLP Jan 30 '13 at 06:15
  • I used this command: "perl -0777 -nlwe 'print "$ARGV\n" if !/clean/i' *", and it gives the error "/clean/i: Event not found." – Grace Jan 30 '13 at 06:23
  • Use backticks to post code, not quotation marks. Be sure you surround the code with single quotes and you should be able to put anything in there without the shell interfering. If you're really struggling, a quick hack is to replace double quotes with `qq()`. – TLP Jan 30 '13 at 06:29
2

If you need to generate the list within Perl, the File::Finder module will make life easy.

Untested, but should work:

use File::Finder;

my @wanted = File::Finder              # finds all            ..
              ->type( 'f' )            # .. files             ..
              ->name( '*.txt' )        # .. ending in .txt    ..
              ->not                    # .. that do not       ..
              ->contains( qr/clean/ )  # .. contain "clean"   ..
              ->in( '.' );             # .. under current dir (in() runs the search, so it goes last)

print $_, "\n" for @wanted;

Neat stuff!

EDIT:

Now that I have a clearer picture of the problem, I don't think any module is necessary here:

use strict;
use warnings;

my @files = glob '*.txt';  # Dirty & clean laundry

my @dirty;

foreach my $file ( @files ) {     # For each file ...

    local $/ = undef;             # Slurps the file in
    open my $fh, '<', $file or die "Cannot open $file: $!";

    unless ( <$fh> =~ /clean/ ) { # if the file isn't clean ..
        push @dirty, $file;       # .. it's dirty
    }

    close $fh;
}

print $_, "\n" for @dirty;        # Dirty laundry list

Once you get the mechanics, this can be simplified a la grep, etc.
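
If slurping is a concern for very large files, the same per-file test can be sketched in the shell instead of Perl. This is only an illustration, not part of the code above: it assumes a POSIX sh and relies on `grep -q`, which most implementations stop at the first match, and `report_dirty` is a made-up function name.

```shell
#!/bin/sh
# report_dirty (hypothetical name): print every regular file among the
# arguments that has no line matching "clean", ignoring case.
report_dirty() {
    for f in "$@"; do
        [ -f "$f" ] || continue        # skip directories and unmatched globs
        # grep -q prints nothing and succeeds only on a match; a file that
        # cannot be read also makes grep fail, so it is reported as well.
        grep -q -i -- clean "$f" || printf '%s\n' "$f"
    done
}

# Run on the command-line arguments, if any were given.
if [ "$#" -gt 0 ]; then
    report_dirty "$@"
fi
```

Saved as, say, `report_dirty.sh`, it would be run as `sh report_dirty.sh *.txt`.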

Zaid
  • Thanks Zaid, but my system does not support the Finder module. We're using an old perl version, v5.8.5 :( – Grace Jan 30 '13 at 06:08
  • @Grace : It should work on your Perl version. You may not have it installed, which is a different story. If you have rights you can install it using `cpan File::Finder`. If you don't, I'll refer you to [how you can work around it](http://stackoverflow.com/q/2526804/133939) – Zaid Jan 30 '13 at 06:30
  • File::Finder is a wrapper to File::Find, it says in the documentation. File::Find is a core module in perl 5, so why not just use that? – TLP Jan 30 '13 at 06:31
  • @TLP : Convenience more than anything. `File::Find` can be quite unwieldy for a requirement like this (imagine what `\&wanted` would look like here) – Zaid Jan 30 '13 at 06:34
  • `perl -MFile::Find -lwe'find(sub { -f && /\.txt$/ && push(@ARGV, $File::Find::name) }, shift); undef $/; while (<>) { print $ARGV if !/clean/i }' /path/to/dir` perhaps – TLP Jan 30 '13 at 06:41
  • @TLP : Not sure about whether the `<>` will work. I'd `open my $fh, $File::Find::name or die` and `close $fh` inside the sub as well. Regardless, it's quite a lot of code for what is essentially a simple requirement – Zaid Jan 30 '13 at 06:46
  • Thanks for all the suggestions, but I somehow still fail to get it done. Looking for the string pattern is easy, but I can't believe looking for the inverse would cause me this much trouble :( – Grace Jan 30 '13 at 06:53
  • @Grace : If you show us what you have so far it'll be much easier to pinpoint the issue – Zaid Jan 30 '13 at 06:55
  • @Grace Well, you know, if shell quoting is screwing you up, you can always just put the code in a file, and run it without code and `-e` switch. E.g. `perl -0777 -nlw file.pl *.txt` – TLP Jan 30 '13 at 06:56
  • I have no example, but I can describe the case: I have a list of files, each containing numerous lines. A clean file contains the word "clean". A dirty file does not contain "clean", and nothing else in it indicates that it is dirty. So whenever "clean" is not detected in a file, I'll categorize it as a dirty file, and I would like to report its file name. – Grace Jan 30 '13 at 07:04
  • @Grace : This is much clearer than your original posting. Please edit the question and put your new wording in. – Zaid Jan 30 '13 at 07:07
  • Thanks Zaid, your edited code solved my problem. Could you explain `$/` and `undef` separately? Also, if you don't mind, I'm interested in other ways to get the same result without slurping the whole file, in case a file is too big. I also want to take this chance to thank all of you who responded; I really appreciate your help! – Grace Jan 30 '13 at 08:08
  • This is exactly what my solution did too, except in fewer lines of code. :) – TLP Jan 30 '13 at 10:04
  • @TLP, Thank you TLP, maybe I'm too dumb to have noticed that. I appreciate your help! – Grace Jan 31 '13 at 05:12
0
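#!/usr/bin/perl

use strict;
use warnings;

# The list file holds one file name per line.
open(my $list, '<', '<file_list_file>') or die "Cannot open list file: $!";
while (my $filename = <$list>)
{
    chomp $filename;                          # drop the trailing newline
    my $flag = 0;
    open(my $fh, '<', $filename)
        or do { warn "Cannot open $filename: $!"; next; };
    while (<$fh>)
    {
        if (/<your_string>/) { $flag = 1; last; }   # found: stop reading
    }
    close($fh);
    print "$filename\n" if !$flag;            # no match: report the file
}
close($list);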
Vijay
0

One way to do it:

ls *.txt | grep -v "$(grep -l clean *.txt)"
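
Where the installed grep has the `-L` option (GNU and BSD grep do, though POSIX does not require it), the non-matching files can be listed directly. A minimal sketch under that assumption: `list_dirty` is a hypothetical helper name, and `-i` is added only to mirror the case-insensitive match used in the Perl one-liner.

```shell
#!/bin/sh
# list_dirty (hypothetical name): list the given files that contain no
# line matching "clean". Relies on grep -L, which POSIX does not require.
list_dirty() {
    grep -L -i -- clean "$@"
}

# Run on the command-line arguments, if any were given.
# The exit status of grep -L differs between grep versions, so only the
# printed list should be relied on.
if [ "$#" -gt 0 ]; then
    list_dirty "$@"
fi
```

Typed directly, the command is just `grep -L -i clean *.txt`. It contains neither `$(...)` nor `!`, the constructs that csh-style shells typically reject with "Illegal variable name" and "Event not found". Unlike the `grep -v` pipeline, it also still lists every file when none of them contains the pattern.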
Guru
  • This gives "Illegal variable name". I don't think I can use "-v" though, since each file contains a lot of lines that don't have "clean" in them. – Grace Jan 30 '13 at 06:26
  • Added the quotes. By the way, -v is applied to the list of files, not to the file contents as such. – Guru Jan 30 '13 at 06:35
  • It still gives "Illegal variable name": > ls * | grep -v "$(grep -l clean *)" Illegal variable name. – Grace Jan 30 '13 at 07:12