
My Perl script searches a directory of file names, using grep to output only file names without the numbers 2-9 in their names. That means, as intended, that file names ending with the number "1" will also be returned. However, I want to use the chop function to output these file names without the "1", but can't figure out how. Perhaps the grep and chop functions can be combined in one line of code to achieve this? Please advise. Thanks.

Here's my Perl script:

#!/usr/bin/perl
use strict;
use warnings;

my $dir = '/Users/jdm/Desktop/xampp/htdocs/cnc/images/plants';
opendir(DIR, $dir);
my @files = grep (/^[^2-9]*\.png\z/,readdir(DIR));

foreach my $file (@files) {
   print "$file\n";
}

Here's the output:

Ilex_verticillata.png
Asarum_canadense1.png
Ageratina_altissima.png
Lonicera_maackii.png
Chelone_obliqua1.png

Here's my desired output with the number "1" removed from the end of file names:

Ilex_verticillata.png
Asarum_canadense.png
Ageratina_altissima.png
Lonicera_maackii.png
Chelone_obliqua.png
Jeff
  • Tip: You should ALWAYS use `use strict; use warnings;` – ikegami Oct 08 '20 at 19:41
  • Tip: Unnecessarily using global vars is a bad practice. Replace `opendir(DIR, $dir)` with `opendir(my $DIR, $dir)` (and replace later instances of `DIR` with `$DIR`). – ikegami Oct 08 '20 at 19:42
  • Tip: `opendir` is very likely to fail. It's best to have at least some minimal error checking. (`opendir(my $dh, $dir) or die("Can't open directory \"$dir\": $!\n");`) – ikegami Oct 08 '20 at 19:43
  • Thanks for the advice! – Jeff Oct 08 '20 at 19:50
  • Should the error checking code be incorporated into the code like this? `opendir(my $DIR, $dir);` `(opendir(my $dh, $dir) or die("Can't open directory \"$dir\": $!\n");)` – Jeff Oct 08 '20 at 20:02
  • @Jeff I show that in my answer (yes) – zdim Oct 08 '20 at 20:17
  • I am confused now. The question implies, in my mind, that you want the `1` removed only when it's at the end of the filename. Is that so? Or do you want it removed from everywhere in the filename? (Then why not filter out 1-9?) – zdim Oct 08 '20 at 20:18
  • I want to find file names with the number "1" at the end of their names, but I want those file names to be output without that "1". For example, I want to find "Asarum_canadense1" in the directory, but I want it output as "Asarum_canadense" without the "1" at the end of the file name. Hope that clarifies. – Jeff Oct 08 '20 at 20:44
  • Alright, that's what I thought. But what about possible `1` in the middle of the name? Say, `hi1_no1.png` --- should this become `hi1_no.png` (with the first `1` kept) or `hi_no.png` (both `1`s removed) ? – zdim Oct 09 '20 at 00:04
  • (I goofed up with that "_why not filter out 1-9_" -- that's not what you want of course) – zdim Oct 09 '20 at 00:05
  • I only want to remove the number "1" from the end of file names, and the map/grep expression you provided does that perfectly. Thank you. – Jeff Oct 09 '20 at 11:56

2 Answers

6

The number 1 to remove is at the end of the name before the extension; this is different from filtering on numbers (2-9) altogether and I wouldn't try to fit it into one operation.

Instead, once you have your filtered list (no 2-9 in names), clip off that 1. Since all names of interest end in .png, you can simply use a substitution

$filename =~ s/1\.png\z/.png/;

and if there is no 1 right before .png the string is left unchanged. If other extensions could be involved, you should use a module (such as the core File::Basename) to break up the filename.
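
As an illustration of the "other extensions" case, here is a minimal sketch using the core File::Basename module. The helper name `strip_trailing_one` and the sample names are hypothetical, and it assumes the extension is whatever follows the last dot:

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: drop a single trailing "1" from the base name,
# whatever the extension happens to be.
sub strip_trailing_one {
    my ($filename) = @_;
    # fileparse returns (base name, directory, suffix); the qr// pattern
    # says what counts as a suffix. The directory part is ignored here.
    my ($base, $dir, $ext) = fileparse($filename, qr/\.[^.]*/);
    $base =~ s/1\z//;
    return "$base$ext";
}

print strip_trailing_one($_), "\n"
    for qw(Asarum_canadense1.png Chelone_obliqua1.jpg Ilex_verticillata.png);
```

For a directory that only holds .png files, the plain substitution above is simpler and does the same job.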

To incorporate this, you can pass grep's output through a map

opendir my $dfh, $dir  or die "Can't open $dir: $!";

my @files = 
    map { s/1\.png\z/.png/r } 
    grep { /^[^2-9]*\.png\z/ } 
    readdir $dfh;

Here I've also introduced a lexical directory handle instead of a bareword glob, and added a check on whether opendir succeeded. The /r modifier (available since Perl 5.14) on the substitution in map is needed so that the resulting string is returned (changed, or unchanged if the regex didn't match) rather than the original being modified in place.
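
To see what /r changes, here is a small sketch run on a hypothetical two-name list rather than a real directory. Without /r, the substitution edits the element in place and map collects its return value (the substitution count, or false) instead of the name:

```perl
use strict;
use warnings;

# Hypothetical sample names; no directory is read here.
my @names = qw(Asarum_canadense1.png Ilex_verticillata.png);

# With /r, s/// returns the resulting string and leaves $_ alone.
my @with_r = map { s/1\.png\z/.png/r } @names;

# Without /r, s/// edits $_ in place and returns the number of
# substitutions made (false when nothing matched), not the string.
# Work on a copy so @names is not modified through the $_ alias.
my @copy      = @names;
my @without_r = map { s/1\.png\z/.png/ } @copy;

print "with /r:    @with_r\n";
print "without /r: @without_r\n";
print "copy after: @copy\n";
```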

This passes over the list of filenames twice, whereas a plain loop would do it in one pass. In principle that may affect performance; however, since every operation here is applied to each element of the list anyway, any difference is minimal.
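
For comparison, the single-pass loop mentioned above might look like the following sketch. The `@entries` list is a hypothetical stand-in for what readdir would return:

```perl
use strict;
use warnings;

# Hypothetical sample list standing in for the output of readdir.
my @entries = qw(. .. Ilex_verticillata.png Asarum_canadense1.png
                 Rosa_rugosa2.png notes.txt);

my @files;
for my $name (@entries) {
    next unless $name =~ /^[^2-9]*\.png\z/;      # same filter as the grep
    push @files, $name =~ s/1\.png\z/.png/r;     # same edit as the map
}

print "$_\n" for @files;
```

The map/grep chain and the loop produce the same list; which one reads better is largely a matter of taste.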

zdim
  • Re "*This passes over the list of filenames twice, though. [...] that may impact performance*": Doing two loops that do one thing each is not meaningfully slower than doing one loop that does two things. – ikegami Oct 08 '20 at 19:46
  • Thanks for your suggestion, zdim! You guys make it look easy!!! – Jeff Oct 08 '20 at 19:48
  • @ikegami Since each operation is done on every element there's indeed not much of a difference. Adjusted text, thank you – zdim Oct 08 '20 at 20:00
5

You could use the following, which removes every 1 from each name in place:

s/1//g for @files;

It's also possible to integrate a solution into your chain using map.

my @files =
   map s/1//rg,
      grep /^[^2-9]*\.png\z/,
         readdir(DIR);
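
Note that this form and an end-anchored substitution only differ when a name contains a 1 somewhere other than right before the extension. A small sketch, using the sample name `hi1_no1.png` that comes up in the comments:

```perl
use strict;
use warnings;

my $name = 'hi1_no1.png';   # sample name taken from the comments

my $all_removed  = $name =~ s/1//gr;             # every "1" removed
my $last_removed = $name =~ s/1\.png\z/.png/r;   # only the "1" before .png

print "$all_removed\n";    # hi_no.png
print "$last_removed\n";   # hi1_no.png
```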
ikegami
  • Wow... the "map" expression works!!! I don't understand how, but it does! Thank you!!! – Jeff Oct 08 '20 at 19:45
  • From what the OP says it appears that they want `1` removed only when it's at the end of the filename, right before extension? They don't exclude the case where there may be more `1`s in the filename (and that seems possible since they first filter on 2-9 only) – zdim Oct 08 '20 at 19:48
  • `map` takes an expression and a list of scalars like `grep`. It applies a transformation to each of those scalars. – ikegami Oct 08 '20 at 19:48
  • @zdim, Probably, but then `^[^2-9]*` is probably "wrong" too. So I stuck with what they said rather than creating an inconsistency. – ikegami Oct 08 '20 at 19:52