1

I am trying to match and replace in multiple files some string using

local $/;
open(FILE, "<error.c");
$document=<FILE>;
close(FILE);
$found=0;
while($document=~s/([a-z_]+)\.h/$1_new\.h/gs){
    $found=$found+1;
};
open(FILE, ">error.c");
print FILE "$document";
close(FILE);

It enters an endless loop, because the result of the substitution is matched again by the regular expression searched for. But shouldn't this be avoided by the s///g construct?

EDIT:

I found that a foreach loop does not do exactly what I want either (it replaces all occurrences, but prints only one of them). The reason seems to be that Perl's substitution and search operators behave quite differently in foreach() and while() constructs. To replace in multiple files and also output every individual replacement, I came up with the following body:

# mandatory user inputs
my @files;
my $subs;
my $regex;

# additional user inputs
my $fileregex = '.*';
my $retval = 0;
my $opt_testonly=0;

foreach my $file (@files){

    print "FILE: $file\n";
    if(not($file =~ /$fileregex/)){
        print "filename does not match regular expression for filenames\n";
        next;
    }

    # read file
    local $/; 
    if(not(open(FILE, "<$file"))){ 
        print STDERR "ERROR: could not open file\n"; 
        $retval = 1; 
        next; 
    };
    my $string=<FILE>; 
    close(FILE); 

    my @locations_orig;
    my @matches_orig;
    my @results_orig;

    # find matches
    while ($string =~ /$regex/g) {
        push @locations_orig, [ $-[0], $+[0] ];
        push @matches_orig, $&;
        my $result = eval("\"$subs\"");
        push @results_orig, $result;
        print "MATCH: ".$&." --> ".$result." @[".$-[0].",".$+[0]."]\n";
    }

    # reverse order
    my @locations = reverse(@locations_orig);
    my @matches = reverse(@matches_orig);
    my @results = reverse(@results_orig);

    # number of matches
    my $length=$#matches+1;
    my $count;

    # replace matches
    for($count=0;$count<$length;$count=$count+1){
        substr($string, $locations[$count][0], $locations[$count][1]-$locations[$count][0]) = $results[$count];
    }

    # write file
    if(not($opt_testonly) and $length>0){
        open(FILE, ">$file"); print FILE $string; close(FILE);
    }

}

exit $retval;

It first reads the file and builds lists of the matches, their positions, and the replacement texts (printing each match). Second, it replaces all occurrences starting from the end of the string, so that the positions of earlier matches do not change. Finally, if matches were found, it writes the string back to the file. It can surely be made more elegant, but it does what I want.
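The collect-then-replace-from-the-end idea described above can be sketched more compactly. This is only an illustration, not the code from the question: the sub name replace_from_end and the callback parameter are hypothetical, and the sketch assumes the pattern has at most one capture group, which is passed to the callback as its only argument.

```perl
use strict;
use warnings;

# Collect [start, length, replacement] for every match, then apply the
# edits from the end of the string so that earlier offsets stay valid.
# replace_from_end and $make_replacement are hypothetical names.
sub replace_from_end {
    my ($text, $regex, $make_replacement) = @_;
    my @edits;
    while ($text =~ /$regex/g) {
        # @- and @+ hold the start and end offsets of the last match.
        push @edits, [ $-[0], $+[0] - $-[0], $make_replacement->($1) ];
    }
    for my $edit (reverse @edits) {
        my ($start, $length, $replacement) = @$edit;
        substr($text, $start, $length, $replacement);   # 4-argument substr
    }
    return ($text, scalar @edits);
}

my ($result, $n) = replace_from_end(
    'a.h and bb.h',
    qr/([a-z_]+)\.h/,
    sub { my ($name) = @_; return "${name}_new.h" },
);
print "$n replacements: $result\n";   # 2 replacements: a_new.h and bb_new.h
```

Passing a code reference instead of a string avoids the string eval, at the cost of not being able to take the replacement text directly from user input.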

highsciguy
  • ... If I'm not mistaken, the `s///g` construct does *all* your replacements in one go, not one at a time. You shouldn't need a loop there at all. In other news: Why not `sed`? – FrankieTheKneeMan May 23 '13 at 15:10
  • Yes, you are absolutely right. The reason for the loop is that I want to count the number of matches as indicated (and possibly in future output what was matched). I do not use sed because I want exact perl syntax. The code will be part of a shell script. – highsciguy May 23 '13 at 15:14
  • To get the number of matches, you can do `$num_matches = ($data =~ s/([a-z_]+)\.h/$1_new\.h/g)` – Aleks G May 23 '13 at 15:19
  • Nice one, @AleksG. I was going to point him to http://stackoverflow.com/questions/1849329/is-there-a-perl-shortcut-to-count-the-number-of-matches-in-a-string – FrankieTheKneeMan May 23 '13 at 15:27
  • Thanks! The output of matches cannot be implemented this way, can it? Mainly I am confused, because I always thought that a `while(s///g){}` steps forward in the substitution process, i.e. will not replace previous matches again, or is this just for `while(m//g){}` or for `foreach`? – highsciguy May 23 '13 at 15:28
  • Doing some tests, I found that indeed simply replacing `while` with `foreach` in the above code will do. That will be my choice because it gives me the flexibility to output the matches. Obviously these two behave differently in this respect, other than you might expect. Length does not matter much here because I shall use it in a script. – highsciguy May 23 '13 at 15:45
  • `s///g` *does* step forward in the substitution process. When it is finished, it will return a _true_ value, which will cause the `while` to go into another iteration. – Squeezy May 23 '13 at 17:10
  • I see, `foreach` will not depend on the return value. – highsciguy May 23 '13 at 17:51

3 Answers

3

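The replacement $1_new.h still matches ([a-z_]+)\.h. It enters an endless loop because you use while there: every pass finds the names it has just rewritten, substitutes them again, and returns a true value. With the s///g construct, ONE iteration will replace EVERY occurrence in the string.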
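To see the effect without hanging the script, here is a small sketch that runs the same loop but stops after a fixed number of passes. The sub name capped_passes and the pass limit are made up for the demonstration.

```perl
use strict;
use warnings;

# Run the loop from the question, but give up after $max_passes passes.
# capped_passes is a hypothetical helper used only for this demonstration.
sub capped_passes {
    my ($text, $max_passes) = @_;
    my $passes = 0;
    while ($text =~ s/([a-z_]+)\.h/${1}_new.h/g) {
        last if ++$passes >= $max_passes;
    }
    return ($text, $passes);
}

my ($out, $passes) = capped_passes('#include "error.h"', 3);
print "$passes passes: $out\n";   # 3 passes: #include "error_new_new_new.h"
```

Each pass appends another _new, so the condition never becomes false on its own.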

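To count the replacements, use the return value of the substitution:

$replacements = ($document =~ s/([a-z_]+)\.h/$1_new\.h/gs);

$replacements will contain the number of replaced matches (or an empty string if nothing matched). Do not put () = in front of the substitution: s/// returns a single value even in list context, so that idiom always yields 1.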

If you essentially just want the matches, not the replacements:

@matches = $document =~ /([a-z_]+)\.h/gs;

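In list context this returns the captured group of every match, i.e. each header name without the .h suffix. You can then take $replacements = scalar @matches to obtain their count.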
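If you need both the count and a report of every individual replacement, one option is the /e modifier, which evaluates the replacement as code on each match. The following is a sketch under that assumption; the sub name rename_headers and the log layout are hypothetical.

```perl
use strict;
use warnings;

# Replace every header name in a single s///ge pass and record each
# (old, new) pair. rename_headers is a hypothetical name.
sub rename_headers {
    my ($text) = @_;
    my @log;
    my $count = ($text =~ s{([a-z_]+)\.h}{
        my $new = "${1}_new.h";
        push @log, [ "$1.h", $new ];   # remember what was replaced by what
        $new;                          # value of the block is the replacement
    }ge);
    return ($text, $count || 0, \@log);
}

my ($new_text, $n, $log) = rename_headers(qq{#include "error.h"\n#include "my_io.h"\n});
print "MATCH: $_->[0] --> $_->[1]\n" for @$log;
print "$n replacements\n";
```

Because everything happens in one pass, the freshly inserted _new.h names are never scanned again.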

Squeezy
1

I'd say you're over-engineering this. I did this in the past with:

perl -i -p -e 's/([a-z_]+)\.h/$1_new\.h/g' error.c

This works correctly even when the replacement text itself matches the pattern, because a single s///g pass never rescans what it has already substituted.
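Since the code is meant to live inside a larger script, the same in-place editing can be done from within Perl by localizing $^I and @ARGV. This is a rough sketch, not a drop-in replacement: the sub name edit_in_place and the .bak backup suffix are assumptions, and it works line by line rather than on the whole file at once.

```perl
use strict;
use warnings;

# Script equivalent of "perl -i.bak -p -e ...": edit each file in place
# and return the total number of substitutions.
# edit_in_place and the '.bak' suffix are hypothetical choices.
sub edit_in_place {
    my (@files) = @_;
    return 0 unless @files;        # with an empty @ARGV, <> would read STDIN
    my $count = 0;
    local ($^I, @ARGV) = ('.bak', @files);
    while (my $line = <>) {
        $count += ($line =~ s/([a-z_]+)\.h/${1}_new.h/g) || 0;
        print $line;               # goes to the file currently being rewritten
    }
    return $count;
}
```

A call such as edit_in_place('error.c') would then rewrite error.c and leave the original in error.c.bak.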

Aleks G
0

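The /g option is like a loop in itself. If you want to keep the loop, I think you want this:

while($document=~s/([a-z_]+)(?<!_new)\.h/$1_new\.h/s){
    $found=$found+1;
};

Because you are replacing the match with itself and more, you need a negative lookbehind assertion: (?<!_new) rejects names that already end in _new. A negative lookahead placed just before \.h cannot do this, because the text following that position is always .h.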

k-h