Earlier today I posted a similar question, whose solution led to a new problem.

I want Perl to capture the comments in a text, store them in an array, and replace them with new numbered comments. Say the original $txt is:

//first comment
this is a statement //second comment
//third comment
more statements //fourth comment

I want to push the 4 comments into an array and get a new $txt like:

//foo_0
this is a statement //foo_1
//foo_2
more statements //foo_3

I tried the following Perl:

$i=0;
$j=0;
#while ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/gs) {
#while ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/s) {
#foreach ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/gs) {
foreach ($txt =~ s/(\/\/.*?\n)/\/\/foo_$i\n/s) {
        if(defined $1) {
                push (@comments, $1);
                print " \$i=$i\n";
                $i++
                }
        print " \$j=$j\n";
        $j++;
        }

print "after search & replace, we have \$txt:\n";
print $txt;

foreach (0..$#comments) {
        print "\@comments[$_]= @comments[$_]";
        }

I tried "while/foreach (... s///gs)" in four flavors (three of them are commented out above), but none of them did what I want.

The "foreach" statement works on the text only once; worse, the "while" statement enters an endless loop. It seems the new "//foo_xx" text is put back into the string and matched again by later searches, so the iteration never ends. It's strange that such a seemingly simple search-and-replace would get mired in an endless loop. Or is there some obvious trick that I don't know of?

BTW, I already went through the post by highsciguy. For him, "simply replacing while with foreach in the above code will do"; but for me the foreach just does not work, and I don't know why.

Does anyone have any ideas to help me with this? Thanks~

katyusza
  • You should always use [`use strict; use warnings;`](http://stackoverflow.com/questions/8023959/why-use-strict-and-warnings), because knowing about your mistakes is better than not knowing about them. – TLP Feb 02 '16 at 13:22
  • Yeah, thanks for your advice; I just enjoyed the loose & flexible syntax of Perl and never realized the importance of the `strict` and `warnings` stuff. I'll try it. – katyusza Feb 02 '16 at 16:42
  • When you start using it, you will start to understand what you are doing, and you will understand why things do not work sometimes. Honestly, working without them turned on is like working blindfolded. – TLP Feb 02 '16 at 17:01
  • Right. I'll try to stick to it. Thanks~ – katyusza Feb 03 '16 at 01:31

2 Answers

I'd tackle it a bit differently - a while loop to read a filehandle line by line, and 'grab' all the comment lines out of it.

Something like this:

#!/usr/bin/perl

use warnings;
use strict;

my @comments; 

#iterate stdin or filename specified on command line
while ( <> ) { 
   #replace anything starting with // with foo_nn
   #where nn is current number of comments. 
   s,//(.*),"//foo_".@comments,e && push (@comments, $1 );
   #$1 is the contents of the capture group - the comment text we replaced
   #stuff it into @comments

   #print the current line (altered by the above)
   print;
}
#print the comments. 
print "Comments:\n", join "\n", @comments;

This doesn't address duplicates, and it will break if you've got // inside quotes or something similar, but it does work for your example. The while loop iterates over a filehandle, line by line. If you've already got your text blob in a scalar, you can accomplish the same thing with foreach ( split ( "\n", $text ) ) {

Output:

//foo_0
this is a statement //foo_1
//foo_2
more statements //foo_3
Comments:
first comment
second comment
third comment
fourth comment
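
For the case where the text is already in a scalar, the `foreach ( split ... )` variant mentioned above might look roughly like this. It is only a sketch: the names `$text`, `@lines` and `$new_text` are made up for illustration, and it pushes onto `@comments` only when the substitution really matched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text from the question, held in a scalar instead of a file.
my $text = join "\n",
    '//first comment',
    'this is a statement //second comment',
    '//third comment',
    'more statements //fourth comment';

my @comments;
my @lines;    # hypothetical buffer for the rewritten lines

foreach my $line ( split /\n/, $text ) {
    # /e evaluates the replacement as Perl code; @comments in scalar
    # context is the number of comments collected so far.
    if ( $line =~ s{//(.*)}{"//foo_" . @comments}e ) {
        push @comments, $1;    # $1 is the comment text without the //
    }
    push @lines, $line;
}

my $new_text = join "\n", @lines;
print "$new_text\n";
print "Comments:\n", join( "\n", @comments ), "\n";
```

Looping over a copy (`$line`) leaves `$text` untouched, so the original is still around if you need it.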
Sobrique
  • What does `.@comments,e;` do? – 123 Feb 02 '16 at 14:41
  • The `e` modifier makes `s///` evaluate the replacement as a Perl expression instead of treating it as a plain string. The `.` concatenates, which puts `@comments` in scalar context, so it yields the number of elements. Thus if `@comments` has 4 elements, the replacement is `//foo_4`. – Sobrique Feb 02 '16 at 14:46
  • @syck - it worked with the sample data, but yes - it seems to do that if there's a blank line (presumably because `$1` is still set from an earlier match even when the current line doesn't match). Amended. – Sobrique Feb 02 '16 at 14:48
  • And it's not even possible to unset it – syck Feb 02 '16 at 15:00
  • Nice, doing the search-and-replace ONE line at a time really breaks the endless loop! But why is this? Using s/// on the entire text (with newlines) falls into an endless search loop, while on ONE line it doesn't. Why? – katyusza Feb 02 '16 at 16:55
  • A `while` loop iterates until its condition is false. If you're replacing one thing that the regex matches (`//comment`) with another thing it also matches (`//foo`), then the regex will _continue_ to match, treating `//foo` as a new comment to add to the list. A `foreach` loop shouldn't behave in quite the same way, though. – Sobrique Feb 02 '16 at 17:30
  • Does this mean the `while` or `foreach` keyword outside of `{ s/// }` has an effect on the behavior of s///? We're both using `s/xx/yy/`; the only difference is that you run it on a line basis, while I run it on a file basis (multiline). So it seems that if `while` gets the entire file, then `s/xx/yy/s` falls into an endless loop (my code); but if `while` gets only one line, then `s/xx/yy/s` is safe and good (your code). Am I guessing right? – katyusza Feb 03 '16 at 02:30
  • My while exit condition is hitting end of file. Yours is when there are no comments left. This would be fine if you weren't replacing one comment with another. – Sobrique Feb 03 '16 at 07:26
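
To make the point in these comments concrete, here is a small sketch of why `while ( $txt =~ s/// )` never finishes on its own. The `$limit` cap is purely hypothetical, added so the demo terminates; the pattern is a simplified stand-in for the one in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $txt = "//first comment\nstatement //second comment\n";

my $i     = 0;
my $limit = 5;    # hypothetical safety cap so this demo terminates

# Without /g, s/// rewrites only the FIRST match and returns true.
# The replacement "//foo_N" still matches m{//.*}, so the loop
# condition never becomes false; only the cap stops it.
while ( $i < $limit && $txt =~ s{//.*}{//foo_$i} ) {
    $i++;
}

print "iterations: $i\n";
print $txt;
```

Every pass rewrites the same first comment again and the second one is never reached. In the line-by-line version the loop condition is "is there another line to read", which has nothing to do with whether the substitution matched, so it ends at end of file.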

Iterate over every line of the text, and if replacement is successful, store the comment:

#!/usr/bin/perl

use strict;
use warnings;

my $txt = <<END;                        # define text
//first comment
this is a statement //second comment
//third comment
more statements //fourth comment
END

my @comments = ();
my $i = 0;
foreach (split qq(\n), $txt) {          # iterate over input lines
        if (s&(//.*)&//foo_$i&) {       # do we match?
                push @comments, $1;     # then push comment
                $i++;                   # and increase counter
                }
        print;                          # print modified row
        print qq(\n);                   # print newline
        }

print qq(\nComments:\n);
foreach (@comments) {
        print;                          # print the comment
        print qq(\n);                   # print newline
        }
syck
  • Thanks~ :-) I just don't understand why breaking the input file into individual lines prevents s/// from getting stuck in an infinite loop. And besides, what is the benefit of qq() over ""? – katyusza Feb 02 '16 at 17:34
  • As @sobrique stated above, using `while` will re-match the modified comment. Use `@matches = $txt =~ //g` to be on the safe side; `@matches` can then be used to iterate over, or whatever. Using qq() lets you insert quotation marks without escaping them (which does not apply here); the rest is just personal taste. Sorry for my late reply, I was away from SO for a couple of days. – syck Feb 08 '16 at 18:27
  • Got that, thanks for your careful explanations! And sorry for this late reply; the Spring Festival has put everything in the country into a halt for quite a long time. – katyusza Feb 17 '16 at 02:16
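
A rough sketch of the list-context idea from the comments above, applied to the whole text at once. The pattern `(//.*)` is an assumption borrowed from the answer's code (the comment leaves the pattern blank), and `$n` and `$new_txt` are names made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $txt = <<'END';
//first comment
this is a statement //second comment
//third comment
more statements //fourth comment
END

# In list context, m//g returns every capture at once, so there is
# no loop condition that could keep re-matching.
my @matches = $txt =~ m{(//.*)}g;

# One s///g pass over a copy of the text. /g resumes scanning after
# each replacement, so inserted text is never matched again, and /e
# evaluates the replacement as code so the counter advances per match.
my $n = 0;
( my $new_txt = $txt ) =~ s{//.*}{ "//foo_" . $n++ }ge;

print $new_txt;
print "$_\n" for @matches;
```

Because `.` does not match a newline without `/s`, each match stops at the end of its line, which is why this works on the multi-line string without splitting it.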