21

I have a regex:

/abc(def)ghi(jkl)mno(pqr)/igs

How would I capture the result of each set of parentheses into 3 different variables, one per set? Right now I am using one array to capture all the results. They come out in sequence, but then I have to parse them, and the list could be huge.

@results = ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs);
ekad
Incognito

6 Answers

18

Your question is a bit ambiguous to me, but I think you want to do something like this:

my (@first, @second, @third);
while( my ($first, $second, $third) = $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    push @first, $first;
    push @second, $second;
    push @third, $third;
}
Leon Timmermans
  • 3
    that's a bit long winded. when captured, you can use back references – ghostdog74 Feb 14 '10 at 01:23
  • 4
    ghostdog74: that's a matter of taste. If you really name your variables $first and $second then you might as well use $1 and $2 indeed, but if you give them more descriptive names then it can improve readability to do it like this. – Leon Timmermans Feb 14 '10 at 01:26
  • 2
    -1. I have to agree with ghostdog74; capturing to the $1 .. series of variables is just cleaner in modern Perl. While you *can* do it, doesn't mean it's probably the best way to do it. – Robert P Feb 14 '10 at 02:33
  • 2
    @leon ,true, but since he is going to put them in arrays anyway, what you really care is the array name. who doesn't know what $1, $2 .. is? – ghostdog74 Feb 14 '10 at 02:36
  • This answer is unfortunately incorrect. The `while` loop in this answer will loop infinitely if `$string` matches (due to the list context inside the `while` expression). – YenForYang Jun 09 '21 at 20:41
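A minimal sketch of a variant that avoids the infinite loop described in the last comment: the match is evaluated in scalar context, so `/g` resumes after the previous match and returns false once nothing is left. The sample string is assumed for illustration; the array names follow the answer above.

```perl
use strict;
use warnings;

# Sample input assumed for illustration; it contains two matches.
my $string = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

my (@first, @second, @third);

# Scalar-context match: each iteration finds the next match and sets
# $1, $2, $3; the loop ends when no further match exists.
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    my ($first, $second, $third) = ($1, $2, $3);
    push @first,  $first;
    push @second, $second;
    push @third,  $third;
}

print "@first | @second | @third\n";   # prints "def def | jkl jkl | pqr pqr"
```

Copying `$1`, `$2`, `$3` into named lexicals is optional, but it keeps the descriptive names the answer argues for.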
10

Starting with 5.10, you can use named capture buffers as well:

#!/usr/bin/perl

use strict; use warnings;

my %data;

my $s = 'abcdefghijklmnopqr';

if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
    push @{ $data{$_} }, $+{$_} for keys %+;
}

use Data::Dumper;
print Dumper \%data;

Output:

$VAR1 = {
          'first' => [
                       'def'
                     ],
          'second' => [
                        'jkl'
                      ],
          'third' => [
                       'pqr'
                     ]
        };

For earlier versions, you can use the following which avoids having to add a line for each captured buffer:

#!/usr/bin/perl

use strict; use warnings;

my $s = 'abcdefghijklmnopqr';

my @arrays = \ my(@first, @second, @third);

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
}

use Data::Dumper;
print Dumper @arrays;

Output:

$VAR1 = [
          'def'
        ];
$VAR2 = [
          'jkl'
        ];
$VAR3 = [
          'pqr'
        ];

But I like keeping related data in a single data structure, so it is best to go back to using a hash. This does require an auxiliary array, however:

my %data;
my @keys = qw( first second third );

if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
}

Or, if the names of the variables really are first, second etc, or if the names of the buffers don't matter but only order does, you can use:

my @data;
if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
    push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
}
mklement0
Sinan Ünür
  • Are you just trying to do a deep copy in that first example? I'd just pull out Storable's dclone. Either that, or your example needs some looping to build up the values you store in `$data`. :) – brian d foy Feb 22 '10 at 00:29
  • @brian I was thinking of parsing a file where each line gives you a `first`, a `second`, and a `third` value, and storing those values in their own arrays. Compare with Leon Timmermans's example ( http://stackoverflow.com/questions/2259784/how-can-i-store-captures-from-a-perl-regular-expression-into-separate-variables/2259795#2259795 ) – Sinan Ünür Feb 22 '10 at 01:24
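The line-by-line use described in the comment above could look roughly like this. It is only a sketch: the `@lines` array is a hypothetical stand-in for lines read from a file, and named captures require Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical input lines, standing in for lines read from a file.
my @lines = ('abcdefghijklmnopqr', 'ABCDEFGHIJKLMNOPQR');

my %data;
for my $line (@lines) {
    # /x lets the pattern be spaced out; /i makes it case-insensitive.
    if ($line =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/xi) {
        # %+ holds the named captures of the last successful match.
        push @{ $data{$_} }, $+{$_} for keys %+;
    }
}

# Each key now holds one value per matching line.
print "$_: @{ $data{$_} }\n" for sort keys %data;
```

With the two sample lines, each of `first`, `second`, and `third` ends up with two entries, one per line.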
3

An alternate way of doing it would look like ghostdog74's answer, but using an array that stores hash references:

my @results;
while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
    my ($key1, $key2, $key3) = ($1, $2, $3);
    push @results, { 
        key1 => $key1,
        key2 => $key2,
        key3 => $key3,
    };
}

# do something with it

foreach my $result (@results) {
    print "$result->{key1}, $result->{key2}, $result->{key3}\n";
}

The main advantage here is that it uses a single data structure and keeps the loop nice and readable.

Robert P
2

@OP, when parentheses capture text, you can use the variables $1, $2, and so on. These are backreferences.

$string="zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss";
while ($string =~ /abc(def)ghi(jkl)mno(pqr)/isg) {
    print "$1 $2 $3\n";
}

Output:

$ perl perl.pl
def jkl pqr
def jkl pqr
ghostdog74
  • 3
    Note his use of the g modifier. He's doing a global match, so I assume he wants to store multiple matches. – Leon Timmermans Feb 14 '10 at 01:29
  • 2
    Also, $1 and so on are not "backreferences", they are captures. Parentheses and backreferences are *related*, however. – jrockway Feb 14 '10 at 10:11
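To illustrate the distinction drawn in the comment above, here is a small sketch with a made-up sample string: `\1` inside the pattern is a backreference, while `$1` read after a successful match is a capture variable.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen so that a word repeats.
my $text = 'hello hello world';

my $repeated;
# \1 is a backreference: within the pattern it must match the same
# text that group 1 has just captured.
if ($text =~ /(\w+) \1/) {
    # $1 is a capture variable: it holds what group 1 matched.
    $repeated = $1;
}

print "repeated word: $repeated\n";   # prints "repeated word: hello"
```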
0

You could use three different regexes, each focusing on a specific group. Obviously, you would prefer to assign different groups to different arrays within a single regex, but I think your only option is to split the regex up.
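One possible reading of this suggestion, as a sketch only: each regex keeps the full pattern but captures a single group, and a list-context match with `/g` returns every capture at once. The sample string is assumed for illustration.

```perl
use strict;
use warnings;

# Sample input assumed for illustration; it contains two matches.
my $string = 'zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss';

# With exactly one capture group, a list-context /g match returns
# the captured text of every match in the string.
my @first  = $string =~ /abc(def)ghijklmnopqr/ig;
my @second = $string =~ /abcdefghi(jkl)mnopqr/ig;
my @third  = $string =~ /abcdefghijklmno(pqr)/ig;

print "@first | @second | @third\n";   # prints "def def | jkl jkl | pqr pqr"
```

The trade-off is that the string is scanned three times, which may matter for very large inputs.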

joejoeson
0

You can write a regex containing named capture groups. You do this with the ?<myvar> construct at the beginning of the capture group:

/(?<myvar>[0-9]+)/

You may then refer to those named capture groups using a $+{myvar} form.

Here is a contrived example:

perl -ne 'print $+{myvar} . "\n" if /^systemd-(?<myvar>[^:]+)/' /etc/passwd

Given a typical password file, it pulls out the systemd users and returns the names less the systemd prefix. It uses a capture group named myvar. This is just an example thrown together to illustrate the use of capture group variables.

starfry