
What does $1 mean in Perl? Further, what does $2 mean? How many $number variables are there?

brian d foy
Chad DeShon
  • You might do well to check out something like _Learning Perl_ or another introduction to Perl that explains the very basics of the language. – brian d foy Jun 24 '09 at 17:26
  • Now Brian, why would you be recommending that book? The Monks are a charity after all.... – John White Jul 15 '17 at 17:18

8 Answers


The $number variables contain the parts of the string that matched the capture groups ( ... ) in the pattern for your last regex match if the match was successful.

For example, take the following string:

$text = "the quick brown fox jumps over the lazy dog.";

After the statement

$text =~ m/ (b.+?) /;

$1 equals the text "brown".
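
A pattern can hold more than one capture group, and each one fills the next numbered variable. The following is only a sketch built on the same sentence; the variable names ($first, $second, @after_the) are illustrative assumptions, not part of the answer above. It also shows that a match with /g in list context returns every capture it finds.

```perl
use strict;
use warnings;

my $text = "the quick brown fox jumps over the lazy dog.";

# Two groups in one pattern: $1 is filled by the first (...), $2 by the second.
my ($first, $second);
if ($text =~ m/ (q\w+) (b\w+) /) {
    ($first, $second) = ($1, $2);
    print "first=$first second=$second\n";    # first=quick second=brown
}

# In list context with /g, the match returns the capture from every hit.
my @after_the = $text =~ m/\bthe (\w+)/g;
print "after 'the': @after_the\n";            # after 'the': quick lazy
```

Copying $1 and $2 into named variables straight away keeps the values safe from any later match that overwrites them.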

user13107
rlbond
  • What if there is more than one match? Can't we somehow get all the matches? – user1289 Jan 21 '16 at 13:45
  • @user1289 that's what the other numbers ($2, $3, ...) are for. – Alecg_O Feb 22 '19 at 22:27
  • @user1289 - Your regex needs multiple sets of parentheses to capture multiple values. A single capture group (e.g. m/ (b.+?) /) won't grab every word in a string that starts with "b"; the example above captures only the first instance. – DemiImp Dec 17 '20 at 16:33
  • I think it's also important to mention that all capture variables ($1, $2, etc.) are dynamically scoped to the enclosing block. Once you leave the block, they revert to whatever they held before it (often undef). – DemiImp Dec 17 '20 at 16:36
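
The scoping behaviour raised in the last comment can be checked with a short sketch. It assumes nothing beyond core Perl; the strings and the names $inside and $after are made up for illustration. A successful match in a nested block sets $1 only for that block, and the enclosing block sees its own value again afterwards.

```perl
use strict;
use warnings;

my ($inside, $after);

if ("outer text" =~ /(outer)/) {
    {
        # A successful match in the nested block sets $1 for this block only.
        "inner text" =~ /(inner)/ or die "unexpected: no match";
        $inside = $1;    # 'inner'
    }
    # Back in the enclosing block, $1 refers to the outer match again.
    $after = $1;         # 'outer'
}

print "inside: $inside, after: $after\n";
```
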

The numbered variables hold the text captured by the groups in the last successful match or substitution operator you applied:

my $string = 'abcdefghi';

if ($string =~ /(abc)def(ghi)/) {
    print "I found $1 and $2\n";
}

Always test that the match or substitution was successful before using $1 and so on. Otherwise, you might pick up the leftovers from another operation.

Perl regular expressions are documented in perlre.

Peter Mortensen
Jim Puls
  • If $1 through $9 are always there, what is their value if there were fewer than nine matches? – Chad DeShon Jun 24 '09 at 04:09
  • $1 through $9 are not always there. I think Jim misread the man page. I am quoting the relevant section: "You may have as many parentheses as you wish. If you have more than 9 substrings, the variables $10, $11, ... refer to the corresponding substring. Within the pattern, \10, \11, etc. refer back to substrings if there have been at least that many left parentheses before the backreference. Otherwise (for backward compatibility) \10 is the same as \010, a backspace, and \11 the same as \011, a tab. And so on. (\1 through \9 are always backreferences.)" – Alan Haggai Alavi Jun 24 '09 at 04:37
  • -1 for too many half-truths. They're the result of captures, not the match as a whole. They're only set on successful matches, which means you need to check whether or not the match was successful before using them. As Alan already pointed out, you've confused the special case behavior of the alternate backslash notation. – Michael Carman Jun 24 '09 at 13:45
  • I've completely replaced the answer with only the truth. Almost everything in the original answer was wrong. – brian d foy Jun 24 '09 at 17:23

$1, $2, etc. contain the values of the captures from the last successful match. It's important to check whether the match succeeded before accessing them, for example:

if ( $var =~ m/( )/ ) {
    # use $1 etc...
}

An example of the problem - $1 contains 'Quick' in both print statements below:

#!/usr/bin/perl

'Quick brown fox' =~ m{ ( quick ) }ix;
print "Found: $1\n";

'Lazy dog' =~ m{ ( quick ) }ix;
print "Found: $1\n";
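
A guarded variant of the script above avoids the stale value. This is a sketch only; the loop and the @found array are assumptions added for illustration. $1 is read solely inside the branch where the match succeeded.

```perl
use strict;
use warnings;

my @found;
for my $phrase ('Quick brown fox', 'Lazy dog') {
    if ($phrase =~ m{ ( quick ) }ix) {
        # Safe: this branch runs only after a successful match.
        push @found, $1;
        print "Found: $1\n";
    }
    else {
        print "No match in '$phrase'\n";
    }
}
```
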
plusplus

As others have pointed out, the $x are capture variables for regular expressions, allowing you to reference sections of a matched pattern.

Perl (5.10 and later) also supports named captures, which might be easier for humans to remember in some cases.

Given input: 111 222

/(\d+)\s+(\d+)/

$1 is 111

$2 is 222

One could also say:

/(?<myvara>\d+)\s+(?<myvarb>\d+)/

$+{myvara} is 111

$+{myvarb} is 222
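
Put together as a runnable sketch (the $input string and the $first and $second variables are assumptions for illustration), the named groups are read from the %+ hash, while $1 and $2 stay available for the same groups:

```perl
use strict;
use warnings;

my $input = "111 222";
my ($first, $second);

if ($input =~ /(?<myvara>\d+)\s+(?<myvarb>\d+)/) {
    # %+ maps each group name to the text that group captured.
    ($first, $second) = ($+{myvara}, $+{myvarb});
    print "myvara=$first myvarb=$second (\$1=$1, \$2=$2)\n";
}
```
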

Peter Mortensen
Einstein

These are called "match variables". As previously mentioned, they contain the text captured by the parenthesized groups in your last successful regular expression match.

More information is in Essential Perl. (Ctrl + F for 'Match Variables' to find the corresponding section.)

Peter Mortensen
John T

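Since you asked about capture groups, you might want to know about $+ too. It holds the text of the highest-numbered capture group that took part in the last successful match, which is useful when you don't know which of several alternatives matched: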

use Data::Dumper;
$text = "hiabc ihabc ads byexx eybxx";
while ($text =~ /(hi|ih)abc|(bye|eyb)xx/igs)
{
    print Dumper $+;
}

OUTPUT:
$VAR1 = 'hi';
$VAR1 = 'ih';
$VAR1 = 'bye';
$VAR1 = 'eyb';

Peter Mortensen
Hemanth Gowda

The numbered variables ($1, $2, ...) are also read-only, so you can't assign a value to them directly:

$1 = 'foo'; print $1;

That raises a run-time error: Modification of a read-only value attempted at script line 1.

You also can't start an ordinary variable name with a digit:

$1foo = 'foo'; print $1foo;

That fails too, this time with a syntax error at compile time.
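
The read-only error can be observed without ending the script by wrapping the assignment in eval. This is a minimal sketch; the $ok and $error names are illustrative assumptions.

```perl
use strict;
use warnings;

# eval traps the run-time error so the script can report it and carry on.
my $ok    = eval { $1 = 'foo'; 1 };
my $error = $ok ? '' : $@;

print "Assignment failed: $error" unless $ok;
```
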

user118435

I would suspect that there can be as many as 2**32 -1 numbered match variables, on a 32-bit compiled Perl binary.
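
Whatever the upper limit turns out to be, it is easy to confirm that the numbering does not stop at $9. The sketch below uses an assumed twelve-letter string and builds a pattern with twelve groups; $#+ reports how many groups the last successful match contained.

```perl
use strict;
use warnings;

my $letters = join '', 'a' .. 'l';    # "abcdefghijkl"
my $pattern = '(.)' x 12;             # twelve one-character capture groups

my ($tenth, $twelfth, $group_count);
if ($letters =~ /^$pattern$/) {
    ($tenth, $twelfth) = ($10, $12);
    $group_count = $#+;               # number of groups in the last successful match
    print "\$10=$tenth \$12=$twelfth groups=$group_count\n";
}
```
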

Peter Mortensen
Brad Gilbert