85

Suppose I have:

my $string = "one.two.three.four";

How should I play with context to get the number of times the pattern found a match (3)? Can this be done using a one-liner?

I tried this:

my ($number) = scalar($string=~/\./gi);

I thought that by putting parentheses around $number, I'd force array context, and by the use of scalar, I'd get the count. However, all I get is 1.

Relequestual
Geo

9 Answers

132

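Wrapping the match in scalar() puts the regex itself in scalar context, where a //g match only reports whether it succeeded (hence the 1), which isn't what you want. Instead, put the regex in list context (so it returns every match) and put that into scalar context (to get the number of matches).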

 my $number = () = $string =~ /\./gi;
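
To see the two contexts side by side, here is a minimal sketch using the string from the question. The variable names `$in_scalar` and `$in_list` are only illustrative, not part of the original answer.

```perl
use strict;
use warnings;

my $string = "one.two.three.four";

# Scalar context: m//g only reports whether the next match succeeded.
my $in_scalar = scalar($string =~ /\./g);

# Reset the match position left behind by the scalar-context //g match.
pos($string) = undef;

# List context: assigning to the empty list () makes the match return every hit;
# evaluating that list assignment in scalar context yields the element count.
my $in_list = () = $string =~ /\./g;

print "scalar context: $in_scalar\n";   # 1 (true)
print "list context:   $in_list\n";     # 3
```

The `pos` reset matters only because both matches run on the same string here; a single `= () =` statement on its own does not need it.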
friedo
  • Well, perlsecret does propose "Saturn" as an alternate name. :) – oalders Jan 23 '15 at 22:13
  • Can someone explain this bit of code to me? I'm new to Perl and I'm still not really comfortable with contexts. – Edward Gargan Oct 09 '18 at 22:12
  • The first part, `() = $string =~ /\./gi`, makes the match operator return its results in list context. This is similar to `my @results = $string =~ /\./gi;`. Next, `my $number` is a scalar. Assigning a list assignment to a scalar yields the number of elements on its right-hand side. This is much like `my $count = @some_list`, which returns the length of the array. My answer below is another way of visualizing the behavior here. – Robert P Oct 27 '20 at 16:30
37

I think the clearest way to describe this is to avoid the immediate conversion to scalar. First assign to an array, then use that array in scalar context. That's basically what the = () = idiom does, but without the (rarely used) idiom itself:

my $string = "one.two.three.four";
my @count = $string =~ /\./g;
print scalar @count;
Robert P
24

Also, see perlfaq4:

There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the tr/// function like so:

$string = "ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, tr/// won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers:

$string = "-9 55 48 -2 23 -76 4 14 -44";
while ($string =~ /-\d+/g) { $count++ }
print "There are $count negative numbers in the string";

Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches.

$count = () = $string =~ /-\d+/g;
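
As a rough check that the three approaches agree, the sketch below applies each of them to the string from the question. The variable names `$by_tr`, `$by_loop`, and `$by_list` are made up for this illustration.

```perl
use strict;
use warnings;

my $string = "one.two.three.four";

# 1. tr/// with an empty replacement list counts characters without changing the string.
my $by_tr = ($string =~ tr/.//);

# 2. A while loop around a scalar-context global match.
my $by_loop = 0;
$by_loop++ while $string =~ /\./g;

# 3. A global match in list context, counted via the empty-list assignment.
my $by_list = () = $string =~ /\./g;

print "$by_tr $by_loop $by_list\n";   # 3 3 3
```

Note that inside tr/// the dot is a literal character, while in the regex versions it has to be escaped.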
Robert P
9

Is the following code a one-liner?

print $string =~ s/\./\./g;
Mike
7

Try this:

my $string = "one.two.three.four";
my ($number) = scalar( @{[ $string=~/\./gi ]} );

It returns 3 for me. Building an anonymous array with [...] evaluates the regular expression in list context, and @{...} dereferences that array reference.

zb226
PP.
1

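I noticed that if your regular expression is an alternation of capture groups (e.g. /(K..K)|(V.AK)/gi), then the array produced may contain undefined elements, which are included in the count at the end. Each match returns one element per capture group, and the group that did not take part in the match is undef.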

For example:

my $seq = "TSYCSKSNKRCRRKYGDDDDWWRSQYTTYCSCYTGKSGKTKGGDSCDAYYEAYGKSGKTKGGRNNR";
my $regex = '(K..K)|(V.AK)';
my $count = () = $seq =~ /$regex/gi;
print "$count\n";

This gives a count of 6.

I found the solution in the post "How do I remove all undefs from array?":

my $seq = "TSYCSKSNKRCRRKYGDDDDWWRSQYTTYCSCYTGKSGKTKGGDSCDAYYEAYGKSGKTKGGRNNR";
my $regex = '(K..K)|(V.AK)';
my @count = $seq =~ /$regex/gi;
@count = grep defined, @count; 
my $count = scalar @count;
print "$count\n";

This then gives the correct answer of 3.
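
If the captured text itself is not needed, one possible alternative (a sketch, not taken from the linked post) is to use non-capturing groups, so that each match contributes exactly one element to the list and no undefs appear in the first place:

```perl
use strict;
use warnings;

my $seq = "TSYCSKSNKRCRRKYGDDDDWWRSQYTTYCSCYTGKSGKTKGGDSCDAYYEAYGKSGKTKGGRNNR";

# (?:...) groups without capturing, so //g in list context returns one
# element per match (the whole matched text) instead of one per capture group.
my $regex = qr/(?:K..K)|(?:V.AK)/i;
my $count = () = $seq =~ /$regex/g;
print "$count\n";   # 3
```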

-1

Another way:

my $string = "one.two.three.four";
my @s = split /\./, $string, -1;   # limit of -1 keeps trailing empty fields
print scalar(@s) - 1;              # number of separators = fields - 1
ghostdog74
-1

Friedo's method is: $a = () = $b =~ $c.

But with a substitution, which returns the number of replacements it made, it's possible to simplify this even further to just ($a) = $b =~ $c, like so:

my ($matchcount) = $text =~ s/$findregex/ /gi;

You could then just wrap this up in a function, getMatchCount(), and not worry about it destroying the passed string.

On the other hand, you can add in a swap, which may be a bit more computation, but does not result in altering the string.

my ($matchcount) = $text =~ s/($findregex)/$1/gi;
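
A minimal sketch of what such a getMatchCount() wrapper might look like is shown here. The body is an assumption for illustration; only the function name comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical helper: counts matches by substituting on a private copy,
# so the caller's string is left untouched.
sub getMatchCount {
    my ($text, $findregex) = @_;               # $text is a copy of the argument
    my $count = ($text =~ s/$findregex/ /gi);  # s///g returns the number of substitutions
    return $count || 0;                        # s/// returns an empty string when nothing matched
}

my $original = "one.two.three.four";
my $matches  = getMatchCount($original, qr/\./);
print "$matches\n";    # 3
print "$original\n";   # one.two.three.four
```

Because Perl copies the arguments into `$text` and `$findregex`, the substitution only ever alters the local copy.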
HoldOffHunger
  • Except that this is a substitution, not a match: it will destroy the original string. And it is the same idea as @Mike had 6 years earlier. – fishinear Jan 09 '17 at 11:51
  • @fishinear: This is very different than Mike. He was capable of printing it, but not storing it to a variable. The difference is significant. – HoldOffHunger May 09 '18 at 02:29
  • If you need nondestructive, just s/(regex)/$1/g or /(=regex)//g if you like living dangerously. – android.weasel Jun 19 '18 at 16:13
  • @android.weasel Oh, hey, good point! Updating with that remark. I normally wrap stuff like this in functions, so I myself don't have to worry about the destructability of passed args (not sure which is faster, because now it's doing a swap). But that is useful info, adding! – HoldOffHunger Jun 19 '18 at 16:26
-1
my $count = 0;
my $pos = -1;
while (($pos = index($string, $match, $pos+1)) > -1) {
  $count++;
}

Checked with Benchmark: it's pretty fast.
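
One difference worth knowing: index searches for a literal substring rather than a pattern, and because the loop restarts one character after each hit, it also counts overlapping occurrences, which a global regex match does not. A small sketch with made-up sample data:

```perl
use strict;
use warnings;

my $string = "aaaa";
my $match  = "aa";

# index loop: restarts one character after each hit, so overlapping hits count.
my $by_index = 0;
my $pos = -1;
while (($pos = index($string, $match, $pos + 1)) > -1) {
    $by_index++;
}

# Regex global match: resumes after the end of each hit, so hits never overlap.
# \Q...\E quotes the substring so it is matched literally.
my $by_regex = () = $string =~ /\Q$match\E/g;

print "$by_index $by_regex\n";   # 3 2
```

For single-character targets such as the dot in the question, the two approaches give the same count.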