15

I have the following Perl script counting the number of Fs and Ts in a string:

my $str = "GGGFFEEIIEETTGGG";
my $ft_count = 0;
$ft_count++ while($str =~ m/[FT]/g);
print "$ft_count\n";

Is there a more concise way to get the count (in other words, to combine line 2 and 3)?

cjm
Daniel Standage

4 Answers

28
my $ft_count = $str =~ tr/FT//;

See perlop.

If the REPLACEMENTLIST is empty, the SEARCHLIST is replicated. This latter is useful for counting characters in a class …

  $cnt = $sky =~ tr/*/*/;     # count the stars in $sky
  $cnt = tr/0-9//;            # count the digits in $_
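
As a minimal sketch of the behaviour quoted above, applied to the string from the question: without the /d modifier, an empty REPLACEMENTLIST makes tr/// map each character to itself, so it should only count and leave the string untouched. The variable names are taken from the question.

```perl
use strict;
use warnings;

my $str = "GGGFFEEIIEETTGGG";

# Empty REPLACEMENTLIST and no /d modifier: each character is mapped to
# itself, so tr/// returns the count without changing $str.
my $ft_count = ($str =~ tr/FT//);

print "$ft_count\n";   # 4
print "$str\n";        # GGGFFEEIIEETTGGG
```

Writing tr/FT/FT/ gives the same count; the explicit replacement list is optional.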

Here's a benchmark:

use strict; use warnings;

use Benchmark qw( cmpthese );

my ($x, $y) = ("GGGFFEEIIEETTGGG" x 1000) x 2;

cmpthese -5, {
    'tr' => sub {
        my $cnt = $x =~ tr/FT//;
    },
    'm' => sub {
        my $cnt = ()= $y =~ m/[FT]/g;
    },
};
     Rate     m    tr
m   108/s    --  -99%
tr 8118/s 7440%    --

With ActiveState Perl 5.10.1.1006 on 32-bit Windows XP.

The difference seems to be starker with Strawberry Perl 5.12.1:

C:\Temp> c:\opt\strawberry-5.12.1\perl\bin\perl.exe t.pl
      Rate      m     tr
m   88.8/s     --  -100%
tr 25507/s 28631%     --
Sinan Ünür
  • this should be `my $ft_count = $str =~ tr/FT/FT/;`, otherwise it will remove the characters from `$str` – Hasturkun Sep 28 '10 at 14:38
  • It counts the brackets. Must be `tr/FT//` – Toto Sep 28 '10 at 14:44
  • 3
    Of course tr/// isn't a regex so *technically* it doesn't answer the specific question :-) It's much better than using a regexp though. – ishnid Sep 28 '10 at 14:49
  • 4
    Your benchmark is letting the "m" case get away with just finding the first match, since the regex match is in scalar context. If I fix that line to "my $cnt = () = $y =~ m/[FT]/g;", "tr" ends up around 3000% better than "m" (on my Linux box). Incidentally, the original code is about twice as fast as "m". – aschepler Sep 28 '10 at 15:21
  • 3
    @Sinan +1 for suggesting `tr///`. I think your benchmark has a bug. In order to count replacements with regex, you need an intervening list context: `my $cnt = ()= $y =~ m/[FT]/g;`. When you run it that way, `tr///` is much faster than `m//`. I'm also on v5.10 under ActivePerl. – FMc Sep 28 '10 at 15:24
  • @FM Geez. I could have sworn I had typed everything correctly. Thank you also @aschepler. My sanity has been restored. – Sinan Ünür Sep 28 '10 at 15:44
  • 2
    @Sinan Ünür This is why it is a good idea to have a test section before the benchmark. I normally stuff the lambdas to be tested into a hash, iterate over the hash printing each one's return value, and then perform the benchmark. If any of the values differ, then I know I have a bad benchmark. – Chas. Owens Sep 29 '10 at 14:17
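
One way the "test section before the benchmark" suggestion might look, as a rough sketch rather than anyone's actual script. The names %subs, %result and %distinct are made up for illustration, and the 'while' entry is the loop from the question. Each sub returns its count so the values can be compared before cmpthese runs.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

my $s = "GGGFFEEIIEETTGGG" x 1000;

# Candidate implementations; each returns its count so results can be compared.
my %subs = (
    'tr'    => sub { my $cnt = ($s =~ tr/FT//); return $cnt },
    'm'     => sub { my $cnt = () = $s =~ m/[FT]/g; return $cnt },
    'while' => sub { my $cnt = 0; $cnt++ while $s =~ m/[FT]/g; return $cnt },
);

# Test section: run every candidate once and print what it returns.
my %result = map { $_ => $subs{$_}->() } keys %subs;
print "$_: $result{$_}\n" for sort keys %result;

# If the candidates disagree, the benchmark would compare different work.
my %distinct = map { $_ => 1 } values %result;
die "Candidates disagree; fix them before benchmarking\n"
    if keys(%distinct) != 1;

# Only benchmark once the results agree (about 1 CPU second per candidate).
cmpthese(-1, \%subs);
```

A check like this would have flagged a scalar-context m//g candidate, since it returns 1 instead of the full count.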
9

When the "m" operator has the /g flag AND is executed in list context, it returns a list of matching substrings. So another way to do this would be:

my @ft_matches = $str =~ m/[FT]/g;
my $ft_count = @ft_matches; # count elements of array

But that's still two lines. Another weirder trick that can make it shorter:

my $ft_count = () = $str =~ m/[FT]/g;

The "() =" forces the "m" to be in list context. Assigning a list with N elements to a list of zero variables doesn't actually do anything. But then when this assignment expression is used in a scalar context ($ft_count = ...), the right "=" operator returns the number of elements from its right-hand side - exactly what you want.

This is incredibly weird when first encountered, but the "=()=" idiom is a useful Perl trick to know, for "evaluate in list context, then get size of list".
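
To make the context difference concrete, here is a small sketch using the string from the question. The variable names $scalar, @matches and $count are illustrative only.

```perl
use strict;
use warnings;

my $str = "GGGFFEEIIEETTGGG";

# Scalar context: m//g only reports whether the next match succeeded.
my $scalar = $str =~ m/[FT]/g;
pos($str) = undef;   # reset the /g search position left by the scalar match

# List context: m//g returns every match.
my @matches = $str =~ m/[FT]/g;

# "() =" forces list context; the list assignment, evaluated in scalar
# context, yields the number of elements on its right-hand side.
my $count = () = $str =~ m/[FT]/g;

print "scalar: $scalar\n";     # 1
print "list:   @matches\n";    # F F T T
print "count:  $count\n";      # 4
```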

Note: I have no data on which of these is more efficient when dealing with large strings. In fact, I suspect your original code might be best in that case.

aschepler
8

Yes, you can use the CountOf secret operator:

my $ft_count = ()= $str =~ m/[FT]/g;
Chas. Owens
0

You can combine lines 2, 3 and 4 into one like so:

my $str = "GGGFFEEIIEETTGGG";
print $str =~ s/[FT]//g; # Output: 4 (note: this also deletes the F's and T's from $str)
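
Because s///g edits its target, a variant that keeps the original string could run the substitution on a copy. This is only a sketch; $copy and $ft_count are names chosen here for illustration.

```perl
use strict;
use warnings;

my $str = "GGGFFEEIIEETTGGG";

# s///g returns the number of substitutions made. Applying it to a copy
# gives the count while leaving $str as it was.
my $ft_count = (my $copy = $str) =~ s/[FT]//g;

print "$ft_count\n";   # 4
print "$copy\n";       # GGGEEIIEEGGG
print "$str\n";        # GGGFFEEIIEETTGGG
```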
Mike
  • 2
    Being a comment on another answer, this would be better as a comment than an answer :) – ysth Sep 28 '10 at 15:45
  • @ysth, thanks for the comment. I didn't realize my answer is actually a comment on another answer; is it? The OP asks [Is there a more concise way to get the count (in other words, to combine line 2 and 3)] and here's my answer to the question. Had someone already mentioned what I suggested? – Mike Sep 29 '10 at 01:32
  • @ysth, the original post is a possible duplicate of this [http://stackoverflow.com/questions/1849329/is-there-a-perl-shortcut-to-count-the-number-of-matches-in-a-string/1850686#1850686] and I posted a similar solution to that question. I think this post can be combined with that one. – Mike Sep 29 '10 at 01:49