Please be careful with the switch statement, which is highly experimental.
As previously mentioned, the "switch" feature is considered highly experimental; it is subject to change with little notice. In particular, when has tricky behaviours that are expected to change to become less tricky in the future. Do not rely upon its current (mis)implementation. Before Perl 5.18, given also had tricky behaviours that you should still beware of if your code must run on older versions of Perl.
These are tricky and will change.
Having said that, one way to count words in a string is to split it first:
use warnings;
use strict;
use feature 'switch';

my $file = '...';
open my $fh, '<', $file or die "Can't open $file: $!";

while (my $line = <$fh>)
{
    chomp $line;
    my @words = split ' ', $line;    # split on whitespace
    my $num_words = @words;          # array in scalar context: element count

    given ($num_words) {
        when ($num_words > 2) {
            # ...
        }
    }
}
close $fh;
which uses the fact that a scalar ($num_words), when assigned an array (@words), receives the number of elements of the array. See Context in perldata:
Assignment is a little bit special in that it uses its left argument to determine the context for the right argument. Assignment to a scalar evaluates the right-hand side in scalar context, [...]
and an array evaluated in scalar context yields the number of its elements.
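As a small illustration of the quoted rule, here is a minimal sketch using a made-up sample string (the data is an assumption, not from the original question). It contrasts scalar assignment with list assignment of the same array.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only
my @words = split ' ', 'the quick brown fox';

my $num_words = @words;    # scalar assignment: array yields its element count
my ($first)   = @words;    # list assignment: first element is assigned

print "$num_words\n";      # 4
print "$first\n";          # the
```

The parentheses around $first are what turn the second assignment into a list assignment.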
Here we can skip the array altogether†
my $num_words = split ' ', $line;
So to get the count without creating an array variable we need to assign directly to a scalar. But that isn't always going to yield the length of the list: putting the right-hand side in scalar context -- by assignment to a scalar -- may affect how it operates and what it returns.
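To see how scalar context can change what an expression returns, here is a minimal sketch using the builtin reverse, chosen only as a convenient example (the sample data is made up). In list context it reverses the order of the elements; in scalar context it concatenates them and reverses the characters.

```perl
use strict;
use warnings;

my @items = ('ab', 'cd');            # hypothetical sample data

my @in_list   = reverse @items;      # list context: elements in reverse order
my $in_scalar = reverse @items;      # scalar context: joined string, characters reversed

print "@in_list\n";      # cd ab
print "$in_scalar\n";    # dcba
```

Neither result is the number of elements, so a scalar assignment cannot be assumed to produce a count.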
There are workarounds though. For example‡
my $num_words = () = $line =~ /\w+/g;
where the "operator" = () = is a play on context, or
my $num_words = @{ [ $line =~ /\w+/g ] };
where the [] builds an anonymous array from the list inside (which is evaluated in list context) and returns a reference to it, which is then dereferenced by @{ }. The result is an array regardless of the outer context, so it can be assigned to a scalar, whereby such scalar assignment returns the number of elements in it.§
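The three ways of counting can be put side by side. This is only a sketch on a made-up line; the variable names are illustrative.

```perl
use strict;
use warnings;

my $line = '  one two  three ';                 # hypothetical sample line

my $by_split  = split ' ', $line;               # split in scalar context
my $by_assign = () = $line =~ /\w+/g;           # list assignment evaluated in scalar context
my $by_anon   = @{ [ $line =~ /\w+/g ] };       # anonymous array, dereferenced

print "$by_split $by_assign $by_anon\n";        # 3 3 3
```

The counts agree here only because the sample has no punctuation: split ' ' counts whitespace-separated fields while /\w+/g counts runs of word characters, so a token like "don't" would be one field but two matches.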
See this page for a wealth of information about lists, arrays, scalars, and context.
† This can be done more compactly as
while (<$fh>) {
    chomp;
    my $num_words = split;
    # ...
}
The default for while, chomp, and split is the $_ variable. The split also needs a pattern and the default is ' ', so the above is the same as split ' ', $_. The pattern ' ' is special for split and matches any amount of any whitespace, also discarding leading and trailing space.
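The difference between the special ' ' pattern and a regex matching one literal space can be checked with a short sketch (the sample line is made up for illustration):

```perl
use strict;
use warnings;

my $line = "  one two\tthree \n";    # hypothetical sample line

my @special = split ' ', $line;      # 'one', 'two', 'three'
my @literal = split / /, $line;      # '', '', 'one', "two\tthree", "\n"

print scalar(@special), "\n";        # 3
print scalar(@literal), "\n";        # 5
```

With / / the leading empty fields are kept and the tab is not treated as a separator, which is why ' ' is the usual choice for word counting.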
Note that once we assign to a variable inside the while condition (like $line in the main text) then the deal with $_ is off -- it is not set by the read, so it is undef unless something else has assigned to it. So it is either our variable or $_. A reasonable rule of thumb is that if you end up using $_ more than once or twice then there should be a proper variable. And if ever in doubt, introduce a nice variable.
‡ The regex match operator returns the actual matches when in list context but only true/false when in scalar context. (And in scalar context /g finds just one match per evaluation, so it doesn't make sense for counting in a single assignment.)
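A minimal sketch of the two contexts for a match, again on a made-up line:

```perl
use strict;
use warnings;

my $line = 'one two three';          # hypothetical sample line

my @matches = $line =~ /\w+/g;       # list context: all matches are returned
my $matched = $line =~ /\w+/;        # scalar context: only true/false

print scalar(@matches), "\n";                   # 3
print $matched ? "matched\n" : "no match\n";    # matched
```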
§ Another example is split, which returns the size of the list in scalar context.