One way: use the /r modifier
print for sort {
    foo( $a =~ s/^STRING//r ) cmp foo( $b =~ s/^STRING//r )
} @ary;
with which the substitution operator s/// returns the modified string and leaves the original unchanged. If there is no match, the original string is returned, which seems to fit the purpose. See the end for an optimization if this is used on large arrays or if the function call takes time.
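As a self-contained illustration of the /r approach, here is a minimal sketch. The body of foo and the sample data are assumptions made up for the demonstration (the real foo is not shown here), and /r needs Perl 5.14 or newer.

```perl
use strict;
use warnings;

# Hypothetical stand-in for foo(); here it only lowercases its argument.
sub foo { return lc $_[0] }

# Made-up sample data.
my @ary = ('STRINGpear', 'STRINGApple', 'fig');

# s///r returns a modified copy, so $a and $b (aliases of the elements)
# are never written to.
my @sorted = sort {
    foo( $a =~ s/^STRING//r ) cmp foo( $b =~ s/^STRING//r )
} @ary;

print "sorted:   @sorted\n";
print "original: @ary\n";
```

With this data the sorted order should be STRINGApple, fig, STRINGpear, while @ary keeps its original contents and order.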
Another way would be to change the substitution to match-and-capture, when feasible.
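A possible shape for the match-and-capture variant is sketched below. The helper key_of, the stand-in foo, and the sample data are all hypothetical; the point is only that a plain match never writes to $a or $b.

```perl
use strict;
use warnings;

# Hypothetical stand-in for foo().
sub foo { return lc $_[0] }

# Hypothetical helper: capture everything after an optional leading STRING.
# The pattern always matches, so $key is always defined.
sub key_of {
    my ($key) = $_[0] =~ /^(?:STRING)?(.*)/s;
    return $key;
}

my @ary = ('STRINGpear', 'STRINGApple', 'fig');

# Matching only reads $a and $b, so the array elements stay intact.
my @sorted = sort { foo( key_of($a) ) cmp foo( key_of($b) ) } @ary;

print "sorted:   @sorted\n";
print "original: @ary\n";
```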
There is an extensive discussion of this in the documentation for sort. First, $a and $b are (package) globals:
"The $a and $b are set as package globals in the package the sort() is called from"
The block { } stands for an anonymous subroutine, and
"... the elements to be compared are passed into the subroutine as the package global variables $a and $b"
so the aliasing implies that changing them affects the elements. Thus
"The values to be compared are always passed by reference and should not be modified."
which is exactly what happens when $a and $b get changed, so the elements change.
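The aliasing can be seen directly with a small experiment. This is a deliberately wrong sort block, written only to show the side effect; the two-element array is made-up sample data.

```perl
use strict;
use warnings;

my @ary = ('STRINGb', 'STRINGa');

# Deliberately wrong: s/// without /r writes to $a and $b,
# which are aliases of the elements of @ary.
my @sorted = sort {
    $a =~ s/^STRING//;
    $b =~ s/^STRING//;
    $a cmp $b
} @ary;

# The prefix has been stripped from the original array as a side effect.
print "original after sort: @ary\n";
```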
In the second case you copy $a and $b into (what should be!) lexicals $x and $y, and the connection with @ary is broken, so the elements aren't changed.
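A minimal sketch of that copying variant, with $x and $y declared as lexicals, could look like this (the sample data is again made up):

```perl
use strict;
use warnings;

my @ary = ('STRINGb', 'STRINGa');

my @sorted = sort {
    # Copies, not aliases: changes to $x and $y never reach @ary.
    my ($x, $y) = ($a, $b);
    $x =~ s/^STRING//;
    $y =~ s/^STRING//;
    $x cmp $y
} @ary;

print "sorted:   @sorted\n";
print "original: @ary\n";
```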
Please always have use warnings; and use strict; at the beginning of the program. This is an excellent, if extreme, example -- it could matter a lot whether the variables you introduce to try things out are global ($x) or lexical (my $x).
Code that processes elements to use the resulting value for sort comparison has an efficiency flaw. The comparison is done on two elements at a time, so elements get processed multiple times. And each time we do the same processing and compute the same value for an element.
This inefficiency is notable only for large enough data sets and most of the time one need not worry. But in this case a regex engine runs and there is also a function call involved, and those aren't exactly cheap in Perl. Also, the call's overhead is unspecified and I imagine that there is some work involved.
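The repeated work is easy to observe by counting calls. In this sketch foo is a hypothetical stand-in that increments a counter, and the 26-element input is made up; the exact count depends on the data and on the sort implementation, so none is claimed here.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for foo() that also counts how often it runs.
sub foo { $calls++; return lc $_[0] }

# 26 made-up elements, in reverse order.
my @ary = map { "STRING$_" } reverse 'a' .. 'z';

my @sorted = sort {
    foo( $a =~ s/^STRING//r ) cmp foo( $b =~ s/^STRING//r )
} @ary;

# Every comparison calls foo twice, so the total exceeds the element count.
printf "%d elements, %d calls to foo\n", scalar @ary, $calls;
```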
To optimize this, one can pre-process the input and then sort:
my @sorted_ary =
map { $_->[1] }
sort { $a->[0] cmp $b->[0] }
map { [ foo( s/^STRING//r ), $_ ] }
@ary;
The map that takes the input @ary applies the regex and the function call and stores that result, along with the original element, in a two-element arrayref, for each element of @ary. This list of arrayrefs is then sorted, using the first arrayref element for comparison. The last map extracts the second element from the sorted arrayrefs, thus returning the original items sorted as needed.
This is called the Schwartzian transform. See "Examples" in sort, for example.
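For completeness, here is the same pipeline as a runnable sketch. The counting foo and the sample data are assumptions for illustration; the counter is there only to show that the key is computed once per element.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for foo() that counts its invocations.
sub foo { $calls++; return lc $_[0] }

my @ary = ('STRINGpear', 'STRINGApple', 'fig', 'STRINGbanana');

my @sorted_ary =
    map  { $_->[1] }                        # 3. unwrap the original element
    sort { $a->[0] cmp $b->[0] }            # 2. compare precomputed keys only
    map  { [ foo( s/^STRING//r ), $_ ] }    # 1. pair [ key, original ]
    @ary;

print "sorted: @sorted_ary\n";
printf "%d elements, %d calls to foo\n", scalar @ary, $calls;
```

Here foo runs exactly once per element, and @ary itself is left untouched because of the /r modifier.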
One should keep in mind that the gains are notable only for large enough data, while this maneuver comes with an overhead of its own (and with far more complex code). So consider using it only when there is a demonstrable problem of this kind with sorting.
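One way to check whether there is a demonstrable problem is to time both versions with the core Benchmark module. The following is only a sketch: foo, the data size, and the random test strings are assumptions, and the numbers will vary by machine and by how expensive the real foo is.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-in for foo(); substitute the real function when measuring.
sub foo { return lc $_[0] }

# Made-up input: 5000 random eight-letter words with the STRING prefix.
my @ary;
for (1 .. 5_000) {
    my $word = join '', map { ('a' .. 'z')[ rand 26 ] } 1 .. 8;
    push @ary, "STRING$word";
}

# Key recomputed in every comparison.
sub naive {
    return sort { foo( $a =~ s/^STRING//r ) cmp foo( $b =~ s/^STRING//r ) } @ary;
}

# Key computed once per element.
sub transform {
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ foo( s/^STRING//r ), $_ ] }
           @ary;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    naive     => sub { my @s = naive() },
    transform => sub { my @s = transform() },
} );
```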