
I have an array that contains values like this:

    @array = (
        "2014 Computer Monitor 200",
        "2010 Keyboard 30",
        "2012 Keyboard 80",
        "2011 Study Desk 100",
    );

How would I use regular expressions in Perl to sort the entire array by year, item name, or price? For example, if the user wants to sort by price, they type 'price' and the array is sorted like this:

    2010 Keyboard 30
    2012 Keyboard 80
    2011 Study Desk 100
    2014 Computer Monitor 200

So far I've been able to sort by year like this:

    @array = 
    ("2014 Computer Monitor 200",
    "2010 Keyboard 30",
    "2012 Keyboard 80",
    "2011 Study Desk 100");
    
    $input = <STDIN>;
    
    chomp($input);
    if ($input eq "year")
    {
        foreach $item (sort {$a cmp $b} @array)
        {
        print $item . "\n";
        }
    }
toolic

2 Answers


The pattern `/(\d+) \s+ (.+) \s+ (\S+)/x` matches the year, name, and price:

use strict;
use warnings;

my $order = "price";
my @array = (
  "2014 Computer Monitor 200",
  "2010 Keyboard 30",
  "2012 Keyboard 80",
  "2011 Study Desk 100"
);

my %sort_by = (
  year  => sub { $a->{year}  <=> $b->{year} },
  price => sub { $a->{price} <=> $b->{price} },
  name  => sub { $a->{name}  cmp $b->{name} },
);
@array = sort {

  local ($a, $b) = map {
    my %h; 
    @h{qw(year name price)} = /(\d+) \s+ (.+) \s+ (\S+)/x;
    \%h;
  } ($a, $b);
  $sort_by{$order}->();

} @array;

# Alternative: Schwartzian transform
# @array =
#  map { $_->{line} }
#  sort { $sort_by{$order}->() }
#  map { 
#    my %h = (line => $_); 
#    @h{qw(year name price)} = /(\d+) \s+ (.+) \s+ (\S+)/x;
#    $h{name} ? \%h : ();
#  } @array;

use Data::Dumper; print Dumper \@array;

Output:

$VAR1 = [
      '2010 Keyboard 30',
      '2012 Keyboard 80',
      '2011 Study Desk 100',
      '2014 Computer Monitor 200'
    ];
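
To see what the pattern captures on its own, here is a minimal sketch that parses a single record. The `parse_record` helper is a hypothetical name used only for illustration; it is not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: split one record into its year, name, and price.
sub parse_record {
    my ($line) = @_;
    # Under /x, literal whitespace in the pattern is ignored, so \s+ is
    # what actually matches the spaces in the data. The greedy (.+) first
    # grabs everything, then backtracks until \s+ (\S+) can match the
    # last field, which is how multi-word names stay in one piece.
    my ($year, $name, $price) = $line =~ /(\d+) \s+ (.+) \s+ (\S+)/x
        or return;
    return { year => $year, name => $name, price => $price };
}

my $rec = parse_record("2014 Computer Monitor 200");
print "$rec->{year} | $rec->{name} | $rec->{price}\n";
# prints: 2014 | Computer Monitor | 200
```

The comparison subs in `%sort_by` then only need to look at one field of two such hashes.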
mpapec
  • Your answer isn't much help to a Perl newbie because there's no explanation of what it does. They will copy and paste your code but won't learn anything from it. – i alarmed alien Sep 27 '14 at 08:00
  • @ialarmedalien I hope there will be additional questions so I don't have to explain things which OP is already familiar with. – mpapec Sep 27 '14 at 08:16
  • I admire your optimism, but I would guess that a lot of newbs don't look for further explanation if they don't understand what the code is doing--they'll copy it and then come back with similar questions. Schwartzian transforms are way beyond the average newbie. – i alarmed alien Sep 27 '14 at 08:36
  • Thank you for your solution. I'll try to understand each line of code. – Programmer Student Sep 27 '14 at 14:34
  • Is there an alternative way to solve this without the use of Schwartzian transforms? – Programmer Student Sep 27 '14 at 15:10
  • @ProgrammingStudent Yes -- you will need to put your data into a more complex data structure that allows you to identify which part of, say, `2010 Keyboard 30` is the year, which the name, and which the price. Hashes are good for this kind of information storage and retrieval, but you will have to work out how to store a hash holding the information for each line. [perldsc](http://perldoc.perl.org/perldsc.html) may be useful here... – i alarmed alien Sep 27 '14 at 16:52
  • I have been looking at the script above and trying to work out how it works. I think it works as below. Can anyone confirm whether I'm on the right lines? I think I may have to repost this old question to get a reply, but I'll try here first. – John D Jun 03 '21 at 09:36
  • @mpapec This is how I thought it worked. I deleted it from this post – John D Jun 08 '21 at 21:15
  • I think the rest of the script works like this: • The initial sort on line 14 takes in items from @array two at a time, one in $a and one in $b • The map function then takes items $a and $b and maps each to a hash - each item becomes a hash with keys 'year', 'price', and 'name'. This is based on the regex /(\d+) \s+ (.+) \s+ (\S+)/x – John D Jun 08 '21 at 21:16
  • Map returns the two hashes, as references, to local variables $a and $b • I think it is necessary to use local $a and $b, otherwise sort will use the default $a and $b taken in at the start of the sort on line 17? – John D Jun 08 '21 at 21:16
  • The 'price' sort function is stored as a coderef in the %sort_by hash • This is called at line 26 by the code $sort_by{$order}->() on the local versions of $a and $b – John D Jun 08 '21 at 21:17

Using a sort without a transform:

use strict;
use warnings;

my @array = ( "2014 Computer Monitor 200", "2010 Keyboard 30", "2012 Keyboard 80", "2011 Study Desk 100" );

my $order = "price";

my @sorted = sort {
    local ( $a, $b ) = map { /^(?<year>\d+) \s+ (?<name>.*) \s (?<price>\d+)/x ? {%+} : die "Can't parse $_" } ( $a, $b );
    ($order ne 'name' ? $a->{$order} <=> $b->{$order} : 0) || $a->{name} cmp $b->{name}
} @array;

print "$_\n" for @sorted;

Outputs:

2010 Keyboard 30
2012 Keyboard 80
2011 Study Desk 100
2014 Computer Monitor 200

Note: If one is sorting more than 10k items, efficiency might become a concern, and one can use a Schwartzian transform instead: https://en.wikipedia.org/wiki/Schwartzian_transform
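
As a rough sketch of such a transform (not the code from either answer): each line is parsed once into a hash, the hashes are sorted, and the original lines are read back out. The `sort_records` name and the `%compare` table are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: sort records by 'year', 'name', or 'price',
# parsing each line once instead of on every comparison.
sub sort_records {
    my ($order, @lines) = @_;

    # One comparison sub per sort key; each receives two parsed records.
    my %compare = (
        year  => sub { $_[0]{year}  <=> $_[1]{year}  },
        price => sub { $_[0]{price} <=> $_[1]{price} },
        name  => sub { $_[0]{name}  cmp $_[1]{name}  },
    );
    my $cmp = $compare{$order} or die "Unknown sort order '$order'";

    # Decorate: keep the original line next to its parsed fields.
    my @decorated = map {
        my %rec = (line => $_);
        @rec{qw(year name price)} = /^(\d+) \s+ (.+) \s+ (\d+)$/x
            or die "Can't parse '$_'";
        \%rec;
    } @lines;

    # Sort on the cached fields, then undecorate back to plain lines.
    return map { $_->{line} } sort { $cmp->($a, $b) } @decorated;
}

my @array = (
    "2014 Computer Monitor 200",
    "2010 Keyboard 30",
    "2012 Keyboard 80",
    "2011 Study Desk 100",
);
print "$_\n" for sort_records('price', @array);
```

With the sample data, the `'price'` call prints the records in the price order shown in the output above. The regex runs once per line here, rather than twice per comparison.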

Miller
  • If possible, can you please explain what these 2 lines are doing? `my @sorted = sort { local ( $a, $b ) = map { /^(?<year>\d+) \s+ (?<name>.*) \s (?<price>\d+)/x ? {%+} : die "Can't parse $_" } ( $a, $b ); ($order ne 'name' ? $a->{$order} <=> $b->{$order} : 0) || $a->{name} cmp $b->{name} } @array;` – Programmer Student Oct 01 '14 at 12:31
  • Uses [Named Backreferences](http://perldoc.perl.org/perlretut.html#Named-backreferences) to translate the values into a hash of its parts. Then compares based off the parts. – Miller Oct 03 '14 at 23:33
  • You absolutely can sort this without a transform, but the idea of the transform is to avoid doing the same work over and over again. If the array has many items, this quickly slows down. – brian d foy Jun 03 '21 at 15:13
  • @briandfoy Yes, if efficiency is of primary concern (i.e. if one is sorting more than 10k items), then adding the complexity of a transform is advised. I've added a message at the end to recommend such. However, in a case like this, I feel demonstrating the simpler construct of basic sorting is the better teaching approach. – Miller Jun 05 '21 at 18:34
  • In a lot of cases, a cached key sort starts to outperform simple sorts in arrays with only tens of items. In my Learning Perl classes, I have students sort files by modification time (using `-M`), and it's common to see that it's around 40 files where a cached key starts to do better. – brian d foy Aug 24 '21 at 05:25
  • @briandfoy Yes, if the comparison relies on IO or a file operation, then it certainly can become slower at much smaller scales. However, one of the hardest things to teach is how readability and maintainability are more important than premature optimizations. Yes, algorithms are fun. And STransforms can be fun. But premature optimizations can easily lead to bugs and less maintainable code. So I personally would not choose to teach that to beginners without such a warning. – Miller Aug 26 '21 at 05:59
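
For readers with the same question about those two lines, the steps can be pulled apart into a small sketch. The `parse_named` and `compare_records` helpers are hypothetical names introduced here for illustration; the regex and the comparison logic are taken from the answer. Named captures and `%+` require Perl 5.10 or later.

```perl
use strict;
use warnings;

# Step 1: named captures fill the special hash %+, keyed by capture name.
sub parse_named {
    my ($line) = @_;
    $line =~ /^(?<year>\d+) \s+ (?<name>.*) \s (?<price>\d+)/x
        or die "Can't parse $line";
    my %fields = %+;    # copy now; the next successful match resets %+
    return \%fields;
}

# Step 2: compare numerically on year or price; if that comparison is
# a tie (or the order is 'name'), fall back to comparing names as strings.
sub compare_records {
    my ($order, $x, $y) = @_;
    my $primary = $order ne 'name' ? $x->{$order} <=> $y->{$order} : 0;
    return $primary || ($x->{name} cmp $y->{name});
}

my $kbd  = parse_named("2010 Keyboard 30");
my $desk = parse_named("2011 Study Desk 100");
print compare_records('price', $kbd, $desk), "\n";    # prints: -1
```

Inside the answer's sort block, `local ($a, $b)` temporarily replaces the two lines being compared with hashes like these, so the comparison expression can read `$a->{price}` and so on.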