I want to sort the array references stored in a hash %results (time strings, from oldest to newest). The hash has multiple keys, but I'm posting just one to show what it looks like:

'Ende Monatswechsel P-Konten' => [
                                         '17.02.2018 05:17:39',
                                         '14.02.2018 04:28:11',
                                         '23.02.2018 03:17:17',
                                         '22.02.2018 03:39:20',
                                  ]

I am expecting:

    'Ende Monatswechsel P-Konten' => [
                                         '14.02.2018 04:28:11',
                                         '17.02.2018 05:17:39',
                                         '22.02.2018 03:39:20',
                                         '23.02.2018 03:17:17',
                                  ]

Does anyone know how to do this? I tried:

my $columns = map [ $_, sort{$a <=> $b} @{ $results{$_} } ], keys %results;

but it doesn't work. Thanks in advance.

My code looks like this:

while(my $line=<F>) {
    #- Info: 19.02.2018 00:01:01 --- Start Tageswechsel-CoBa ---
    #- Info: 27.11.2018 04:16:42 --- Ende Tageswechsel-CoBa ---
            if ($line=~ /(\d\d\.\d\d\.\d\d\d\d \d\d:\d\d:\d\d) --- (.+? Tageswechsel-CoBa) -.*\s*$/)
            {
                    ($timestamp, $action) = ($1,$2);
            }
            if ( !defined $filter{$action}{$timestamp} ) {
                    push @{$results{$action}}, $timestamp;
                    $filter{$action}{$timestamp} = 1;
            }
}

print Dumper(\%results) outputs:

'Start Tageswechsel-CoBa' => [
                                '17.02.2018 05:12:13',
                                '20.02.2018 04:23:16',
                                '22.02.2018 03:12:46',
                                '23.02.2018 03:34:28',
                                '27.02.2018 03:41:25',
                                '02.03.2018 03:32:26',
            ],
'Ende Tageswechsel-CoBa' => [
                                    '17.02.2018 05:20:01',
                                    '19.02.2018 06:01:02',
                                    '20.02.2018 04:29:44',
                                    '22.02.2018 03:19:04',
                                    '23.02.2018 03:40:52',
                                    '26.02.2018 06:01:26',
            ]
            };
Dave Cross
Unsal
  • Define what you mean by "it doesn't work": how doesn't it work? What output do you get, and what output are you expecting? – Chris Turner Nov 26 '18 at 13:10
  • I expect: `'Ende Monatswechsel P-Konten' => [ '14.02.2018 04:28:11', '17.02.2018 05:17:39', '22.02.2018 03:39:20', '23.02.2018 03:17:17', ]` – Unsal Nov 26 '18 at 13:19
  • The usual approach is to split up the string and compare the parts. The better approach is to force your upstream system to output the date as `YYYY-mm-dd HH:MM:SS`. Then you can use string comparisons. – Corion Nov 26 '18 at 13:26
  • @Unsal: Ok, now I'm completely confused. Your original post strongly implied that you wanted to sort an array that was stored in a hash. So that's what my code did. Now you've posted code that builds up the array inside the hash (and, as far as I can see, builds it in the correct order). So, I really don't know what you're asking. – Dave Cross Nov 27 '18 at 15:30
  • @Dave: sorry for not having made it precise enough. Yes it is an arrayref and what you see is just a piece from the output. From first glance it looks like it is sorted already but I have like >100 values where dates are unsorted in that arrayref. – Unsal Nov 27 '18 at 20:32
  • @Unsal: Ok, the easiest option is probably to create your hash of arrays in the same way as you currently do and then go and sort the arrays afterwards. The code in my answer will do that (see the version at the end of my answer, which tries to use your data structures). – Dave Cross Nov 27 '18 at 22:23

3 Answers

Something like this would work:

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my $data = [
  '17.02.2018 05:17:39',
  '14.02.2018 04:28:11',
  '23.02.2018 03:17:17',
  '22.02.2018 03:39:20',
];

my @sorted = sort {
  my @a = split /[\. ]/, $a;
  my @b = split /[\. ]/, $b;
  return (
    $a[2] <=> $b[2] or  # year
    $a[1] <=> $b[1] or  # month
    $a[0] <=> $b[0] or  # day of month
    $a[3] cmp $b[3]     # time
  );
} @$data;

say Dumper @sorted;

I'm splitting each value into chunks and then comparing them from the largest chunk (year) to the smallest (time). Note that because the time is a string, not a number, I use cmp instead of <=>.

This is slightly inefficient, as I'm re-splitting each data item several times. If that's a problem, then you could look at something like a Schwartzian Transform.
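
As a rough sketch of what a Schwartzian Transform could look like for this data (not code from the original answer): each string is split once to build a sortable key, the keys are compared, and the original strings are recovered afterwards. The key layout `YYYYMMDD HH:MM:SS` is an assumption chosen for illustration, and it relies on every field being zero-padded.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

my @data = (
  '17.02.2018 05:17:39',
  '14.02.2018 04:28:11',
  '23.02.2018 03:17:17',
  '22.02.2018 03:39:20',
);

# Schwartzian Transform: decorate, sort, undecorate (read bottom to top).
my @sorted =
  map  { $_->[1] }                  # 3. keep only the original string
  sort { $a->[0] cmp $b->[0] }      # 2. compare the precomputed keys
  map  {                            # 1. build the key once per element
    my ($day, $month, $year, $time) = split /[. ]/;
    [ "$year$month$day $time", $_ ];
  } @data;

say for @sorted;
```

With this shape the split runs once per element instead of twice per comparison, which only starts to matter for long lists.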

But the best solution to this would be to get a sortable timestamp in the first place. If your dates were YYYY.MM.DD HH:MM:SS, then you could just do a simple string sort.
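
To illustrate that last point, here is a minimal sketch that assumes the upstream system already writes the timestamps year-first. The array `@year_first` is hypothetical sample data (the question's values rewritten by hand), not something the original code produces.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical input: the same timestamps, already in YYYY.MM.DD HH:MM:SS form.
my @year_first = (
  '2018.02.17 05:17:39',
  '2018.02.14 04:28:11',
  '2018.02.23 03:17:17',
  '2018.02.22 03:39:20',
);

# Fixed-width, zero-padded strings with the most significant field first
# sort chronologically under Perl's default string comparison.
my @sorted = sort @year_first;

say for @sorted;
```

No custom comparison block is needed here, because string order and chronological order coincide for this format.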

Update: My output is

$ perl sortdate
$VAR1 = '14.02.2018 04:28:11';
$VAR2 = '17.02.2018 05:17:39';
$VAR3 = '22.02.2018 03:39:20';
$VAR4 = '23.02.2018 03:17:17';

Update 2: I've edited my code to make it more like your example. Hope this helps.

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my %results = (
  'Ende Monatswechsel P-Konten' => [
    '17.02.2018 05:17:39',
    '14.02.2018 04:28:11',
    '23.02.2018 03:17:17',
    '22.02.2018 03:39:20',
  ]
);

foreach my $k (keys %results) {
  my @sorted = sort {
    my @a = split /[\. ]/, $a;
    my @b = split /[\. ]/, $b;
    return (
      $a[2] <=> $b[2] or  # year
      $a[1] <=> $b[1] or  # month
      $a[0] <=> $b[0] or  # day of month
      $a[3] cmp $b[3]     # time (a string, so cmp rather than <=>)
    );
  } @{ $results{$k} };

  $results{$k} = \@sorted;
}

say Dumper \%results;

And the output...

$VAR1 = {
          'Ende Monatswechsel P-Konten' => [
                                             '14.02.2018 04:28:11',
                                             '17.02.2018 05:17:39',
                                             '22.02.2018 03:39:20',
                                             '23.02.2018 03:17:17'
                                           ]
        };
Dave Cross
  • Hi Dave, thanks for the answer. I am not sure how a Schwartzian transform should work here, because I have fixed-length date strings. – Unsal Nov 27 '18 at 09:30
  • Btw, your suggestion did not work; it returns: `$VAR1 = [ '17.02.2018 05:17:39', '23.02.2018 03:17:17', '14.02.2018 04:28:11', '22.02.2018 03:39:20' ];` – Unsal Nov 27 '18 at 10:37
  • @Unsal: Doesn't look like you're running my actual code. My code returns an array, but you're dumping an array reference. – Dave Cross Nov 27 '18 at 10:55
  • @Unsal: My code takes an array reference (`$data`) as input and returns an array (`@sorted`). Edit your question to add the current state of your code and I'll take a look. – Dave Cross Nov 27 '18 at 13:24
  • In my code I push my timestamps into a hash (%results) like this: `push @{$results{$action}}, $timestamp;` but I am not able to use your code with my arrayref. – Unsal Nov 27 '18 at 14:02
  • @Unsal: As I said before "Edit your question to add the current state of your code and I'll take a look". Exchanging brief snippets in comments really isn't helpful. – Dave Cross Nov 27 '18 at 14:24
  • I have edited my question and put the code snippet in there. – Unsal Nov 27 '18 at 15:20

Splitting the strings and comparing the parts is appropriate for sorting many types of "multipart" values. However, since you are dealing with datetimes, you can use the core module Time::Piece to turn the strings into datetime objects, which can be compared using the <=> operator.

Time::Piece provides the strptime method, which parses a date string into a Time::Piece object using a format string. Time::Piece objects can be compared using numerical comparison operators.

use v5.10;
use strict;
use warnings;
use Time::Piece;

my @vals = (
    '17.02.2018 05:17:39',
    '14.02.2018 04:28:11',
    '23.02.2018 03:17:17',
    '22.02.2018 03:39:20',
);

say for sort {dt($a) <=> dt($b)} @vals;

###

sub dt {
    my $str = shift;
    # %d matches the zero-padded day of month used in these strings
    return Time::Piece->strptime($str, '%d.%m.%Y %H:%M:%S');
}
beasy
  • If it's really just a few elements this is great but for lists of any greater length I'd suggest to actually use the Schwartzian transform, since `Time::Piece` isn't that cheap. It's still going to be expensive but incomparably less so. – zdim Nov 28 '18 at 07:06
  • Fair point. Personally I would choose brevity and readability over optimization unless speed is critical. – beasy Nov 28 '18 at 13:22
  • Agreed. I just wanted to note that `Time::Piece` is a little expensive. (It came as a surprise to me; once I had to drop it and move to manual parsing as it was adding too much overhead. But that code had _a lot_ to process.) – zdim Nov 28 '18 at 18:49
  • I am surprised too because it's a core module, I assumed it was a fast C implementation, but thanks for the insight. Then again, a feature of the Schwartzian transform is efficiency (due to comparing only what it needs to before short-circuiting), so perhaps it's not that surprising. – beasy Nov 28 '18 at 19:50
  • Sorry, I may have not stated that clearly: I found the module to be too slow (for long, long lists in time-hungry code), without the transform; there was no sorting, just `Time::Piece` was taking longer than I expected. (Thus I mentioned the transform here, since with sorting `TP->strptime` runs for _every_ comparison; with the transform it runs once for each list element and the cached objects are then just used in comparisons.) – zdim Nov 28 '18 at 20:14
  • To add: I am pretty sure that `Time::Piece`'s code is fast, as you say -- the problem is that `strptime` runs a constructor; so a full object is built every time. I think that _that_ is what kills the efficiency, and what made it unusable for me in long loops. (The fact that `strptime` is a _class method_ also adds to trouble, since in the method resolution process first all instance methods are searched and only then class methods.) I couldn't find a way to "set" an existing object without running `strptime` (so to avoid the constructor). – zdim Nov 30 '18 at 19:04
  • Why can't you do `$t = Time::Piece->new; $t->strptime($_,"%Y-%m-%d") for @dates`? – beasy Nov 30 '18 at 22:22
  • Indeed, can do that (so it's a dual-purpose method) ... can't recall how exactly it went; perhaps it was just still slow and I'm wrong with blaming the constructor now, or perhaps I didn't think of trying this (since docs only mention class-method invocation). Will look/test again, for when it comes up the next time. Thanks! – zdim Nov 30 '18 at 22:44
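
The caching idea from the comments above could be sketched roughly like this (an illustration, not code posted by either commenter): `strptime` runs once per element, and the sort then compares plain epoch numbers. Storing the epoch rather than the Time::Piece object is an assumption made here to keep the comparison cheap.

```perl
use strict;
use warnings;
use feature 'say';
use Time::Piece;

my @vals = (
    '17.02.2018 05:17:39',
    '14.02.2018 04:28:11',
    '23.02.2018 03:17:17',
    '22.02.2018 03:39:20',
);

# Schwartzian transform: parse each string once, sort on the cached epoch,
# then unwrap the original strings (read bottom to top).
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] }
    map  { [ Time::Piece->strptime($_, '%d.%m.%Y %H:%M:%S')->epoch, $_ ] }
    @vals;

say for @sorted;
```

For N elements this parses N strings, whereas parsing inside the sort block parses two strings for every comparison.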

I ended up using Dave's approach (since I don't have the Time::Piece module installed), slightly modified. It works now, though I'm not sure about the efficiency:

my @array;
my @sorted;
my %aref_n;

for my $key ( keys %results ) {
    for my $i (0..$#{ $results{$key} }) {
            push @array, $results{$key}[$i];
    }

    @sorted = sort {
            my @a = split /[\. ]/, $a;
            my @b = split /[\. ]/, $b;
            return (
                    $a[2] <=> $b[2] or
                    $a[1] <=> $b[1] or
                    $a[0] <=> $b[0] or
                    $a[3] cmp $b[3]
                    );
            } @array;

    $aref_n{$key} = [ @sorted ];
    @array=();

}

Unsal