Convert hash values with same key to hash of arrays in Perl

Question

I need to convert a hash into a hash of arrays in Perl.

I have:

%hash = (
    tinku => 15,
    tina  => 4,
    rita  => 18,
    tinku => 18,
    tinku => 17,
    tinku => 16,
    rita  => 19
);

And I want to change it to:

%hash =  ( tinku => [ 15, 16, 17, 18 ], rita => [ 18, 19 ], tina => 4 );

Your first hash is invalid, so y'know, that's just not going to work. — Sobrique, Feb 18 '15 at 12:00
This is just an example. Can you please tell me how to convert the hash into a hash of arrays? (The hash will have the same key with different values, and I need to form a hash of arrays.) — idiot on perl, Feb 18 '15 at 12:08
I think this is a useful question and there are some good standard answers here *i.e* it is a useful question and set of responses. Even if the OP is using "pseudo code" the point of the question gets through. — G. Cito, Feb 18 '15 at 15:24
For a hash key to refer to multiple values, the values need to be not just a list or array but an anonymous array `[ ]` or a reference `\@arr`. — G. Cito, Feb 18 '15 at 15:25

score 5 · Answer 1 · answered Feb 18 '15 at 12:33

my %hash = (tinku =>15,tina =>4, rita =>18, 
    tinku =>18, tinku =>17, tinku =>16, rita =>19);

This assignment only keeps the last value for each key (i.e. tinku=>16, rita=>19, tina=>4) and discards the previous ones. This is deliberate: it allows later values to override earlier ones in hash assignments. E.g.

sub some_function {
     my %args = (%sane_defaults, @_);
};

Also, (foo => (1, 2, 3)) would create hash (foo => 1, 2 => 3) and not what you expect.
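Both points can be checked with a few lines. This is only a minimal sketch; the variable names `%last_wins` and `%flattened` are made up for illustration and are not part of the original answer:

```perl
use strict;
use warnings;

# Later pairs override earlier ones: only the last value per key survives.
my %last_wins = ( tinku => 15, rita => 18, tinku => 16, rita => 19 );

# Nested parentheses do not nest: the list is flattened to (foo, 1, 2, 3),
# so the hash ends up with the keys "foo" and "2".
my %flattened = ( foo => ( 1, 2, 3 ) );

print "tinku => $last_wins{tinku}, rita => $last_wins{rita}\n";
print "keys of flattened hash: ", join( ", ", sort keys %flattened ), "\n";
```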

A possible solution could be:

use strict;
use warnings;
use Data::Dumper;

my @array = (tinku =>15,tina =>4, rita =>18, tinku =>18, 
     tinku =>17, tinku =>16, rita =>19);
my %hash = hash_of_arrays( @array );
print Dumper(\%hash);

sub hash_of_arrays {
     die "Odd number of elements in hash (of arrays) assignment"
          if @_ % 2;
     # I never understood why this is a *warning* :-)

     # populate hash by hand
     my %hash; 
     while (@_) {
          my $key = shift;
          my $value = shift;
          push @{ $hash{$key} }, $value;
          # here hash values automatically become 
          # empty arrayrefs if not defined, thanks Larry
     };
     return %hash; 
     # *technically*, this returns a flat *list*,
     # which the caller's assignment turns back into a hash
};

Some [research](http://perlmonks.org/?node_id=1117094) on the nature of Perl's native "Odd number of elements in hash assignment" warning. — Dallaylaen, Feb 19 '15 at 11:06
++ for your approach to failing quickly when the keys and values of a hash add up to something odd (so to speak). I wonder if there are other ways they might not line up properly, and if some of the new functions in `List::Util` could make those easy to detect. — G. Cito, Jun 22 '15 at 16:25

score 5 · Answer 2 · edited May 23 '17 at 12:23

The techniques and patterns covered in the other responses here are tried and true idioms that are essential for getting the most out of Perl, for understanding existing code, and for working with the large installed base of older perl compilers. Just for fun I thought I'd mention a couple of other approaches:

There's a fairly readable new syntax in perl-5.22 that is an alternative to the more classic approach taken by @fugu. For something a bit more funky I'll mention @miyagawa's Hash::MultiValue. Perl 6 also has a nice way to convert lists of key/value pairs with potentially non-unique keys into hashes containing keys with multiple values.

As the other responses here point out, the "key" to all of this is:

For a hash key to refer to multiple values, the values need to be not just a list or array but an anonymous array [ ] or a reference.
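As a small illustration of that point, the sketch below stores one anonymous array and one reference to a named array under two keys, then reads them back. The names `%scores` and `@rita_scores` are hypothetical and only serve the example:

```perl
use strict;
use warnings;

my @rita_scores = ( 18, 19 );

my %scores = (
    tinku => [ 15, 18, 17, 16 ],    # anonymous array reference
    rita  => \@rita_scores,         # reference to a named array
);

# Dereference with @{ ... } to get the list back, or index with ->[ ].
print "tinku: @{ $scores{tinku} }\n";
print "first rita score: $scores{rita}->[0]\n";
print "tinku has ", scalar @{ $scores{tinku} }, " values\n";
```

Either form works as a hash value because in both cases the hash stores a single scalar (the reference) per key.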

Using new syntax available with `perl-5.22`

Fugu's response shows the standard Perl idiom. Iterating through @names using for 0 .. $#names ensures that overlapping keys are not "lost" and instead point at an anonymous array of multiple values. With perl-5.22 we can use the pairs() function from List::Util (a core module) and postfix dereferencing to add key/value pairs to a hash and account for overlapping or duplicate keys in a slightly different way:

use experimental qw(postderef);
use List::Util qw/pairs/;

my %hash;    
my $a_ref = [ qw/tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19/ ];
push $hash{$_->key}->@* , $_->value for pairs @$a_ref;

use DDP;
p %hash;

As of version 1.39 List::Util::pairs() returns ARRAY references as blessed objects accessible via ->key and ->value methods. The example uses LEONT's experimental.pm pragma and DDP to make things a bit more compact.
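For readers on a newer perl, here is a hedged variant of the same idea. It assumes perl 5.24 or later, where postfix dereferencing is no longer experimental, so the `experimental` pragma is not needed. It uses an explicit loop variable to show that `->key` and `->value` are just named accessors for elements 0 and 1 of each pair:

```perl
use v5.24;                          # enables strict and say; postfix deref is stable here
use warnings;
use List::Util 1.39 qw(pairs);      # ->key / ->value need List::Util 1.39+

my @list = qw(tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19);

my %hash;
for my $pair ( pairs @list ) {
    # Each $pair is a blessed two-element array reference,
    # so $pair->key is the same value as $pair->[0].
    push $hash{ $pair->key }->@*, $pair->value;
}

say "$_: @{ $hash{$_} }" for sort keys %hash;
```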

Output:

{
    rita    [
        [0] 18,
        [1] 19
    ],
    tina    [
        [0] 4
    ],
    tinku   [
        [0] 15,
        [1] 18,
        [2] 17,
        [3] 16
    ]
}

As to which is more "readable": it's hard to beat the easily "grokable" standard approach, but with the new syntax available in the latest versions of perl5 we can explore the potential of new idioms. I am really starting to like postfix dereferencing. TIMTOWTDI and beyond!

@miyagawa's `Hash::MultiValue`

With this module you can create a Hash::MultiValue object (with lots of methods to access it in various ways) and a plain hash reference to conveniently work with multiple values per key.

#!/usr/bin/env perl -l
use Hash::MultiValue;
use strict;
use warnings;

my $mvhash = Hash::MultiValue->new(tinku =>15, tina =>4, rita =>18,
                tinku =>18, tinku =>17, tinku =>16, rita =>19);

print "\ntinku's values:\n", join " ", $mvhash->get_all('tinku');

print "\nflattened mvhash:\n", join " ", $mvhash->flatten ;

print "\n ... using mvhash as a hashref:" ;
print join " ", $mvhash->get_all($_) for keys %$mvhash ;

print "\n", '... as a "mixed" hashref with each():';
my $mvhash_ref = $mvhash->mixed ;

while ( my ($k, $v) = each %$mvhash_ref ) {   # each() on a bare hashref was removed in perl 5.24
  print "$k => " , ref $v eq "ARRAY" ? "@{$v}" : "$v" ; 
}

Output:

tinku's values:
15 18 17 16

flattened mvhash:
tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19

... using mvhash as a hashref:
15 18 17 16
18 19
4

... as a "mixed" hashref with each():
tinku => 15 18 17 16
rita => 18 19
tina => 4

Once your hash is available as a Hash::MultiValue object you can manipulate it in various ways to quickly create temporary copies and hash references. Just assign them to a scalar and dump them (or use DDP) to get an idea of how it works:

use DDP; 
my $hmulti = $mvhash->multi; p $hmulti ;
my $hmixed = $mvhash->mixed; p $hmixed ;

There are some restrictions on using regular hash operations with a Hash::MultiValue object (and things like dd \$mvhash are not going to show you the whole hash; you need to do dd $mvhash->multi). However, in some situations there is an advantage to working with multi-value hashes in this way (i.e. more readable and/or possibly less code needed for some functions).

You still need to recognize when/where Hash::MultiValue is useful so it's not unambiguously "easier" or "cleaner" - but it's another useful addition to your box of perl tools.

Perl 6 - just for comparison

Perl6 can be a bit more compact for grabbing key/value pairs from a list because you can use "multiple parameters" in a for statement, traversing a list by groups of elements then using push to arrange them into a hash. You can do this in a way that "automagically" accounts for overlapping keys. cf. this short perl6 snippet:

my %h ;
for <tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19> -> $k, $v { 
    %h.push($k => $v) ;
}
%h.perl.say ;

Edit: The friendly folks on #perl6 suggest an even more succinct "method":

my %h.push: <tinku 15 tina 4 rita 18 tinku 18 tinku 17 tinku 16 rita 19>.pairup ;
%h.perl.say ;

Output:

{:rita(["18", "19"]), :tina("4"), :tinku(["15", "18", "17", "16"])}<>

It's not just continued development of perl the compiler that makes it possible to write Perl code in new and interesting ways. Thanks to @miyagawa, and to Paul Evans for his stewardship of Scalar-List-Utils, you can do cool things with Hash::MultiValue even if your version of perl is as old as version 5.8; and you can try the functions available in the latest versions of List::Util even if your perl is barely from this millennium (List::Util works with perl-5.6, which ushered in the 21st century in March 2000).

So here's the *third* solution to an *impossible* problem. Cool. — Dallaylaen, Feb 19 '15 at 11:03

score 3 · Answer 3 · edited Feb 18 '15 at 15:26

You're asking for the impossible! Hashes can only have unique keys, so in your example you will produce a hash which takes each unique name as its key, and the last value for each key as its value:

#!/usr/bin/perl
use warnings;
use strict; 
use Data::Dumper;

my %hash = (tinku =>15,tina =>4, rita =>18, 
           tinku =>18, tinku =>17, tinku =>16, rita =>19);

print Dumper \%hash;

$VAR1 = {
          'rita' => 19,
          'tina' => 4,
          'tinku' => 16
        };

To make a hash of arrays you could try something like this:

my %hash;

my @names = qw(tinku tina rita tinku tinku tinku rita);
my @nums = qw(15 4 18 18 17 16 19);


push @{ $hash{ $names[$_] } }, $nums[$_] for 0 .. $#names;


print Dumper \%hash;

$VAR1 = {
          'rita' => [
                      '18',
                      '19'
                    ],
          'tina' => [
                      '4'
                    ],
          'tinku' => [
                       '15',
                       '18',
                       '17',
                       '16'
                     ]
        };
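The output above differs from what the question asked for in two ways: the question shows the values sorted, and it shows `tina => 4` as a plain scalar rather than a one-element array. If that exact shape is really wanted, a post-processing step along the following lines could produce it. This is only a sketch; the names `%grouped` and `%mixed` are made up here, and the grouping step uses `splice` on a flat list instead of the two parallel arrays above:

```perl
use strict;
use warnings;

my @pairs = ( tinku => 15, tina => 4, rita => 18, tinku => 18,
              tinku => 17, tinku => 16, rita => 19 );

# Step 1: group values by key with the same push idiom.
my %grouped;
while ( my ( $name, $num ) = splice @pairs, 0, 2 ) {
    push @{ $grouped{$name} }, $num;
}

# Step 2: sort each group numerically and collapse
# single-value groups to a plain scalar.
my %mixed;
for my $name ( keys %grouped ) {
    my @sorted = sort { $a <=> $b } @{ $grouped{$name} };
    $mixed{$name} = @sorted > 1 ? \@sorted : $sorted[0];
}

for my $name ( sort keys %mixed ) {
    my $value = $mixed{$name};
    print "$name => ", ( ref $value ? "[@$value]" : $value ), "\n";
}
```

The trade-off is that every reader of such a mixed hash has to check `ref` before using a value, which is why keeping every value as an array reference is usually the easier structure to work with.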

Simplest and leanest approach ++ .... I expanded my answer to compare it with potential of "new" syntax in 5.22 and the perl6 approach. — G. Cito, Jun 22 '15 at 16:20

score 2 · Answer 4 · answered Feb 18 '15 at 12:11

You can't have that hash in the first place. A hash in Perl must have unique keys.

Answered by Chankey Pathak.

They must be unique ***or*** there must be some way of dealing with collisions/overlap in the keys so that multiple values can be assigned to each key. From the wording of question I think this is what the OP was looking for. – G. Cito Jun 22 '15 at 19:01

score 2 · Answer 5 · edited Feb 19 '15 at 01:27

Since a hash can only have unique keys, don't assign the list to a hash; instead, process it with pairs() from List::Util:

use List::Util 'pairs';

my %hash;
push @{ $hash{$_->[0]} }, $_->[1]
 for pairs (tinku =>15,tina =>4, rita =>18, tinku =>18, 
           tinku =>17, tinku =>16, rita =>19);

use Data::Dumper; print Dumper \%hash;

Output:

$VAR1 = {
      'tinku' => [
                   15,
                   18,
                   17,
                   16
                 ],
      'rita' => [
                  18,
                  19
                ],
      'tina' => [
                  4
                ]
    };

Answered by mpapec, Feb 18 '15 at 13:26; edited by G. Cito.

Neat. I thought there might be something in List::Util for that, still did it by hand. – Dallaylaen Feb 18 '15 at 13:30
@Sugir_R_Thain ... `pairs()` "returns a list of ARRAY references, each containing two items from the given list". Because they are array references (*i.e.* `[ ]`) you **can** have a hash of "lists" as you indicate - just not quite the way your code is written. – G. Cito Feb 19 '15 at 01:34
