perl - Hash::Merge duplicates same list within hashes instead of putting them once

Question

I'm trying to merge two hashes that contain lists. The lists are exactly the same, but because they are lists, the merge concatenates them, so their values end up duplicated.

How can I remove the duplication?

#!/usr/bin/perl
use strict;
use warnings;
use Hash::Merge;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;

my $hash1 = {
    'Instance' => [ 1, 2 ],
};
my $hash2 = {
    'Instance' => [ 1, 2 ],
};

my $merger = Hash::Merge->new('LEFT_PRECEDENT');
my $hash3 = $merger->merge($hash2, $hash1);
print Dumper($hash3);

The output:

$VAR1 = {
          'Instance' => [
                          1,
                          2,
                          1,
                          2
                        ]
        };

What I want is:

$VAR1 = {
          'Instance' => [
                          1,
                          2
                        ]
        };

AFTER EDIT: I posted a continuing question.

score 2 · Accepted Answer · answered Sep 12 '21 at 09:01

As is often the case when you want to do something a bit advanced with Hash::Merge, the answer is "implement your own custom behavior".

In this case, you can do:

my $merger = Hash::Merge->new('LEFT_PRECEDENT');
my $behavior = $merger->get_behavior_spec($merger->get_behavior);
$behavior->{ARRAY}{ARRAY} = sub {
    my ($left, $right) = @_;
    my %seen = map { $_ => 1 } @$left;
    return [ @$left, grep { ! $seen{$_} } @$right ];
};

my $hash3 = $merger->merge($hash2, $hash1);

Here, the line my %seen = map { $_ => 1 } @$left; populates the hash %seen with the values of the $left array, and grep { ! $seen{$_} } @$right filters the $right array, keeping only the values that are not in %seen.
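
To see the subroutine's effect without Hash::Merge involved, here is a minimal standalone sketch. The name merge_arrays is hypothetical, chosen only for illustration; its body mirrors the ARRAY/ARRAY handler above.

```perl
use strict;
use warnings;

# Hypothetical helper name; the body mirrors the ARRAY/ARRAY handler above.
sub merge_arrays {
    my ($left, $right) = @_;
    # Record every value already present on the left side.
    my %seen = map { $_ => 1 } @$left;
    # Keep the left side as-is, then append only unseen right-side values.
    return [ @$left, grep { !$seen{$_} } @$right ];
}

my $same    = merge_arrays([ 1, 2 ], [ 1, 2 ]);   # identical lists
my $overlap = merge_arrays([ 1, 2 ], [ 2, 3 ]);   # partial overlap

print "@$same\n";      # prints "1 2"
print "@$overlap\n";   # prints "1 2 3"
```

Because hash keys are strings, values are compared by their string form. References (including blessed objects) are therefore compared by address rather than by content, unless their class overloads stringification.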

Note that this approach does not remove all duplicates: if $left or $right itself contains duplicate elements (e.g., if $left = [1, 1, 2]), those duplicates will remain. To remove all duplicates, use this version instead:

use List::MoreUtils qw(uniq);
$behavior->{ARRAY}{ARRAY} = sub {
    my ($left, $right) = @_;
    return [ uniq @$left, @$right ];
};
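
As a quick standalone check of what uniq does to the concatenated lists, here is a small sketch. It imports uniq from List::Util rather than List::MoreUtils, on the assumption that List::Util 1.45 or later is installed (the release that added uniq); the name merge_unique is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(uniq);   # assumption: List::Util >= 1.45, which provides uniq

# Hypothetical helper name; same logic as the handler above.
sub merge_unique {
    my ($left, $right) = @_;
    # Concatenate both lists, then keep the first occurrence of each value.
    return [ uniq @$left, @$right ];
}

my $merged = merge_unique([ 1, 1, 2 ], [ 2, 3 ]);
print "@$merged\n";   # prints "1 2 3"
```

Unlike the %seen version, this also collapses the repeated 1 inside the left list.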

If, for any reason, you don't want to rely on List::MoreUtils for the uniq function, you can easily implement your own; see the question "How do I remove duplicate items from an array in Perl?".
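
A hand-rolled version might look like the sketch below. It uses a common Perl idiom; the name my_uniq is hypothetical, picked to avoid clashing with an imported uniq.

```perl
use strict;
use warnings;

# Hand-rolled replacement for uniq (hypothetical name my_uniq):
# keeps the first occurrence of each value and preserves order.
sub my_uniq {
    my %seen;
    # $seen{$_}++ returns 0 (false) the first time a value appears, so the
    # negation lets it through; later occurrences are filtered out.
    return grep { !$seen{$_}++ } @_;
}

my @result = my_uniq(1, 1, 2, 2, 3, 1);
print "@result\n";   # prints "1 2 3"
```

Inside the ARRAY/ARRAY handler, it would be called the same way as uniq: my_uniq(@$left, @$right).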

Dada

It works great on regular numbers, but for some reason, when it's combined with the [blessed](https://stackoverflow.com/questions/69114461/perl-deep-recursion-on-subroutine-hashmergemerge) example (which you also answered), it still duplicates. I understand there's a problem with the blessing, but I would have thought that once the unbless works and $self->merge runs again, it would apply the change from this answer and not duplicate the 'LineNumber' => bless( do{\(my $o = '200773952')}, 'Veri::ColLineFile' )... – urie Sep 12 '21 at 10:20
@urie I don't quite understand the issue. Any chance that changing the `LEFT_PRECEDENT` to `RETAINMENT_PRECEDENT` fixes your problem? – Dada Sep 12 '21 at 13:23
I added a new question [here](https://stackoverflow.com/questions/69174664/perl-hashmerge-duplicates-same-list-within-hashes-instead-of-putting-them-on) – urie Sep 14 '21 at 08:46
@urie Good call, it's better to add a new question than to substantially edit the old one. I know how to fix your issue; not sure if I'll have time to answer today though. – Dada Sep 14 '21 at 08:48
