I've got a script with two hashes, and while printing their contents I found that the script adds a key to the second hash without my assigning it. I loop over the first hash and, for each name, over the matching entry in the second hash; afterwards I loop over the whole second hash. %hash2 should contain only one entry, but it now contains two. How is the key James getting added to %hash2?

my %hash1 = ();
my %hash2 = ();

$hash1{"James"}{"1 Main Street"}++;
$hash1{"John"}{"2 Elm Street"}++;
$hash2{"John"}{"3 Oak Street"}++;

foreach my $name (keys %hash1) {
 print "Hash1  Name $name\n";
 foreach my $address (keys %{$hash1{$name}}) {
   print "Hash1  Address $address\n";
   foreach my $address (keys %{$hash2{$name}}) {
     print "Hash2  Address $address\n";
   }
 } 
}

print "\n";
foreach my $name (keys %hash2) {
 print "Hash2  Name $name\n";
 foreach my $address (keys %{$hash2{$name}}) {
   print "Hash2  Address $address\n";
 }
}

The output looks like this:

Hash1  Name James
Hash1  Address 1 Main Street
Hash1  Name John
Hash1  Address 2 Elm Street
Hash2  Address 3 Oak Street

Hash2  Name James
Hash2  Name John
Hash2  Address 3 Oak Street
  • Applying hash de-reference `%{}` on the non-existing value `$hash2{'James'}` triggers [autovivification](https://perldoc.perl.org/perlref.html#Using-References) of an empty hash reference and hence also adds the key `James` to `%hash2`. – Stefan Becker Feb 20 '19 at 19:49
  • @StefanBecker not all dereferences, just those that are considered updatish. Which perhaps unfortunately has always included keys – ysth Feb 20 '19 at 20:48
  • Actually, you create 4 hash elements without assigning to them: `$hash1{"James"}`, `$hash1{"John"}`, `$hash2{"James"}` and `$hash2{"John"}`. This is all due to autovivification. Just like `$hash1{"James"}{"1 Main Street"}` in lvalue context is short for `${ $hash1{"James"} //= {} }{"1 Main Street"}`, `%{$hash2{$name}}` in lvalue context is short for `%{ $hash2{$name} //= {} }`. – ikegami Feb 20 '19 at 23:45

2 Answers

The second key is created when you try to read a non-existent key from %hash2. Check that the key exists before dereferencing it:

my %hash1 = ();
my %hash2 = ();

$hash1{"James"}{"1 Main Street"}++;
$hash1{"John"}{"2 Elm Street"}++;
$hash2{"John"}{"3 Oak Street"}++;

foreach my $name (keys %hash1) {
 print "Hash1  Name $name\n";
 foreach my $address (keys %{$hash1{$name}}) {
   print "Hash1  Address $address\n";
   next unless exists $hash2{$name}; # check if the key exists in second hash before trying to use the key in $hash2
   foreach my $address (keys %{$hash2{$name}}) { # without the check above, this dereference would create $hash2{$name}
     print "Hash2  Address $address\n";
   }
 } 
}

print "\n";
foreach my $name (keys %hash2) {
 print "Hash2  Name $name\n";
 foreach my $address (keys %{$hash2{$name}}) {
   print "Hash2  Address $address\n";
 }
}
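
As an alternative sketch (not part of the original answer), the inner loop can fall back to an empty anonymous hash when the name is missing, instead of skipping with exists. This assumes Perl 5.10 or later for the `//` (defined-or) operator. The hash literals below are simplified stand-ins for the question's data. Because `$hash2{$name}` is only read on the left side of `//`, no key should be autovivified:

```perl
use strict;
use warnings;

# Simplified stand-ins for the hashes in the question.
my %hash1 = (
    James => { '1 Main Street' => 1 },
    John  => { '2 Elm Street'  => 1 },
);
my %hash2 = ( John => { '3 Oak Street' => 1 } );

foreach my $name ( sort keys %hash1 ) {
    # $hash2{$name} is a plain read here; when the key is missing it
    # yields undef and the empty anonymous hash is dereferenced instead.
    foreach my $address ( sort keys %{ $hash2{$name} // {} } ) {
        print "Hash2  $name  Address $address\n";
    }
}

# %hash2 still has only the key it started with.
print join( ',', sort keys %hash2 ), "\n";
```

The trade-off is that the fallback allocates a throwaway hash for each missing name, whereas the exists check skips the loop entirely.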
Andrey

When you use an undefined value as if it were a reference, Perl creates the kind of reference you asked for and then tries to perform the operation. This is called "auto-vivification".

Here's a small demonstration. A variable starts out as undefined. You then treat it as an array reference (the dereference to get the 0th element):

use Data::Dumper;

my $empty;
print Dumper( $empty );

my $value = $empty->[0];
print Dumper( $empty );

Perl converts $empty to an array reference then tries to get the 0th element from that. You are left with an array reference where you formerly had undef:

$VAR1 = undef;
$VAR1 = [];

This is intended behavior.

Take it one step further. Put that undef inside an array and treat that element as if it's an array reference:

use Data::Dumper;

my @array = ( 1, undef, 'red' );
print Dumper( \@array );

my $value = $array[1]->[0];
print Dumper( \@array );

Now the second element is an empty array reference:

$VAR1 = [
          1,
          undef,
          'red'
        ];
$VAR1 = [
          1,
          [],
          'red'
        ];

Take it another step further. Don't store the undef value. Instead, access an array position past the last item in the array:

use Data::Dumper;

my @array = ( 1, 'red' );
print Dumper( \@array );

my $value = $array[2]->[0];
print Dumper( \@array );

Now you get an array reference element in your array. It's one element longer now:

$VAR1 = [
          1,
          'red'
        ];
$VAR1 = [
          1,
          'red',
          []
        ];

Had you gone further out (say, element 5), the interstitial elements up to the element you wanted would have been "filled in" with undef:

use Data::Dumper;

my @array = ( 1, 'red' );
print Dumper( \@array );

my $value = $array[5]->[0];
print Dumper( \@array );

$VAR1 = [
          1,
          'red'
        ];
$VAR1 = [
          1,
          'red',
          undef,
          undef,
          undef,
          []
        ];

A hash works the same way, and that's what you are seeing. When you want to check whether there is a second-level key under James, Perl needs to create the James key and give it an empty hash ref as its value so it can check that. The second-level key is not there, but the first-level key 'James' sticks around:

use Data::Dumper;

my %hash = (
    John => { Jay => '137' },
    );
print Dumper( \%hash );

if( exists $hash{James}{Jay} ) {
    print $hash{James}{Jay};
    }
print Dumper( \%hash );

Now you see an extra key:

$VAR1 = {
          'John' => {
                      'Jay' => '137'
                    }
        };
$VAR1 = {
          'James' => {},
          'John' => {
                      'Jay' => '137'
                    }
        };
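
One way to avoid the extra key without any module, sketched here as an illustration rather than as part of the original answer, is to test each level separately. Because `&&` short-circuits, the second exists never runs when James is absent, so nothing gets created:

```perl
use strict;
use warnings;

my %hash = ( John => { Jay => '137' } );

# Check the first level before touching the second one.
if ( exists $hash{James} && exists $hash{James}{Jay} ) {
    print "$hash{James}{Jay}\n";
}

# The check above did not create a 'James' key.
my $status = exists $hash{James} ? 'was created' : 'is still absent';
print "James $status\n";
```

This gets verbose for deeper structures, since every level needs its own exists test.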

If you don't want this behavior, you can turn it off with the no autovivification pragma. It comes from a CPAN module that you need to install first:

no autovivification;
use Data::Dumper;

my %hash = (
    John => { Jay => '137' },
    );
print Dumper( \%hash );

if( exists $hash{James}{Jay} ) {
    print $hash{James}{Jay};
    }
print Dumper( \%hash );

You don't get the extra key:

$VAR1 = {
          'John' => {
                      'Jay' => '137'
                    }
        };
$VAR1 = {
          'John' => {
                      'Jay' => '137'
                    }
        };

You might also like to read "How can I check if a key exists in a deep Perl hash?", where I show a method that lets you inspect a nested hash without creating intermediate levels.
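
For a rough idea of what such an inspection can look like, here is a minimal sketch. The helper name `deep_exists` is hypothetical and this is not necessarily the method from the linked answer. It walks the path one key at a time and stops at the first missing level, so it only ever reads keys that already exist:

```perl
use strict;
use warnings;

# Hypothetical helper: returns true only if every key in @keys exists
# along the path, without creating any intermediate level.
sub deep_exists {
    my ( $node, @keys ) = @_;
    for my $key (@keys) {
        return 0 unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};    # plain read of a key known to exist
    }
    return 1;
}

my %hash = ( John => { Jay => '137' } );

for my $path ( [qw(John Jay)], [qw(James Jay)] ) {
    my $found = deep_exists( \%hash, @$path ) ? 'found' : 'missing';
    print "@$path: $found\n";
}

# The lookups above leave %hash with its single original key.
print scalar( keys %hash ), " top-level key(s)\n";
```

The sketch only handles nested hashes; a structure that mixes in array references would need an extra branch.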

brian d foy