I have decided to give Perl a try and I have stumbled across a language structure that seems to be valid, but I just can't believe it is. As I guess there is some rationale behind this I decided to ask a question.

Take the following Perl code:

%data = ('John Paul' => ('Age' => 45), 'Lisa' => 30);
print "\$data{'John Paul'} = $data{'John Paul'}{'Age'}\n";
print "\$data{'Lisa'} = $data{'Lisa'}\n";

My intention was to check how a hash of hashes works. The above code prints:

$data{'John Paul'} =
$data{'Lisa'} =

To make it a valid hash of hashes one needs:

%data = ('John Paul' => {'Age' => 45}, 'Lisa' => 30);

and the result would be:

$data{'John Paul'} = 45
$data{'Lisa'} = 30

Does anyone know:

  1. Why there is non uniformity and the internal hash needs {} instead of ()?
  2. Why do I get no error or warning that something is wrong when there is () instead of {} for the internal hash? It is very easy to make this kind of mistake. What is more, ('Age' => 45) breaks not only the value for 'John Paul' but also the value for 'Lisa'. I just can't imagine searching for this kind of "bug" in a project with thousands of lines of code.
Al Bundy
  • Parentheses only denote list grouping and precedence in Perl. They never create anything. e.g. `(1, 2, 3)` is identical to `(1, (2, 3))` is identical to `(((1), ((2, (3)))))`, etc. The only real functional thing they do is start a list. – lordadmira Mar 12 '21 at 23:33
  • To verify what you put into your data structure, dump it out with Data::Dump (among many other options). `use Data::Dump "pp"; pp \%data;` – lordadmira Mar 12 '21 at 23:55
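
As a hedged variant of the suggestion above, the same check can be sketched with the core module Data::Dumper instead of Data::Dump (a swap I am making so the example needs nothing from CPAN). The `$dump` variable and the `no warnings 'misc'` block are my additions for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # print keys in sorted order so the dump is easy to read

my %data;
{
    no warnings 'misc';         # the odd-length list would otherwise trigger a warning
    %data = ( 'John Paul' => ( 'Age' => 45 ), 'Lisa' => 30 );
}

# Dump a reference to the hash to see which keys and values were really stored.
my $dump = Dumper( \%data );
print $dump;
```

The dump shows three keys (30, 45 and John Paul) rather than the two that were intended, which makes the flattening visible at a glance.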

2 Answers

( 'John Paul' => ( 'Age' => 45 ), 'Lisa' => 30 )

is just another way of writing

'John Paul', 'Age', 45, 'Lisa', 30

Parens don't create any data structure; they just affect precedence like in (3+4)*5. The reason we don't write

my %h = a => 4;

or the equivalent

my %h = 'a', 4;

is that it would be interpreted as

( my %h = 'a' ), 4;

What creates the hash is my %data, not the parens. The right-hand side of the assignment just places an arbitrary number of scalars on the stack, not a hash. The assignment operator adds these scalars to the hash.
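
To make the flattening concrete, here is a minimal sketch. The names `@nested` and `@flat` are mine, not from the question, and the `no warnings 'misc'` block is only there to keep the demonstration quiet.

```perl
use strict;
use warnings;

# The inner parentheses only group; both lists flatten to the same five scalars.
my @nested = ( 'John Paul' => ( 'Age' => 45 ), 'Lisa' => 30 );
my @flat   = ( 'John Paul', 'Age', 45, 'Lisa', 30 );

print scalar(@nested), " elements\n";                 # 5 elements
print "identical lists\n" if "@nested" eq "@flat";

# Assigning the list to a hash pairs the scalars up in order:
# 'John Paul' => 'Age', 45 => 'Lisa', 30 => undef
my %data;
{
    no warnings 'misc';    # silence "Odd number of elements in hash assignment"
    %data = @nested;
}
print join( ', ', sort keys %data ), "\n";            # 30, 45, John Paul
```

This is why both lookups in the question print nothing: `'Lisa'` ended up as a value, not a key.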


But sometimes, we want to create an anonymous hash. This is where {} comes in.

my %data = ( 'John Paul' => { 'Age' => 45 }, 'Lisa' => 30 );

is basically equivalent to

my %anon = ( 'Age' => 45 );
my %data = ( 'John Paul' => \%anon, 'Lisa' => 30 );

Note that \%anon returns a single scalar: a reference to a hash. This is fundamentally different from ( 'John Paul' => \%anon, 'Lisa' => 30 ) and 'John Paul' => \%anon, 'Lisa' => 30, each of which returns four scalars.
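
A short sketch of what that looks like from the outside, using only the builtin `ref` and ordinary hash access:

```perl
use strict;
use warnings;

my %data = ( 'John Paul' => { 'Age' => 45 }, 'Lisa' => 30 );

print scalar( keys %data ), "\n";            # 2: the braces produced one scalar value
print ref( $data{'John Paul'} ), "\n";       # HASH: that scalar is a hash reference
print $data{'John Paul'}{'Age'}, "\n";       # 45
print $data{'John Paul'}->{'Age'}, "\n";     # 45 (same access, explicit arrow)
print $data{'Lisa'}, "\n";                   # 30
```

The arrow between adjacent subscripts is optional, so the last two forms of the nested access are interchangeable.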


Why there is non uniformity and the internal hash needs {} instead of ()?

An underlying premise of this question is false: Hashes don't need (). For example, the following are perfectly valid:

my %h1 = 'm'..'p';
sub f { return x => 4, y => 5 }
my %h2 = f();
my %h3 = do { i => 6, j => 7 };

() has nothing to do with hashes. The lack of uniformity comes from the lack of parallel. One uses {} to create a hash. One uses () to override precedence.

Since parens just affect precedence, one could use

my %data = ( 'John Paul' => ({ 'Age' => 45 }), 'Lisa' => 30 );  # ok (but weird)

This is very different than the following:

my %data = ( 'John Paul' => ( 'Age' => 45 ), 'Lisa' => 30 );  # XXX

Why do I get no error or warning that something is wrong when there is () instead of {} for the internal hash?

Not only is using () valid, using () around expressions that contain commas is commonly needed. So when exactly should it warn? It's arguable whether this should be a warning or something perlcritic finds. perlcritic should definitely be able to find this, but I don't know whether a rule for it exists.

ikegami
  • Minor nit. `my %data` doesn't create the hash per se, it tells Perl what kind of hash to create in terms of scoping and storage. There are various ways of defining a hash that all result in a hash being created, and you can create an anonymous hash that is never associated with a hash variable. A hash is created by referring to it. – lordadmira Mar 12 '21 at 23:47
  • @lordadmira, Re "*`my %data` doesn't create the hash per se*", It does. `my` has a compile-time and a run-time effect. The compile-time effect of `my %hash` is to create a hash, and its run-time effect is to cause the hash to be cleared or replaced with a new hash on scope exit. The fact that you can create an anonymous hash using `{ }`, or that you can create a global hash by simply mentioning its name doesn't change that. – ikegami Mar 12 '21 at 23:50
  • Why would it create a variable at compile time only to throw it away and make a new one once the scope is entered?? The point of `my()` is to make Perl autovivify the variable on the lexical pad when the statement is encountered and then to capture all references to that symbol as referring to the pad. The only compile time effect is to legalize the symbol to `strict`. – lordadmira Mar 13 '21 at 00:33
  • @lordadmira, Re "*Why would it create a variable at compile time only to throw it away and make a new one once the scope is entered??*", It doesn't. I said it can create a new hash when the scope is *exited*. /// Re "*The point of my() is to make Perl autovivify*", You don't seem to understand the meaning of "auto". If you explicitly use an instruction (`my`) to do it, it's not automatic. Autovivification is the automatic creation of variables and references to them when you dereference an undefined variable. – ikegami Mar 13 '21 at 00:39

Why there is non uniformity and the internal hash needs {} instead of ()?

The right-hand side of a hash assignment is a list of scalars (alternating between keys and values).

You can't have a hash (because it isn't a scalar) as a value there, but you can have a hash reference.

Lists get flattened.

Why do I get no error or warning that something is wrong when there is () instead of {} for the internal hash?

Because you didn't turn them on with the use strict; use warnings; pragmas (which are off by default for backwards-compatibility reasons, and which were announced as on by default for the proposed Perl 7).
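
As a rough sketch of what the pragmas report for the code in the question, the snippet below captures the warning with a `$SIG{__WARN__}` handler and traps the strict error with `eval`. The names `@caught`, `$err` and `%quiet` are mine. The second hash is a hypothetical variation showing that the warning depends on the flattened list having an odd length, not on the parentheses themselves.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them to STDERR.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

# Five scalars after flattening: warns "Odd number of elements in hash assignment".
my %data = ( 'John Paul' => ( 'Age' => 45 ), 'Lisa' => 30 );
print scalar(@caught), " warning(s) after the first assignment\n";     # 1

# $data{'John Paul'} holds the string 'Age', so using it as a hash reference
# dies under strict refs.
my $age = eval { $data{'John Paul'}{'Age'} };
my $err = $@;
print "strict refs error caught\n" if $err =~ /strict refs/;

# Hypothetical variation: two such mistakes flatten to six scalars,
# an even count, so no further warning is raised.
my %quiet = ( 'John Paul' => ( 'Age' => 45 ), 'Lisa' => ( 'Age' => 30 ) );
print scalar(@caught), " warning(s) after the second assignment\n";    # still 1
```

So the pragmas flag the symptoms in this particular example, but the second assignment shows the mistake can also slip through silently.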

Quentin
  • And what about breaking the value for `'Lisa'`? – Al Bundy Mar 10 '21 at 09:13
  • `{a=>b}` one scalar value `(a=>b)` two scalar values... See https://perldoc.perl.org/perlreftut – clamp Mar 10 '21 at 09:14
  • @AlBundy: Lisa doesn't have a value. Lists get flattened, so Lisa is the value of the key 45. – Quentin Mar 10 '21 at 10:14
  • Neither strict nor warnings catches this (though they do catch the odd number of elements in the specific example given). A `perlcritic` rule probably exists, though. (If not, one could be created.) – ikegami Mar 10 '21 at 10:31
  • If you are new to Perl, it is much better to put `use diagnostics;`. You'll get a very nice explanation of whatever it thinks is wrong. – lordadmira Mar 12 '21 at 23:57