
I'm re-acquainting myself with Perl, and have just used module-starter to initialise a new project. I'm now trying to understand the generated code. All is fine apart from the line indicated below:

sub not_in_file_ok {
  my ($filename, %regex) = @_;
  open( my $fh, '<', $filename )
    or die "couldn't open $filename for reading: $!";

  my %violated;

  while (my $line = <$fh>) {
    while (my ($desc, $regex) = each %regex) {
        if ($line =~ $regex) {
            ##I'm having problems here
            push @{$violated{$desc}||=[]}, $.;
        }
    }
  }
  ...
}

I have two problems:

  1. The ||=[]. Is this a | followed by |=, or is it an or (||) followed by =[]? Can someone talk me through what is happening here? (I'm guessing "if the hash is empty then create an empty anonymous array to initialise the hash", but I'm struggling to see how that is formed from the code.)
  2. push @{$violated{$desc}}, $. I understand this to mean "assign the line number to the key $desc for the hash %violated". But what I read from the code is "look up the value of the key $desc in %violated (the $violated{$desc} part), then use this value as a symbolic reference to an array (the @{$value} part), then push the line number onto that array". I don't see how to reconcile these two views.

I think there is a lot for me to learn in this line of code - can someone help me by walking me through it?

GEOCHET
Tom
  • By the way, thanks to autovivification, `push @{ $violated{$desc} ||= [] }, $.;` can be written `push @{ $violated{$desc} }, $.;` – ikegami Nov 21 '11 at 16:14

2 Answers

  • ||=: this is an assignment operator. Example

    $a ||= $b;
    # corresponds to
    $a = $a || $b;
    

    See perldoc perlop. In your example:

    $a ||= [];
    # corresponds to
    $a = $a || [];
    

    That is: if the left operand is true, nothing changes; otherwise an empty array reference is assigned to it.

  • %violated contains an array reference for each value. You can think of it like this:

    my $array_ref = $violated{$desc};
    push @{$array_ref}, $.;
    

Written more verbosely:

  if (! $violated{$desc} ) {
      $violated{$desc} = [];
  }
  my $array_ref = $violated{$desc};
  push @{ $array_ref }, $.;
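
As a quick check of the verbose form above, here is a minimal sketch of the one-line idiom run twice on the same key. The key name and the line numbers are made up for illustration; they are not from the generated test file.

```perl
use strict;
use warnings;

my %violated;
my $desc = 'boilerplate text';   # hypothetical key, for illustration only

# First push: the slot is undef (false), so ||= stores a new empty array reference.
push @{ $violated{$desc} ||= [] }, 3;

# Second push: the slot already holds a reference (true), so ||= leaves it alone.
push @{ $violated{$desc} ||= [] }, 8;

print join(', ', @{ $violated{$desc} }), "\n";   # prints "3, 8"
```

Both pushes land in the same array, because the second `||=` sees a true value and does not replace it.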

EDIT

Arrays and array references

  • an array is initialised with () and contains a dynamic, ordered list of elements (in Perl, arrays can grow dynamically)

  • an array reference is a reference to an array (more or less a pointer without pointer arithmetic). You can create an array reference with []

Example

my @a = ( 1, 2, 3);
# $a[0] will contain 1

my $array_ref = [ 10, 11, 12 ];
# $array_ref is a _pointer_ to an array containing 10, 11 and 12

To access an array reference you need to dereference it:

@{ $array_ref };

my @array = @{ $array_ref }; # is valid

You can then use @{ $array_ref } like any other array, for example to read a single element:

${ $array_ref }[0]

Now back to your question in the comment: %violated is a hash with the following key-value pairs: a string ($desc) and an array reference
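
To make the difference between a plain scalar value and an array reference value concrete, here is a small sketch. The %sound hash and its contents are invented for illustration only.

```perl
use strict;
use warnings;

# Hypothetical data: one value is a plain string, the other an array reference.
my %sound = (
    cat => 'meow',
    dog => [ 'woof', 'grr' ],
);

# ref() returns an empty string for a non-reference and 'ARRAY' for an array reference.
my $cat_kind = ref($sound{cat}) || 'plain scalar';
my $dog_kind = ref($sound{dog});
print "$cat_kind\n";              # prints "plain scalar"
print "$dog_kind\n";              # prints "ARRAY"

# Dereference with @{ ... } to use the referenced array as an array.
push @{ $sound{dog} }, 'yip';
my $count = @{ $sound{dog} };     # array in scalar context gives its length
print "$count\n";                 # prints "3"

# Two equivalent ways of reading a single element through the reference.
print ${ $sound{dog} }[0], "\n";  # prints "woof"
print $sound{dog}->[2], "\n";     # prints "yip"
```

In both cases the hash value itself is a single scalar; it just happens that one of those scalars refers to an array.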

Matteo
  • Got it, thank you for your clear explanation (strangely, the `||=` operator is not in my first edition Programming Perl book, it seems I need to buy the new edition.) – Tom Nov 21 '11 at 14:55
  • @Tom It might be well hidden. In perlop it is only mentioned briefly in the "Assignment Operators" section. The man page explains `+=` and then just says that it works also for (||, **, &, *, ...) – Matteo Nov 21 '11 at 15:05
  • Spoke too soon, I'm still a little confused. If I have a hash %sound = (cat=>"meow"), then $sound{cat} is the scalar "meow", rather than an array reference? Why is $violated{$desc} an array reference? And if I then push a value to that array, how is it then found to be a scalar? – Tom Nov 21 '11 at 15:11
  • $sound{cat} can be anything. $sound{cat} = '1', contains a scalar. $sound{cat} = [ 1, 2, 3] contains an array reference – Matteo Nov 21 '11 at 15:14
  • Ah, yes, okay, I understand, we are making it an array. – Tom Nov 21 '11 at 15:16

Let's try to deconstruct this step-by-step:

  1. The line is used to populate a hash of arrayrefs, where the arrayrefs contain the line numbers where the $desc regex matches. The resultant %violated hash will look something like:

    ( desc1 => [ 1, 5, 7, 10 ], desc2 => [ 2, 3, 4, 6, 8 ] );

  2. push takes an array as its first argument. The variable $violated{$desc} is an arrayref, not an array, so the @{...} is used to dereference it (dereferencing is the opposite of referencing).

  3. Now for the tricky part. The stuff inside the braces is just a compact way of saying that if $violated{$desc} is false (tested with ||), which includes the case where it does not exist yet, an empty arrayref ([]) is assigned (=) to it. Think of it as two statements in one line:

    $violated{$desc} = $violated{$desc} || [];

    push @{$violated{$desc}}, $.;

  4. Note that this complication isn't usually necessary, thanks to a feature called autovivification, which automatically creates a previously undefined hash value as the kind of reference the context calls for (an arrayref in this case). The only case I can think of where the ||= [] would be needed is if $violated{$desc} already held a defined but false value such as 0.
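
The two cases in the last point can be sketched as follows. This is an illustration under `use strict`, with made-up key names and values, not code from module-starter.

```perl
use strict;
use warnings;

my %violated;

# Autovivification: the missing slot becomes an array reference on first use.
push @{ $violated{first} }, 10;
print ref($violated{first}), "\n";        # prints "ARRAY"

# A slot that already holds 0 is not undef, so it is not autovivified.
# Under "strict refs" the dereference dies; eval traps the error.
$violated{second} = 0;
my $ok = eval { push @{ $violated{second} }, 20; 1 };
print $ok ? "pushed\n" : "died: $@";

# With ||= the false value is first replaced by a fresh array reference.
push @{ $violated{second} ||= [] }, 20;
print join(' ', @{ $violated{second} }), "\n";   # prints "20"
```

So the explicit `||= []` only changes behaviour when the slot holds a false value that is not undef; for a missing key, the plain `push @{ ... }` form does the same job.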

Zaid
  • `"meow"` is a single string. An array like `("meow", "purr")` contains multiple strings. Because a hash's key can only have a single value, we work around this by referencing the array. The reference is single, but refers to an array of many strings. I think [`perldoc perllol`](http://perldoc.perl.org/perllol.html) will help you understand this a little better. – Zaid Nov 21 '11 at 15:18
  • @Zaid, `("meow", "purr")` is not an array. Colloquially, it's a two-element list. Formally, it's a list Perl expression that returns a two element list or a scalar depending on context. – ikegami Nov 21 '11 at 18:40
  • @ikegami : Right. My bad, it's a list. And consequently my explanation has a hole in it. Because there's no such thing as a list in scalar context, the last item in the list will be returned. – Zaid Nov 21 '11 at 19:06
  • @Zaid, One can't create a list value in scalar context, but a list literal can exist in scalar context. e.g. `my $x = ("a", "b");` – ikegami Nov 21 '11 at 20:41
  • @ikegami : `perl -Mstrict -wE 'my $x = ( "a", "b" ); say $x'` outputs `Useless use of a constant (a) in void context at -e line 1.` & `b`, so the last item in the list is assigned to `$x`. Is that what you meant? – Zaid Nov 22 '11 at 05:59
  • @Zaid, I meant: It's wrong to say "there's no such thing as a list in scalar context." Ambiguities make the claim both true and false. The following is an unambiguous phrasing: "a list cannot be returned in scalar context". – ikegami Nov 22 '11 at 06:59
  • @ikegami : The reason why I made that statement is because [`perlfaq4` ](http://perldoc.perl.org/perlfaq4.html#What-is-the-difference-between-a-list-and-an-array%3f) classifies it as a list-lookalike, not a list: `In effect, that list-lookalike assigns to $scalar it's rightmost value.` – Zaid Nov 22 '11 at 08:30
  • @Zaid, Which statement? First you said it was an array, then you said there's no such thing as a list in scalar context. Both are wrong. As for perlfaq4, it's blatantly lying. That *does* result in a list operator, so it *is* a list literal. – ikegami Nov 22 '11 at 18:25
  • @ikegami : I am surprised that something like this is not clearly mentioned in `perlfaq4`. I've raised this point in [another question](http://stackoverflow.com/q/8232951/133939) because I feel it deserves attention. – Zaid Nov 22 '11 at 20:00