Local variable visibility in closures vs. local `sub`s

Question

Perl 5.18.2 accepts "local subroutines", it seems.

Example:

sub outer()
{
    my $x = 'x';   # just to make a simple example

    sub inner($)
    {
        print "${x}$_[0]\n";
    }

    inner('foo');
}

Without "local subroutines" I would have written:

#...
    my $inner = sub ($) {
        print "${x}$_[0]\n";
    };

    $inner->('foo');
#...

And most importantly I would consider both to be equivalent.

However the first variant does not work as Perl complains:

Variable $x is not available at ...

where ... describes the line where $x is referenced in the "local subroutine".

Can anyone explain this? Are Perl's local subroutines fundamentally different from Pascal's local subroutines?

I think the inner sub is stored in the global namespace (see [this](https://stackoverflow.com/a/10192547/2173773) answer). If you would like to use [lexical subs](https://perldoc.perl.org/perlsub#Lexical-Subroutines), you should prefix the sub with `my`, according to the documentation. — Håkon Hægland, Jun 17 '22 at 08:47
This "_local subroutines_" term seems to be referring to [lexical subroutines](https://perldoc.perl.org/perlsub#Lexical-Subroutines) ? If that is so, they need the `my` in the definition, `my sub name { ... }` -- then it works as intended. (It also works with a pre-declaration: `my sub name;` then later `sub name { ... }`.) Is this what you are looking into? — zdim, Jun 17 '22 at 08:56

zdim · Accepted Answer · 2022-06-21T05:43:17.077

The term "local subroutine" in the question seems to be referring to lexical subroutines. These are private subroutines visible only within the scope (block) where they are defined, after the definition; just like private variables.

But they are defined (or pre-declared) with `my` or `state`, as in `my sub subname { ... }`.

Just writing `sub subname { ... }` inside another subroutine doesn't make it "local" (in any version of Perl). It is compiled just as if it were written alongside the enclosing subroutine, and it is placed in the package's symbol table (`main::`, for example).
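As a minimal sketch of that point (the helper `is_package_sub` is a hypothetical name, not something from the question), the following checks that a plain nested `sub` is an ordinary package sub, callable even though `outer` is never run:

```perl
use strict;
use warnings;

sub outer {
    sub inner { return 'inner was called' }   # plain named sub, no "my"
    return 'outer was called';
}

# Hypothetical helper: true if package main has a sub of that name.
sub is_package_sub {
    my ($name) = @_;
    return main->can($name) ? 1 : 0;
}

# outer() has not been called at this point.
print "inner is a package sub\n" if is_package_sub('inner');

my $result = inner();    # callable from file scope
print "$result\n";       # inner was called
```

The nesting is purely textual here: `inner` ends up in `main::` exactly as if it had been written at the top level.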

The question mentions closures in the title, so here is a comment on that.

A closure in Perl is a structure in a program, normally a scalar variable holding a reference to a sub, which carries the environment (variables) from its scope at the time of its (runtime) creation. See also the perlfaq7 entry on it. It is messy to explain. For example:

use feature 'say';

sub gen {
    my $args = "@_";

    # The anonymous sub captures $args from this particular call of gen()
    my $cr = sub { say "Closed over: $args, my args: @_" };
    return $cr;
}

my $f = gen( qw(args for gen) );

$f->("hi closed");
# Prints:
# Closed over: args for gen, my args: hi closed

The anonymous sub "closes over" the variables in the scope where it is defined, in the sense that when its generating function returns the reference and goes out of scope, those variables live on because that reference still exists. Anonymous subs are created at runtime, so every time the generating function is called, its lexicals are remade and so is the anon sub; the sub therefore always has access to the current values. Thus the returned reference to the anon sub uses lexical data that would otherwise be gone. A little piece of magic.^†
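To make the "remade on every call" part concrete, here is a small sketch with a hypothetical `make_counter` generator (the name and the starting-value argument are assumptions for illustration). Each call produces a new anonymous sub with its own private `$count`:

```perl
use strict;
use warnings;

# Hypothetical generator: every call creates a fresh $count and a fresh anon sub.
sub make_counter {
    my $count = shift // 0;
    return sub { return ++$count };
}

my $c1 = make_counter();
my $c2 = make_counter(10);

my @seen;
push @seen, $c1->();   # 1
push @seen, $c1->();   # 2
push @seen, $c2->();   # 11, uses its own $count
push @seen, $c1->();   # 3, unaffected by $c2
print "@seen\n";       # 1 2 11 3
```

The two code references do not share state, which is what one would expect if the environment is captured at creation time rather than at compile time.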

Back to the question of "local" subs. If we want to introduce actual closures to the question, we'd need to return a code reference from the outer subroutine, like

sub outer {
    my $x = 'x' . "@_";
    return sub { say "$x @_" }
}
my $f = outer("args");
$f->( qw(code ref) );   # prints:  xargs code ref

Or, per the main question, as introduced in v5.18.0 and stable from v5.26.0, we can use a named lexical (truly nested!) subroutine

sub outer {
    my $x = 'x' . "@_";
    
    my sub inner { say "$x @_" };

    return \&inner;
}

In both cases my $f = outer(...); has the code reference returned from outer which correctly uses the local lexical variables ($x), with their most current values.
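A short sketch of that claim for the lexical-sub variant, assuming Perl 5.26 or later (hence the `use v5.26;` line, which also enables `say` and `strict`). Calling `outer` twice yields two code references, each bound to the `$x` of its own call:

```perl
use v5.26;      # assumption: lexical subs are no longer experimental from 5.26
use warnings;

sub outer {
    my $x = 'x' . "@_";
    my sub inner { return "$x @_" }   # a new sub each time outer runs
    return \&inner;
}

my $f = outer('first');
my $g = outer('second');

say $f->('call');   # xfirst call
say $g->('call');   # xsecond call
```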

But we cannot use a plain named sub inside outer for a closure

sub outer {
    ...

    sub inner { ... }  # misleading, likely misguided and buggy

    return \&inner;    # won't work correctly
}

This inner is made at compile time and is global, so any variables it uses from outer will have their values baked in from the first call to outer. So inner will be correct only until outer is called the next time -- when the lexical environment in outer gets remade but inner doesn't. As an example I can readily find this post, and see the entry in perldiag (or add use diagnostics; to the program).
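A minimal sketch of that failure mode with returned references. The `closure` warning category is switched off here only to keep the demo quiet; remove that line to see the `Variable "$x" will not stay shared` diagnostic.

```perl
use strict;
use warnings;
no warnings 'closure';   # demo only: hides 'Variable "$x" will not stay shared'

sub outer {
    my $x = shift;
    sub inner { return $x }   # compiled once, bound to the first $x
    return \&inner;
}

my $f = outer('first');
my $g = outer('second');

my ($from_f, $from_g) = ($f->(), $g->());
print "$from_f\n";   # first
print "$from_g\n";   # first (not 'second')
```

Both `$f` and `$g` refer to the same package sub, so the second call to `outer` has no effect on what `inner` sees.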

^† And in my view a poor-man's object in a way, as it has functionality and data, made elsewhere at another time and which can be used with data passed to it (and both can be updated)

Can you give an example of where this would be useful? I've been thinking about this lately, but can't come up with one. — simbabque, Jun 17 '22 at 09:27
@simbabque You mean the lexical subs? I can really only think of just being able to nicely organize code in a larger sub. (Even though we did have anon subs for that.) But then in docs they bring up recursions as well. I am not sure of any specific and distinct advantage of having it. Good question. — zdim, Jun 17 '22 at 09:34
I think for recursion, `__SUB__` is probably fine. And hiding private methods doesn't really make sense with it either. It's just a bit odd. — simbabque, Jun 17 '22 at 09:42
Most answers seem to concentrate on the fact whether the subroutine is actually local or not, **but** the real question was why that subroutine cannot access the `my` variable `$x`. — U. Windl, Jun 17 '22 at 10:36
@U.Windl Added a discussion of closures, which the title mentions. Will probably edit — zdim, Jun 18 '22 at 04:27
I wonder: Is it correct to say that in Perl every `sub` is a closure? Also: Isn't a closure a `CODE` reference, and it depends to which name the reference is associated with (anonymous vs. named `sub`s)? — U. Windl, Jun 20 '22 at 08:45
@U.Windl A closure is indeed code, such that it packs lexical environment (variables) from another scope, in which it was created. So in some scope (function) a sub is made _which uses variables from that scope_ and a reference (pointer) to it is returned. Then values of those variables stay available to it (If they weren't used by the sub they'd be gone). This original lexical environment that the returned function keeps is a crucial part of what one calls "closure." So we can't really call any old sub that, but any sub can create that. Not every language has such capability. — zdim, Jun 21 '22 at 05:12
@U.Windl Now if you mean that any sub of course uses its lexicals so it is technically a closure -- the values that it uses exist as lexical variables at the time the sub runs -- they aren't its environment (its outer scope) at the time of its creation some place and time else -- but they rather belong to it, to the same scope. So I still wouldn't say we can call it that (but it's getting murkier :) — zdim, Jun 21 '22 at 05:14
@U.Windl As for names (a good point) -- given how the named subs in Perl are created, we can't use a named sub for that because it is made at compile time and is a global (in the symbol table). So if we place its code inside some scope (another function), the variables that it does see defined before itself (in that scope) get baked into it when it's compiled. It's never remade so it can't account for a changed environment in the function in which its code is placed, as the function is called again and again. So we need a lexical there, made dynamically when a sub runs. [...] — zdim, Jun 21 '22 at 05:28
[...] an anon sub. So we get a code reference, which can be assigned to a lexical or returned etc. Or, as of 5.18/5.26 we can have that "lexical subroutine" which also gets made at runtime, every time its surrounding function runs. But which _is_ named, which may be nice and convenient (or perhaps not so much?) — zdim, Jun 21 '22 at 05:30

ikegami · Answer 2 · 2022-06-17T17:24:57.727

If you want "local" subs, you can use one of the following based on the level of backward compatibility you want:

5.26+:
```
my sub inner { ... }
```

5.18+:

use experimental qw( lexical_subs );  # Safe: Accepted in 5.26.

my sub inner { ... }

"Any" version:
```
local *inner = sub { ... };
```

However, you should not use `sub inner { ... }`.

sub f { ... }

is basically the same as

BEGIN { *f = sub { ... } }

so

sub outer {
   ...

   sub inner { ... }

   ...
}

is basically

BEGIN {
   *outer = sub {
      ...

      BEGIN {
         *inner = sub { ... };
      }

      ...
   };
}

As you can see, inner is visible even outside of outer, so it's not "local" at all.

And as you can see, the assignment to *inner is done at compile-time, which introduces another major problem.

use strict;
use warnings;
use feature qw( say );

sub outer {
   my $arg = shift;

   sub inner {
      say $arg;
   }

   inner();
}

outer( 123 );
outer( 456 );

Variable "$arg" will not stay shared at a.pl line 9.
123
123

5.18 did introduce lexical ("local") subroutines.

use strict;
use warnings;
use feature qw( say );
use experimental qw( lexical_subs );  # Safe: Accepted in 5.26.

sub outer {
   my $arg = shift;

   my sub inner {
      say $arg;
   };

   inner();
}

outer( 123 );
outer( 456 );

123
456

If you need to support older versions of Perl, you can use the following:

use strict;
use warnings;
use feature qw( say );

sub outer {
   my $arg = shift;

   local *inner = sub {
      say $arg;
   };

   inner();
}

outer( 123 );
outer( 456 );

123
456
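One thing worth keeping in mind with the `local *inner` variant: `local` is dynamically scoped, not lexically scoped. The sketch below (with a hypothetical `helper` sub defined outside `outer`) suggests that code called from `outer` can reach `inner` while `outer` is running, and that the name is gone again once `outer` returns:

```perl
use strict;
use warnings;

# Hypothetical helper outside outer(); it resolves main::inner at call time.
sub helper { return inner() }

sub outer {
    my $arg = shift;
    local *inner = sub { return "inner saw $arg" };
    return helper();    # helper can call inner while outer is running
}

for my $n (123, 456) {
    my $r = outer($n);
    print "$r\n";       # inner saw 123, then inner saw 456
}

print defined(&main::inner) ? "still defined\n" : "gone after outer returns\n";
```

So this form is "local" in time rather than in place, unlike `my sub inner`, which is only visible inside the enclosing block.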

score 0 · Answer 3 · answered Jun 17 '22 at 10:56

I found a rather good explanation from man perldiag:

       Variable "%s" is not available
           (W closure) During compilation, an inner named subroutine or eval
           is attempting to capture an outer lexical that is not currently
           available.  This can happen for one of two reasons.  First, the
           outer lexical may be declared in an outer anonymous subroutine
           that has not yet been created.  (Remember that named subs are
           created at compile time, while anonymous subs are created at run-
           time.)  For example,

               sub { my $a; sub f { $a } }

           At the time that f is created, it can't capture the current value
           of $a, since the anonymous subroutine hasn't been created yet.

So this would be a possible fix:

sub outer()
{
    my $x = 'x';   # just to make a simple example

    eval 'sub inner($)
    {
        print "${x}$_[0]\n";
    }';

    inner('foo');
}

...while this one won't:

sub outer()
{
    my $x = 'x';   # just to make a simple example

    eval {
        sub inner($)
        {
            print "${x}$_[0]\n";
        }
    };

    inner('foo');
}
