Should I use $_[0] or copy the argument list in Perl?

Question

If I pass a hash to a sub:

parse(\%data);

Should I use a variable to $_[0] first or is it okay to keep accessing $_[0] whenever I want to get an element from the hash? clarification:

sub parse
{    $var1 = $_[0]->{'elem1'};
     $var2 = $_[0]->{'elem2'};
     $var3 = $_[0]->{'elem3'};
     $var4 = $_[0]->{'elem4'};
     $var5 = $_[0]->{'elem5'};
}
# Versus
sub parse
{    my $hr = $_[0];
     $var1 = $hr->{'elem1'};
     $var2 = $hr->{'elem2'};
     $var3 = $hr->{'elem3'};
     $var4 = $hr->{'elem4'};
     $var5 = $hr->{'elem5'};
}

Is the second version more correct since it doesn't have to keep accessing the argument array, or does Perl end up interpereting them the same way anyhow?

score 11 · Accepted Answer · edited Nov 18 '09 at 19:28

In this case there is no difference because you are passing reference to hash. But in case of passing scalar there will be difference:

sub rtrim {
    ## remove tailing spaces from first argument
    $_[0] =~ s/\s+$//;
}
rtrim($str); ## value of the variable will be changed

sub rtrim_bugged {
    my $str = $_[0]; ## this makes a copy of variable
    $str =~ s/\s+$//;
}
rtrim($str); ## value of the variable will stay the same

If you're passing hash reference, then only copy of reference is created. But the hash itself will be the same. So if you care about code readability then I suggest you to create a variable for all your parameters. For example:

sub parse {
     ## you can easily add new parameters to this function
     my ($hr) = @_; 

     my $var1 = $hr->{'elem1'};
     my $var2 = $hr->{'elem2'};
     my $var3 = $hr->{'elem3'};
     my $var4 = $hr->{'elem4'};
     my $var5 = $hr->{'elem5'};
}

Also more descriptive variable names will improve your code too.

score 7 · Answer 2 · edited May 23 '17 at 12:07

7

For a general discussion of the efficiency of shift vs accessing @_ directly, see:

As for your specific code, I'd use shift, but simplify the data extraction with a hash slice:

sub parse
{
    my $hr = shift;
    my ($var1, $var2, $var3, $var4, $var5) = @{$hr}{qw(elem1 elem2 elem3 elem4 elem5)};
}

I'll assume that this method does something else with these variables that makes it worthwhile to keep them in separate variables (perhaps the hash is read-only, and you need to make some modifications before inserting them into some other data?) -- otherwise why not just leave them in the hashref where they started?

edited May 23 '17 at 12:07

Community

1
1

answered Nov 18 '09 at 19:32

Ether

53,118
13
86
159

A good reason to put the items into variables is for the typo protection `strict 'vars'` provides. You can also get the same benefit from locking your hash keys with `Hash::Util::lock_keys`. – daotoad Nov 18 '09 at 20:09
It isn't at all what I really do in my function, I was just giving an example in the question. – user105033 Nov 18 '09 at 21:07
1

@OP: ok, that's what I figured. :) (PS. you can give yourself a nicer user name by going to your profile page - http://stackoverflow.com/users/105033/) – Ether Nov 18 '09 at 21:09
2

With one exception, I prefer to avoid shifting `@_` and do a list assignment from it instead -- even for one argument, because it means that I won't introduce a bug by accidentally turning `my $one = shift` into `my ($one, $two) = shift`. The exception is in OO code (especially constructors, or functions that delegate or redispatch), where shifting `$self` makes things much simpler. – hobbs Nov 19 '09 at 08:04

ysth · Answer 3 · 2009-11-18T21:28:36.933

You are micro-optimizing; try to avoid that. Go with whatever is most readable/maintainable. Usually this would be the one where you use a lexical variable, since its name indicates its purpose...but if you use a name like $data or $x this obviously doesn't apply.

In terms of the technical details, for most purposes you can estimate the time taken by counting the number of basic ops perl will use. For your $_[0], an element lookup in a non-lexical array variable takes multiple ops: one to get the glob, one to get the array part of the glob, one or more to get the index (just one for a constant), and one to look up the element. $hr, on the other hand is a single op. To cater to direct users of @_, there's an optimization that reduces the ops for $_[0] to a single combined op (when the index is between 0 and 255 inclusive), but it isn't used in your case because the hash-deref context requires an additional flag on the array element lookup (to support autovivification) and that flag isn't supported by the optimized op.

In summary, using a lexical is going to be both more readable and (if you using it more than once) imperceptibly faster.

score 4 · Answer 4 · answered Nov 18 '09 at 19:32

4

My rule is that I try not to use $_[0] in subroutines that are longer than a couple of statements. After that everything gets a user-defined variable.

Why are you copying all of the hash values into variables? Just leave them in the hash where they belong. That's a much better optimization than the one you are thinking about.

answered Nov 18 '09 at 19:32

brian d foy

129,424
31
207
592

1

I only use `$_[0]` in two situations: class accessors (which meet your "couple of statements" rule) and for the rare case where I want to modify arguments in place and need to preserve aliasing. – Michael Carman Nov 18 '09 at 20:49
@Michael Carman: if you want, you can avoid using `$_[0]` and explicitly document your in-place modification like this: `perl -Mstrict -wle'sub arg_aliases { \@_ } sub foo { my $arg_aliases = &arg_aliases; my ($foo) = @_; print "foo was $foo"; $arg_aliases->[0] = "plugh" } my $foo = "xyzzy"; foo($foo); print "foo is now $foo"'` – ysth Nov 19 '09 at 01:38
1

@ysth, couldn't you just take refs into `@_`? Something like `sub foo { my $arg1 = \$_[0]; $$arg1 =~ s/foo/bar/; }` That avoids the extra function call. Or to get the whole array, `my @arg_refs = map {\$_} @_;` – daotoad Nov 19 '09 at 06:00

score 0 · Answer 5 · answered Nov 18 '09 at 18:06

0

Its the same although the second is more clear

answered Nov 18 '09 at 18:06

ennuikiller

46,381
14
112
137

dlamblin · Answer 6 · 2009-11-19T04:10:13.067

0

Since they work, both are fine, the common practice is to shift off parameters.

sub parse { my $hr = shift; my $var1 = $hr->{'elem1'}; }

edited Nov 19 '09 at 04:10

answered Nov 18 '09 at 18:06

dlamblin

43,965
20
101
140

3

I wouldn't say "this is the common practice." It's practically a Perl-only holy war. – Chris Lutz Nov 18 '09 at 18:07
1

my ($one,$two) = ((shift),(shift)); OR my ($one,$two) = @_; ? – user105033 Nov 18 '09 at 18:16
4

`$hr->{elem1}` or `$$hr{elem1}` are groovy. `$$hr->{elem1}` won't work. – friedo Nov 18 '09 at 19:30
I'm trying to slap myself every time I use shift. It's a really bad habit, and I hardly ever really want to change @_. – brian d foy Nov 18 '09 at 21:59
1

Okay, "clarified" time: shift is great for one variable. my ($a, $b, $c) = @_ is great for multiple variables. Sorry for any confusion. – dlamblin Nov 19 '09 at 04:09

Should I use $_[0] or copy the argument list in Perl?

6 Answers6