
I am confused by the following.
Sometimes I see examples such as this:

my %hash = get_data();

sub get_data {
    my %data = ();
    # do processing
    return %data;
}

And similarly with arrays:

my @arrays = get_data();

sub get_data {
    my @data = ();
    # do processing
    return @data;
}

I originally thought that you cannot return arrays or hashes from functions, only references to them.
So I don't understand what the difference is, and when we should prefer one over the other.
Is the choice related to garbage collection, or to copying too much data?

Miller
Jim
  • A function always returns a list. With `return @data` it is the list of `@data`'s elements, with `return %data` it is the list of `%data`'s key/value pairs, and with `return \%data` it is a list of one element (a hash reference). You can feed such a list into a hash as you did, into an array, etc. – mpapec Sep 10 '14 at 12:38
  • @mpapec: So what's the difference? An array is just a list object, isn't it? And a hash is also expressed as a list. I think I am missing something important. Is the code pasted above wrong? – Jim Sep 10 '14 at 12:43
  • Arrays and hashes are *containers*, while you can think of lists as transient, on-the-fly structures. [Link1](http://stackoverflow.com/questions/6023821/perl-array-vs-list) and [link2](http://friedo.com/blog/2013/07/arrays-vs-lists-in-perl) – mpapec Sep 10 '14 at 12:47
  • @mpapec: So the code in the OP does not have any hidden bugs? – Jim Sep 10 '14 at 12:49
  • No, it doesn't. You could even return `@data` in the first case and feed it into a hash, but then you should make sure that `@data` has an even number of elements. – mpapec Sep 10 '14 at 12:52
  • @mpapec: How about performance? Does it make a new copy each time? – Jim Sep 10 '14 at 12:53
  • @Jim: Check this out: http://stackoverflow.com/a/1817614/3965075 – Praveen Sep 10 '14 at 12:55
  • Returning a reference should be more efficient, but a benchmark with various array sizes would be required. – mpapec Sep 10 '14 at 12:59

2 Answers


Strictly speaking, you can't return an array or a hash from a Perl subroutine. Perl subroutines return lists. Lists are similar to arrays in that they're sequences of values, but they aren't arrays. Arrays are variables. Lists are nameless, immutable, and transient data structures used to pass and return values, initialize arrays and hashes, etc. It's a somewhat subtle point, but an important one.
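One place where the array/list distinction becomes visible is scalar context. The following is a minimal sketch; the sub names `from_array` and `from_list` are made up for illustration. An array evaluated in scalar context yields its element count, whereas a comma-separated list of values yields its last value.

```perl
use strict;
use warnings;

# Hypothetical subs, named only for this illustration.
sub from_array {
    my @data = (4, 5, 6);
    return @data;          # an array: scalar context gives the element count
}

sub from_list {
    return (4, 5, 6);      # comma-separated values: scalar context gives the last one
}

my $count = from_array();  # scalar context
my $last  = from_list();   # scalar context
my @all   = from_array();  # list context: all the values

print "count=$count last=$last all=@all\n";   # count=3 last=6 all=4 5 6
```

In list context both subs behave the same; the difference only shows up when the caller asks for a single scalar.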

When you write `return @data`, you aren't returning the `@data` array; you're returning a list of the values it contains. Similarly, `return %data` returns a list of the key/value pairs contained in the hash. Those values can be used to initialize another array or hash, which is what's happening in your examples. The initialized array/hash contains a (shallow) copy of the one used by the subroutine.

To "return" an array or hash, you must return a reference to it, e.g. `return \@data` or `return \%data`. Doing that returns a reference to the variable itself. Modifying the data through that reference affects the original array or hash as well, because it's the same storage.
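The difference between the two styles can be sketched as follows. This example assumes a file-scoped array `@source` (instead of a `my` variable inside the sub) purely so that the effect on the original is observable afterwards; `as_list` and `as_ref` are illustrative names.

```perl
use strict;
use warnings;

my @source = (1, 2, 3);   # stands in for the sub's data (assumption for this sketch)

sub as_list { return @source;  }   # returns the values
sub as_ref  { return \@source; }   # returns a reference to the array itself

my @copy = as_list();
push @copy, 99;           # changes only the caller's copy

my $ref = as_ref();
push @$ref, 4;            # changes @source, because $ref points at it

print "source: @source\n";   # source: 1 2 3 4
print "copy:   @copy\n";     # copy:   1 2 3 99
```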

Whether a sub should return an array/hash as a list (copy) or a reference is a programming decision. For subs that always return N values with positional meaning (e.g. the `localtime` built-in), returning a list makes sense. For subs that return arbitrarily large lists, it's usually better to return a reference instead, because that avoids copying every element.
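How much the copy costs depends on the data size, so it is worth measuring rather than assuming. A rough sketch using the core `Benchmark` module is shown below; the array size and the sub names are arbitrary choices for illustration, and the results will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data; vary the size to see how the gap changes.
my @big = (1 .. 100_000);

sub as_list { return @big;  }   # caller receives a copy of every element
sub as_ref  { return \@big; }   # caller receives a single reference

# Run each variant for at least one CPU second and compare the rates.
cmpthese(-1, {
    list => sub { my @copy = as_list(); scalar @copy },
    ref  => sub { my $ref  = as_ref();  scalar @$ref },
});
```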

It's even possible for a sub to decide what to return based on how it's called by using `wantarray`. This lets the caller decide what they want.

sub get_data {
    my @data;
    # ... populate @data here
    return wantarray ? @data : \@data;
}

my $aref  = get_data(); # returns a reference
my @array = get_data(); # returns a list
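
One thing to keep in mind with this approach is that the result depends on the calling context, which is not always obvious at the call site. A small sketch with placeholder data (the contents `a b c` and the variable names are invented for this illustration):

```perl
use strict;
use warnings;

sub get_data {
    my @data = qw(a b c);   # placeholder contents for this sketch
    return wantarray ? @data : \@data;
}

my ($first) = get_data();   # the parentheses impose list context
my $aref    = get_data();   # scalar context: an array reference

# Inside a list (such as a hash initializer) the call is in list context,
# so the values are flattened into the surrounding list.
my %flat  = (items => get_data());          # items => 'a', b => 'c'
my %boxed = (items => scalar(get_data()));  # items => array reference

print "first=$first aref=", ref($aref), "\n";   # first=a aref=ARRAY
print "flat keys: ", join(',', sort keys %flat), "\n";   # flat keys: b,items
print "boxed items: ", ref($boxed{items}), "\n";          # boxed items: ARRAY
```

Using `scalar(...)` forces scalar context when the reference is wanted inside a list.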
Michael Carman
  • What about GC? If we return a reference from a function, the memory will not be freed as long as someone holds the reference, right? – Jim Sep 10 '14 at 13:53
  • @Jim: Correct. By returning a reference you get a single copy of the array which persists as long as there's at least one reference to it. Returning a list results in multiple copies of the array but the sub's copy is eligible for GC immediately. – Michael Carman Sep 10 '14 at 14:03

You're actually creating a new array (or hash), filled with the same elements as the one generated in the sub:

sub get_data {
    # initialize an array
    my @toReturn = qw/ a b c d e f g /;

    # get its location in memory
    my $toReturn_ref = \@toReturn;

    # print its location in memory
    print "toReturn: $toReturn_ref\n";

    # return the **elements** in the array (not the array itself)
    return @toReturn;
}

# initialize an array
my @arr = get_data();

# get its location in memory
my $arr_ref = \@arr;

# print its location in memory
print "\"Returned\": $arr_ref\n";

This will print something like:

toReturn:   ARRAY(0x1df85e8)
"Returned": ARRAY(0x1debc40)

They're different arrays, but happen to have the same content.
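
For contrast, here is a sketch of the same experiment with a reference returned instead. The sub name `get_data_ref` and the `$inner_addr` variable are additions for this illustration, and `refaddr` comes from the core `Scalar::Util` module. In this case both addresses should match, because the caller receives the very array created inside the sub.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $inner_addr;   # records the address seen inside the sub

sub get_data_ref {
    my @toReturn = qw/ a b c d e f g /;

    # remember the array's location in memory
    $inner_addr = refaddr(\@toReturn);

    # return the array itself, by reference
    return \@toReturn;
}

my $arr_ref = get_data_ref();

print "same array\n" if refaddr($arr_ref) == $inner_addr;   # same array
print "elements: @$arr_ref\n";   # elements: a b c d e f g
```

The array stays alive after the sub returns because `$arr_ref` still refers to it.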

ajwood