22

I have a section in my code where I know I will need an array, and I know exactly how many elements that array will need to have. This section of code will be repeated a lot, so I could get some very big time savings by first initializing the array to the size I know it will need and then filling it up, versus just pushing items on (pushing would be O(n), as opposed to filling already-created spaces, which would be O(1)).

That said, I can't seem to find any elegant way of initializing an array to a given size, and I have no idea why. I know I can do:

my @array; $array[49] = 0;

to get a 50-item array, but that looks really ugly to me, and I feel as though there must be a better way. Ideas?
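For reference, the two patterns I'm comparing look something like this (a minimal sketch; `compute_item` is just a stand-in for whatever actually produces each element):

sub compute_item { $_[0] * 2 }   # placeholder for the real work

# Pushing items on, one at a time:
my @grown;
push @grown, compute_item($_) for 0 .. 49;

# vs. preallocating and then filling existing slots:
my @filled;
$#filled = 49;                                # extend to 50 elements up front
$filled[$_] = compute_item($_) for 0 .. 49;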

Eli
  • Nothing wrong with preallocating arrays, but this smells like a premature micro-optimization. Why do you think `push` is O(n)? – mob Jan 20 '11 at 19:48
  • No time to find a link right now, but long story short: pushing requires looping through the array to find the last open space (that's the way push is usually implemented, anyway; though now that I think about it, that's for a singly linked list rather than an array). Are you saying it's implemented differently in Perl? – Eli Jan 20 '11 at 19:59
  • Yeah, it's a little different. Perl arrays keep some slack at both the front and the back. They also know their size, so both `push` and `unshift` are usually O(1). If the slack gets all used up, Perl reallocates a new, larger array. Reallocation is an O(n) operation, but it only needs to happen ~log(n) times over the course of n pushes. – mob Jan 20 '11 at 20:29
  • See also: [Constant Amortized Time](http://stackoverflow.com/questions/200384) – mob Jan 20 '11 at 20:48
  • Makes sense. So that makes the whole push operation still O(n) (though closer to O(log n) in practice). That's not as bad as I originally thought, but I think O(1) should still be significantly better, considering this is code I wind up reusing very frequently. Anyway, thanks for the edification re: Perl arrays! EDIT: Just looked at Brian's tests. Now I'm confused again. How does Perl wind up making push faster than setting? – Eli Jan 20 '11 at 21:05
  • I didn't explain dynamic array reallocation as well as I could have. It makes *n* push operations O(*n*) in total. See the link about constant amortized time, and the counting sketch after these comments. – mob Jan 20 '11 at 22:29
  • Just curious, Eli, what's your background in programming? – AmbroseChapel Jan 20 '11 at 22:48
  • push is not useful if order matters and the elements don't become available in the order you want them in the array. – user1067305 Dec 23 '20 at 22:39
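To make mob's point concrete, here is a small counting sketch. It is not Perl's actual internals; it just simulates a doubling strategy and tallies how many element copies n pushes would trigger:

use strict;
use warnings;

my $n        = 1_000_000;
my $capacity = 1;
my $copies   = 0;    # elements moved during reallocations

for my $size (1 .. $n) {
    if ($size > $capacity) {
        $copies   += $capacity;    # a reallocation copies every element over
        $capacity *= 2;            # then doubles the slack
    }
}

printf "%d pushes cost %d copies (%.2f per push)\n", $n, $copies, $copies / $n;

The per-push figure stays below 2 no matter how large $n gets, which is the constant amortized time the comments describe.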

7 Answers

17

To be honest, your way is perfectly fine, as is explicitly changing the size of the array: `$#array = 49;`
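For illustration, after that assignment the array reports 50 elements, all holding `undef` (a quick sanity check):

my @array;
$#array = 49;    # extend the array to 50 elements

print scalar(@array), "\n";                     # 50
print defined $array[0] ? "yes" : "no", "\n";   # no: the slots start out undef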

DVK
13
  1. The first rule of Optimization Club is, you do not Optimize.
  2. The second rule of Optimization Club is, you do not Optimize without measuring.

Measure, measure, measure before you go and assume that you can do it faster by faking out Perl. Perl's been doing the optimization of common usage a lot longer than you have. Trust it.
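If you do want to measure, the core Benchmark module is the usual tool. A minimal sketch comparing the two approaches from the question (the loop bodies here are illustrative, not taken from any answer):

use strict;
use warnings;
use Benchmark qw(cmpthese);

cmpthese(-2, {    # run each variant for about 2 CPU seconds
    push_items => sub {
        my @arr;
        push @arr, 'bar' for 1 .. 50;
    },
    prealloc => sub {
        my @arr;
        $#arr = 49;                     # pre-extend to 50 elements
        $arr[$_] = 'bar' for 0 .. 49;
    },
});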

Andy Lester
    This "rule" has always been abused. "Premature optimization is the root of all evil" has become "Optimization is the root of all evil." Therefore, optimization should be avoided. As a result, those engineers do not consider application performance during the design of the software, when it is critical to do so. The original quote was primarily about microoptimizations. – oᴉɹǝɥɔ Dec 11 '21 at 20:29
6

Whenever you're thinking about doing this type of optimization, do some benchmarking! The result may not be what you expect. For instance, I used the following quick script to test your theory that pre-allocating the array is faster:

for ( my $loops = 0; $loops < 100000; $loops++ )
{
    my @arr;

    for ( my $foo = 0; $foo < 50; $foo++ ) {
        push @arr, 'bar';
    }
}

That took 2.13 seconds.

for ( my $loops = 0; $loops < 100000; $loops++ )
{
    my @arr;
    $arr[49] = 0;

    for ( my $foo = 0; $foo < 50; $foo++ ) {
        $arr[$foo] = 'bar';
    }
}

That took 2.16 seconds (I ran both tests several times). So it actually ends up being faster to just let perl handle allocating the array as necessary.

Update

After making changes suggested by ysth, the numbers make a bit more sense: 2.27 seconds for the "push" method, and 2.21 for pre-allocation. Even so, I would question whether such an optimization would really save any time (the difference was only 0.06 seconds after 100,000 iterations).
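For reference, the pre-allocation loop with ysth's change (described in the comments below) looks something like this; keeping a reference to @arr at the end of each pass stops perl from quietly reusing the array's storage between iterations:

my $arrref;

for ( my $loops = 0; $loops < 100000; $loops++ )
{
    my @arr;
    $arr[49] = 0;

    for ( my $foo = 0; $foo < 50; $foo++ ) {
        $arr[$foo] = 'bar';
    }

    $arrref = \@arr;    # keep @arr alive so its storage isn't recycled
}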

Brian
  • @arr is reused from one iteration to the next; to force it to not be reused, say `my $arrref;` before the outer loop and `$arrref=\@arr;` at the end of the outer loop. – ysth Jan 20 '11 at 21:55
  • Updated my answer to reflect ysth's suggestion. I suppose it's a good way to get a more objective result, but would such a thing be done in real-world usage? – Brian Jan 20 '11 at 22:21
2

Instead of a specific value, use `undef`:

my @array;
$array[49] = undef;
metdos
1

Preallocating might not help much with speed, but it might help with returning memory to the system if the chunks allocated are big enough.

Emil Perhinschi
1

Your way is great, and so is DVK's. A way to do it in a single statement might be:

@array = (0 .. 49);

But I'm not sure it's more elegant, since it assigns a value between 0 and 49 to each element; on the other hand, it's probably more intuitive to understand for a programmer not much into Perl's syntax.
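If you'd rather have every slot hold the same placeholder, the list repetition operator is a closely related one-liner (a variant sketch, not from the original answer):

my @array = ('bar') x 50;    # 50 elements, each initialized to 'bar'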

syck
cbrandolino
0

An advantage of predefining the array length is that items can then be filled in any convenient order, rather than in the strict order that push requires. That is much less error-prone.
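For instance (made-up values, just to show the pattern):

my @row;
$#row = 3;    # reserve 4 slots up front

# fill them in whatever order the data happens to arrive
$row[2] = 'carol@example.com';
$row[0] = 'Carol';
$row[3] = 42;
$row[1] = 'Jones';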

user1067305