What is the point of collection capacity in Objective-C?

Question

I have been using Objective-C recently, and, coming from the C++ world, I don't get the point of specifying a capacity for the native Objective-C collections.

In C++, containers can hold either objects by value or reference types (e.g., reference wrappers or pointers), so specifying an initial capacity makes sense: pre-allocating memory for a sequence of large objects can be a big performance improvement. However, in Objective-C, collections can only contain references to dynamically allocated objects, i.e., pointers. As a consequence, I wonder what the performance advantage of specifying a capacity is if, in the worst case, only pointers need to be copied when the collection grows beyond its original capacity.

Clearly, there is a lack in my understanding of the memory model, so what am I missing?

Yep, the effect is smaller in Objective-C than in C++ because it's only pointers. But there's still unnecessary memory allocation/release/fragmentation/copy due to resizing which can affect performance in performance-critical code like long loops. — fluidsonic, Oct 01 '14 at 01:16
You are probably right. It's a small optimization since if the collection needs to be made bigger, the realloc and memcpy is just for the pointers and not the objects. — rmaddy, Oct 01 '14 at 01:17
I suspect that the ability to specify initial capacity is there mainly because some folks think it does some good and demand it. — Hot Licks, Oct 01 '14 at 01:21
It would actually be useful to specify a capacity k, if initWithCapacity would use the default init of the value type to instantiate k identical items. But there's no such method. — tunnuz, Oct 01 '14 at 01:29
@fluidsonic: right, there are still pointers to be copied, but I wonder if it's actually worth the effort for the typical Objective-C application. — tunnuz, Oct 01 '14 at 01:30
@tunnuz: yes, esp. since it's not just about copying but also about allocation, deallocation and fragmentation. All four tasks combined can quickly sum up to a lot of work in a loop. — fluidsonic, Oct 01 '14 at 01:32
Ok, if anyone of you could provide an extensive answer (@fluidsonic you seem to understand the matter pretty well) I can mark the question as answered. — tunnuz, Oct 01 '14 at 01:34
@tunnuz: just refer to http://stackoverflow.com/a/12031319/1183577, which is a nice explanation. — fluidsonic, Oct 01 '14 at 01:39
Just to be sure you're aware: you don't have to use the `...WithCapacity:` methods to create collection objects. Some people look at the class reference for, say, `NSMutableArray` and see only one convenience constructor: `+arrayWithCapacity:`. They assume that that's the only one available, forgetting that `NSMutableArray` inherits all of the methods of its superclasses. Since `NSArray` provides `+array` (and others), those are also available for use with `NSMutableArray`. — Ken Thomases, Oct 01 '14 at 05:17
@KenThomases: thanks for pointing that out. Of course there is the plain init. I was just wondering if adding a capacity parameter had some effect I wasn't aware of, since saving the allocation/deallocation of pointers didn't seem, alone, enough of a good reason (but fluidsonic correctly pointed out that it can be still a performance improvement in certain situations). — tunnuz, Oct 01 '14 at 05:23

bergdesign · Accepted Answer · 2014-10-01T21:47:44.380

Many Cocoa methods date from the initial release of OS X, and quite possibly from as far back as OpenStep or even NextStep. They may therefore have played a significant role in maximizing performance on something like a 25 MHz Motorola 68030, a 32-bit processor. Beginning programmers can be spoiled by modern machines with gigabytes of memory and clock speeds measured in gigahertz, but veteran programmers built many high-performance apps on machines with orders of magnitude less memory, CPU power, and memory bandwidth than today's machines. On such hardware, it might have been extremely beneficial to avoid the reallocation cost of thousands of array additions by allocating the required memory ahead of time.

I apologize for not being able to simply add a comment to the OP's post, but I felt like the idea of legacy usage required some additional perspective. It's always good to examine headers and take note of when any particular method was initially made available because it might have had significant importance in the past.

Update: From what I can find in publicly maintained NextStep developer docs, it looks like `NSMutableArray`'s `-initWithCapacity:` and `+arrayWithCapacity:` were implemented at least as far back as 1994, in NextStep 3.3.
