I realize a resizable indexed collection that uses an array to store its elements (like List<T>
in .NET or ArrayList
in Java) has amortized O(1) insertion time at the end of the collection. But then there's always that one pesky insertion at critical junctures where the collection has just reached its capacity and the next insertion requires a full copy of all elements in the internal array to a new one (presumably twice as large).
A common mistake (in my opinion) is to go with a linked list to "fix" this problem; but I believe the overhead of allocating a node for every element can be quite wasteful, and in fact would dwarf the benefit of a guaranteed O(1) insertion in that rare case that the array insertion is costly—when, in fact, every other array insertion is significantly cheaper (and faster).
What I was thinking might make sense is a hybrid approach consisting of a linked list of arrays, where every time the current "head" array reaches its capacity, a new array twice as large is added to the linked list. Then no copy would be necessary since the linked list would still have the original array. Essentially this seems analogous (to me) to the List<T>
or ArrayList
approach, except that wherever you previously would've incurred the cost of copying all the elements of the internal array, here you only incur the cost of allocating a new array plus a single node insertion.
Of course, this would complicate other features if they were desired (e.g., inserting/removing into/from the middle of the collection); but as I've expressed in the title, I'm really just looking for an add-only (and iterate) collection.
Are there any data structures out there ideally suited to this purpose? Or, can you think of one yourself?