-4

The generally used way of adding item to array is by using ArrayList variable instead of generic array and calling the add method.

I wrote this method of adding item to a generic array:

public static <T> T[] addItemToArray(T[] objects, T object) {
    T[] temp = null;
    temp = Arrays.copyOf(objects, objects.length + 1);
    temp[objects.length] = object;
    return temp;
}

Which of these method is more efficient in terms of speed and memory management and why?

Additional Information

This method is used in runtime if and when the need arises to add a new item and not just iterate over a loop to populate an array. Also noteworthy is that the size of the array would never exceed 5. Given this case would it be better to use this custom method instead of ArrayList?

Blip
  • 3,061
  • 5
  • 22
  • 50

3 Answers3

3

ArrayList.add() will be less memory efficient (in some ways) but faster than your implementation. The reason is that ArrayList.add() does not increase the size of the underlying array with every add operation. It only increases the size sometimes and it increases it by more than 1.

Specifically the array size is increased with the following formula when it runs out of space:

int newCapacity = (oldCapacity * 3)/2 + 1;

This increases the size of the array by roughly 1/2 when it is full.

This means that in most cases ArrayList.add() just puts an item in the array and only sometimes does the more expensive operation of increasing the size of the array. The downside of this is that ArrayList allocates slightly more space than is needed at any one time and this is more than worth it because of the relative expense of increasing the size of the array with every operation.

bhspencer
  • 13,086
  • 5
  • 35
  • 44
  • Pretty sure that old*3/2+1 is much closer to 50% increase than to 33% – Diego Martinoia Jun 22 '15 at 14:08
  • I am using this `addItemToArray` to dynamically add items in runtime when there is a need. Also the size of the array will never be greater than 5. In this case am I better off using my custom function? – Blip Jun 22 '15 at 15:39
  • If you know the size of the array is never greater than 5 I would recommended just to allocate an array of size 5 at construction and leave it at that. The array only contains the pointer to the object and so the array its self is small. – bhspencer Jun 22 '15 at 15:52
2

I am going to slightly disagree with some of the other answers, and say that ArrayList.add() will clearly be both more memory efficient and faster.

ArrayList will in fact create an array that is a bit bigger in size than what it needs intially, seemingly being a little less memory efficient. But it will do its best to avoid having to create many unnecessary array instances for every add operation. This is both faster and more memory efficient than your implementation.

In your implementation, for every add, you are creating a whole new array instance. This is not memory efficient, and will add more work for the garbage collector.

EDIT:

Here is an example that illustrates what I mean by the ArrayList being more memory efficient as well.

Let's say, we try to add 5 items to a list using both methods.

Method 1: Using built-in ArrayList

ArrayList<String> list = new ArrayList<>(); // size of internal array: 10.
list.add("a"); // size of internal array: still 10.
list.add("b"); // size of internal array: still 10.
list.add("c"); // size of internal array: still 10.
list.add("d"); // size of internal array: still 10.
list.add("e"); // size of internal array: still 10.

In total, a single array of size 10 was created.

Method 2: Custom implementation of add using normal arrays.

String[] list = new String[0]; // 1 array of size 0.
list = addItemToArray(list, "a"); // 2nd array instance of size 1.
list = addItemToArray(list, "b"); // 3rd array instance of size 2.
list = addItemToArray(list, "c"); // 4th array instance of size 3.
list = addItemToArray(list, "d"); // 5th array instance of size 4.
list = addItemToArray(list, "e"); // 6th array instance of size 5.

In total, we've created 6 arrays, of a total combined size of 15, not counting the memory overhead associated to each created object.

So, as can be seen, it doesn't take long before using the custom implementation starts costing more in memory.

That's why it's better to trust the built-in ArrayList implementation for both memory and speed, rather than our own.

The exception to that would be if you don't plan on calling add very often, or if you know the size that you need your array to be in advance. But even then, you can simply instantiate the ArrayList by passing in a more specific initialCapacity to its constructor to fit your specific needs.

sstan
  • 35,425
  • 6
  • 48
  • 66
  • sstan you are right about creating multiple instances, but I'd say that both our other answers were considering memory usage at the end of any cleanup operations, thus disregarding the GC work – Diego Martinoia Jun 22 '15 at 14:06
  • @Diego: Actually, even disregarding GC work, with his implementation, you will have more memory being used by all the arrays that he creates in every add operation. All it takes is a few adds, and he will easily have used more memory than what the ArrayList will have. – sstan Jun 22 '15 at 14:08
  • Sure, but again, if you only consider the memory usage "at steady state", that will be more. Profiling for any transitory state becomes too dependent on the profile of the operations' sequence. – Diego Martinoia Jun 22 '15 at 14:10
  • @Diego: Not sure I understood your last point. All I'm saying is that I'm being realistic: I don't know when the GC runs, nor should I rely on knowing. So I think that it is a lot more likely that his implementation will cause more memory usage at any given time than that of the ArrayList's. The only exception would be if he already knows in advance how big his array should be, or if he never calls his `add` method. – sstan Jun 22 '15 at 14:13
  • You are right on this. What I'm trying to say is that you should not worry about the GC, and make your estimate on "steady-state" behaviors: i.e. what is the memory usage at the end of a given sequence of operations AFTER all that should happen(GC included) has settled? Trying to estimate on every intermediate state(i.e. during/before GC runs) includes too much complexity for an estimate to be reasonable. Where do you stop then? do you consider also load/store in swap memory/disk? CPU interrupts? too much stuff :) – Diego Martinoia Jun 22 '15 at 14:27
  • @Diego: I see your point, and I do appreciate your feedback! However, I feel that this question is very similar to why we recommend using `StringBuilder` over concatenating strings directly in a loop. It's not just a matter of speed. We also consider it best practice because we want to avoid the memory overhead of creating a bunch of smaller intermediate strings, even if we know that eventually, the GC will clean them up. – sstan Jun 22 '15 at 14:40
  • I am using this `addItemToArray` to dynamically add items in runtime when there is a need. Also the size of the array will never be greater than 5. In this case am I better off using my custom function? – Blip Jun 22 '15 at 15:39
  • Sounds like small potatoes. Performance and efficiency are not really a concern with such small objects. That being the case, I would still go with the built-in `ArrayList`, if anything to keep things more readable and standard for anyone else looking at the code. I wouldn't want to reinvent the wheel in this case. – sstan Jun 22 '15 at 15:44
0

ArrayList works by having a fixed size array, and "scaling up" exponentially, i.e. if you have to add one item and your currenty backend array is full, you don't copy it to an array of size +1, but to one of size *1.5. This is done in order to minimize the number of copy operations, which are the "expensive" ones, at the cost of possibly wasting memory.

ArrayList also has a nice method called .ensureCapacity(int n) that preallocates your array to a certain minimum size. You can call this before of a long series of .add operation to pay the cost of the copy only once, rather than multiple times during your insertions.

In other words: for any significant number of insertion, ArrayList will perform better in time and worse in memory than your implementation.

Diego Martinoia
  • 4,592
  • 1
  • 17
  • 36
  • I am using this `addItemToArray` to dynamically add items in runtime when there is a need. Also the size of the array will never be greater than 5. In this case am I better off using my custom function? – Blip Jun 22 '15 at 15:28