3

Say I have to read out data, which can be either 1 object (majority of time) or multiple objects (some of the time).

If I do:

List list = new ArrayList<Object>(1);
... loop over the loaded object(s) and add it/them to the list...

This will serve me well the majority of times when there is only 1 object loaded from the database. But assuming the less common scenario where I have to expand my initial list, this will cause me to lose operations.

I assume this won't really make much of an impact in the real world, but I wonder how I could calculate the following:

Assume X% of my data is 1 object, and Y% is a list of multiple objects. Is there a way I can calculate the ideal initial capacity for my list, going for the least operations (through list expansions, allocated but unused fields in the list)?

arnehehe
  • 1,386
  • 1
  • 17
  • 33
  • Have you looked into `ArrayList`s `trimToSize()` method? I feel it might be useful for you. – Ogen Oct 30 '14 at 09:12
  • It is depend on how many objects in case of "multiple objects" – talex Oct 30 '14 at 09:19
  • `ArrayList` expands its size as it grows (in such a way that the expansion doesn't happen too often). It's "expensive" to grow the size of the dynamic array, but it's _not_ usually too expensive to waste memory (unless you are doing something that is very memory intensive). There is going to be a tradeoff no matter what. If the majority of your `ArrayList`'s are going to be of size 1 then this is the size you should initialize it to if your are concerned with memory. If memory is _not_ a concern, then you should initialize each `ArrayList` to the maximum possible size. – Jared Oct 30 '14 at 09:19
  • If memory is becoming a concern, the size of the `ArrayList`'s are probably _not_ going to be the bottleneck. – Jared Oct 30 '14 at 09:20
  • 1
    If you have a binary relation where x% will be 1 and y% will be N, then the expected length of your array will be 1*x% + N * y%. For example if 50% of your arrays will be of length 1 and 50% will be of length 10, then the expected length of each array will be 0.5 + 10 * 0.5 = 5.5 So you should set your `ArrayList` to either 5 or 6 (on average), but this will waste a lot of memory for virtually no computational gain. – Jared Oct 30 '14 at 09:25
  • 2
    It seems fine to init your ArrayList with 1 if X% is high. But if the average size in the Y group is high, you could use ensureCapacity(avgSize) on the second iteration to avoid too many array resize/copy. Starting at 1 and for 100 elements there will be 12 array copies and will result in an backing array length of 141 elements. – nomoa Oct 30 '14 at 09:46
  • @nomoa +1. I think your comment is the answer to the question. – leventov Oct 30 '14 at 11:17

1 Answers1

2

You dissociate your data into 2 groups X (1 element) and Y (more than one). You optimized your code for the X group because it is the most common case.

It's a good idea to initialize your ArrayList with one element so most of the time you won't waste any memory.

But if the members of the Y group have a high average size (and a small standard deviation) you can still optimize the worstcase with ensureCapacity(int cap). On the second iteration you can force to resize the ArrayList backing array to the average size of the Y group.

For a member of the Y group with 100 elements it will create/copy arrays 12 times and the backing array length will be 141 against 1 small array copy and no wasted memory if you implement the optimization.

Example of this optimization :

Iterator<Obj> it = // Get your iterator from your resource
ArrayList<Obj> result = new ArrayList<Obj>(1);
if(it.hasNext()) {
    result.add(it.next());
}
if(it.hasNext()) {
    result.ensureCapacity(100);// Avg size of the Y group
    while(it.hasNext()) {
        result.add(it.next());
    }
}

But unless it's a performance critical feature It's not worth the effort. Because to make sure this trick will optimize speed and memory you have to analyse the distribution of the size in the Y group.

It's not drectly related to your problem but it contains a lot of useful comments on ArrayList : When to use LinkedList over ArrayList?

Community
  • 1
  • 1
nomoa
  • 1,043
  • 6
  • 18