Performance issue - clear and reuse a collection OR throw it and get a new one

Question

Say we try to implement a merge sort algorithm, given an Array of Arrays to merge what is a better approach, this:

public void merge(ArrayList<ArrayList<E>> a) {
    ArrayList<ArrayList<E>> tmp = new ArrayList<ArrayList<E>>() ;
    while (a.size()>1) {
        for (int i=1; i<a.size();i+=2) {
            tmp.add(merge(a.get(i-1),a.get(i)));
        }
        if (a.size()%2==1) tmp.add(a.get(a.size()-1));
        a = tmp;
        tmp = new ArrayList<ArrayList<E>>() ;
    }
}

or this :

public void merge(ArrayList<ArrayList<E>> a) {
    ArrayList<ArrayList<E>> tmp = new ArrayList<ArrayList<E>>(),tmp2  ;
    while (a.size()>1) {
        for (int i=1; i<a.size();i+=2) {
            tmp.add(merge(a.get(i-1),a.get(i)));
        }
        if (a.size()%2==1) tmp.add(a.get(a.size()-1));
        tmp2 = a;
        a = tmp;
        tmp = tmp2;
        tmp.clear();
    }
}

to make it clearer, what i was doing is to merge each couple of neighbors in a and put the resulting merged arrays in an external Array of Arrays tmp, after merging all couples, one approach is to clear a and then move tmp to a, and then move the cleared a to tmp. second approach is to "throw" old tmp and get a new tmp instead of reusing the old one.

Guessing at any piece of code about it's performance is invalid. Run a profiler and determine where the real performance problem is. — Andrew T Finnell, Apr 17 '12 at 19:51

score 6 · Answer 1 · answered Apr 17 '12 at 19:45

6

As a general rule, don't spend energy trying to reuse old collections; it just makes your code harder to read (and frequently doesn't give you any actual benefit). Only try optimizations like these if you already have your code working, and you have hard numbers that say the speed of your algorithm is improved.

answered Apr 17 '12 at 19:45

Louis Wasserman

191,574
25
345
413

1

It looks like this was downvoted? Louis is right. One shouldn't try to pre-optimize based on no hard evidence. Write the code to make sense first, then optimize if an area show's a problem. – Andrew T Finnell Apr 17 '12 at 19:58
He didn't ask about if he should optimize this code. And, yes, premature optimization is the root of all evil, but there are many cases in Java where you can identify possible performance issues before profiling them. – Neet Apr 17 '12 at 20:01
In that case this should be closed as Too Localized. The answers being given aren't being proved through an objective means and the answers are unlikely to help anyone else. Rather it'll encourage complicated code over clean code bases on zero evidence. – Andrew T Finnell Apr 17 '12 at 20:06
1

It'll vary from case to case, possibly machine to machine, and definitely JVM release to JVM release. The only way to answer this question is with hard numbers from benchmarks on the machines this code will run on, and if you don't have that, then you should default to creating new collections, as I discussed. – Louis Wasserman Apr 17 '12 at 21:29

Neet · Answer 2 · 2012-04-17T19:54:33.830

2

Always allocating a new ArrayList and filling it, will result in more garbage collections which generally slows down everything (minor GCs are cheap but not free).
Reusing the ArrayList will result in less Arrays.copyOf() which is used when the array inside the ArrayList needs to be resized (resizing is cheap but not free).

On the other hand: clear() will also nullify the array content to allow the GC to collect unused object which is of course also not free.

Still, if execution speed is concerned, I would reuse the ArrayList.

edited Apr 17 '12 at 19:54

answered Apr 17 '12 at 19:48

Neet

3,937
15
18

Can you back up your assertions in the first bullet? What leads you to think this will result in more GCs? – Alex Feinman Aug 14 '13 at 13:19

Performance issue - clear and reuse a collection OR throw it and get a new one

2 Answers2

Linked

Related