0

I have a algorithm to build the intersection of two sorted lists. If I compare it with java.util.BitSet in a performance test, my algorithm is slow.

    public static List<Integer> intersection(List<Integer> list1, List<Integer> list2) {
            int size1 = list1.size(), size2 = list2.size();
            int capacity = size1 < size2 ? size1 : size2;
            List<Integer> intersection = new ArrayList<Integer>(capacity);
            int i1 = 0, i2 = 0;
            while (i1 < size1 && i2 < size2) {
                if (list1.get(i1) < list2.get(i2))
                    i1++;
                else if (list2.get(i2) < list1.get(i1))
                    i2++;
                else {
                    intersection.add(list2.get(i2++));
                    i1++;
                }
            }
            return intersection;
        }

Anyone sees any improvement?

Thanks

myborobudur
  • 4,385
  • 8
  • 40
  • 61
  • 1
    Use `BitSet`, or even `int[]`. Specifying a capacity for the `ArrayList` is probably a mistake. Probably doesn't need to be that large. Also you only need two `get`s per loop. Even that can be reduced if you only `get` on an associated increment. You may want to consider using `Iterator`s, particularly if you are given a `LinkedList` as an argument. – Tom Hawtin - tackline Mar 26 '13 at 10:02

2 Answers2

1

Are the inputs to your function always of type ArrayList?

  • If they are, algorithmically there is nothing wrong with your method. I'd make two changes:
    1. I'd change the argument types to ArrayList<Integer> list1, ArrayList<Integer> list2;
    2. I'd only call list1.get(i1) and list2.get(i2) once. This may or may not make any difference to performance, but stylistically I prefer to factor this out.
  • If you need to support any list, then I'd rewrite the function in terms of two iterators, since calling get(index) could be very expensive.

Finally, when testing performance, make sure to follow the advice given in How do I write a correct micro-benchmark in Java?

Community
  • 1
  • 1
NPE
  • 486,780
  • 108
  • 951
  • 1,012
0

You should know that this:

List<Integer> intersection = new ArrayList<Integer>(capacity);

allocates an internal array of size capacity.

Suppose list1.size() == 5000, and list2.size() == 5000, and intersection(list1, list2).size() == 3, the method would allocate 4997 useless integers.

Try using a reasonable capacity (depending on the use of the method) or just leave it to the default value (which is 10).

(Bare in mind that the complexity of allocating an array of size n (or an ArrayList of capacity n) is O(n).)

Ahmed KRAIEM
  • 10,267
  • 4
  • 30
  • 33