I was wondering if anyone has benchmarked Apache CollectionUtils. In my simple benchmark:

    List<Integer> ints = Arrays.asList(3, 4, 6, 7, 8, 0, 9, 2, 5, 2, 1, 35, 11, 44, 5, 1, 2);
    long start = System.nanoTime();
    ArrayList<Integer> filtered = new ArrayList<Integer>(ints.size());
    for (Integer anInt : ints) {
        if (anInt > 10) {
            filtered.add(anInt);
        }
    }
    long end = System.nanoTime();
    System.out.println(filtered + " (" + (end - start) + ")");

    Predicate<Integer> predicate = new Predicate<Integer>() {
        @Override
        public boolean evaluate(Integer integer) {
            return integer > 10;
        }
    };
    start = System.nanoTime();
    filtered.clear();
    CollectionUtils.select(ints, predicate, filtered);
    end = System.nanoTime();
    System.out.println(filtered + " (" + (end - start) + ")");

I got the following results:

[35, 11, 44] (127643)
[35, 11, 44] (3060230)

I must say I'm a big fan of this library because it makes the code clean and testable, but I'm currently working on a performance-sensitive project and I'm afraid my affection for this library is going to harm performance.

I know this is a really general question, but has anyone used this library in a production environment and noticed performance issues?

Noam Shaish
  • You should run it more than once. With a single run you won't see the effect of JVM optimizations, because the JVM is still optimizing the code at that point. Beyond that, the difference is simply that your code runs the check directly, while CollectionUtils evaluates a predicate for each element. – NoDataFound Aug 21 '14 at 13:23
  • @NoDataFound In that case (evaluation), using a Predicate, Transformer or Closure will always be slower than writing the code directly, no matter how many times I run it. – Noam Shaish Aug 21 '14 at 13:26
  • possible duplicate of [How do I write a correct micro-benchmark in Java?](http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) – Joe Aug 21 '14 at 13:33
  • @Joe How is it a duplicate? The question is not specifically about the benchmark I made; it is about the general performance of Apache's CollectionUtils. You might mark it as lazy developer :) since I'm trying to find someone who has already done this benchmark more professionally than I might. And since this is a very common Apache library, I think it's an important benchmark to share. – Noam Shaish Aug 24 '14 at 07:09

2 Answers

Apart from running it multiple times to account for JVM optimization (I don't know whether the JVM could use the invokedynamic bytecode instruction introduced in Java 7 here, given that Predicate could be a functional interface), I think your error lies right after the start:

    start = System.nanoTime();
    filtered.clear();
    CollectionUtils.select(ints, predicate, filtered);
    end = System.nanoTime();
    System.out.println(filtered + " (" + (end - start) + ")");

I don't think you should include the time filtered.clear() takes to do its work if you want to measure the difference between CollectionUtils and a plain old for-each loop.
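A minimal sketch of that idea, assuming a hand-rolled timing helper rather than a real benchmark harness: the setup step (clearing the target list) runs outside the timed region, and the measurement is repeated after a warm-up pass so the JVM has had a chance to optimize. The class name `WarmTiming`, the helper `averageNanos` and the round count of 20000 are illustrative choices, not part of the question or the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WarmTiming {

    // Hypothetical helper: runs setup untimed, then times only work, averaged over rounds.
    static long averageNanos(Runnable setup, Runnable work, int rounds) {
        long total = 0;
        for (int i = 0; i < rounds; i++) {
            setup.run();                      // e.g. clearing the target list; not timed
            long start = System.nanoTime();
            work.run();                       // only the operation under comparison
            total += System.nanoTime() - start;
        }
        return total / rounds;
    }

    public static void main(String[] args) {
        final List<Integer> ints =
                Arrays.asList(3, 4, 6, 7, 8, 0, 9, 2, 5, 2, 1, 35, 11, 44, 5, 1, 2);
        final List<Integer> filtered = new ArrayList<Integer>(ints.size());

        Runnable clearTarget = new Runnable() {
            public void run() {
                filtered.clear();
            }
        };
        Runnable plainLoop = new Runnable() {
            public void run() {
                for (Integer anInt : ints) {
                    if (anInt > 10) {
                        filtered.add(anInt);
                    }
                }
            }
        };

        // Warm-up pass: the result is discarded on purpose.
        averageNanos(clearTarget, plainLoop, 20000);
        // Measured pass.
        long nanos = averageNanos(clearTarget, plainLoop, 20000);
        System.out.println(filtered + " (avg " + nanos + " ns)");
    }
}
```

The same helper could wrap the CollectionUtils.select call in a second Runnable so that both variants are timed under identical conditions. Even then, numbers from a loop like this are only a rough indication.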

NoDataFound
  • Good point, but then I should also avoid creating the ArrayList in the first loop. My assumption was that clearing a list and creating a new instance should cost about the same (or that creating a new one might even take longer). – Noam Shaish Aug 24 '14 at 07:05
  • My point is that you are comparing two implementations of the same function, but you are timing operations that should not be included. So yes, the `new ArrayList()` should not be counted either, because `select` uses an already existing list. – NoDataFound Aug 24 '14 at 10:44

Well, you are basically comparing method invocation overhead with inline code, and the latter is obviously faster.

As long as you are not doing something that really challenges your CPU, I would be very surprised if this were the cause of performance problems in your application.
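To make the comparison concrete, here is a simplified stand-in for a select-style helper next to the inline loop. This is a sketch, not the Apache source: the nested `Predicate` interface and the `select` method are hypothetical copies of the shape used in the question, written so the example needs no dependency. The only structural difference is one interface call per element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class SelectShape {

    // Hypothetical stand-in for the library's Predicate interface.
    interface Predicate<T> {
        boolean evaluate(T value);
    }

    // Select-style helper: the condition is reached through predicate.evaluate(...).
    static <T> void select(Iterable<T> input, Predicate<? super T> predicate,
                           Collection<? super T> output) {
        for (T item : input) {
            if (predicate.evaluate(item)) {
                output.add(item);
            }
        }
    }

    // Inline version: the condition is written directly in the loop body.
    static List<Integer> filterInline(List<Integer> input) {
        List<Integer> output = new ArrayList<Integer>(input.size());
        for (Integer item : input) {
            if (item > 10) {
                output.add(item);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<Integer> ints =
                Arrays.asList(3, 4, 6, 7, 8, 0, 9, 2, 5, 2, 1, 35, 11, 44, 5, 1, 2);

        List<Integer> viaPredicate = new ArrayList<Integer>(ints.size());
        select(ints, new Predicate<Integer>() {
            public boolean evaluate(Integer value) {
                return value > 10;
            }
        }, viaPredicate);

        // Both approaches select the same elements.
        System.out.println(viaPredicate);
        System.out.println(filterInline(ints));
    }
}
```

Whether the extra call stays measurable after the JVM has compiled and possibly inlined it depends on the workload, which is why the cost tends to matter only in very hot loops.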

T. Neidhart