12

I have a requirement to check whether there is a common element in two lists. I came up with two ways to do this:

Method 01 : Loops

private boolean func01 (List<String> list1, List<String> list2) {
    for (String group : list1) {
        for (String funcGroup : list2) {
            if (group.equals(funcGroup)) {
                return true;
            }
        }
    }
    return false;
}

Method 02 : Lambda

private boolean func02 (List<String> list1, List<String> list2) {
    return list1.stream().filter(list2::contains).findAny().isPresent();
}

In my view, I find the first method more readable. What I need to understand is, whether there are any differences or advantages when comparing these two methods?

TT.
  • 15,774
  • 6
  • 47
  • 88
ycr
  • 12,828
  • 2
  • 25
  • 45
  • 5
    The first version is going to be more efficient in general, but the difference won't necessarily be that big. But you can also simplify the first one by replacing the inner loop with `if (list2.contains(group)) { return true; } ` – Louis Wasserman Apr 04 '18 at 03:59
  • 2
    `return list1.stream().anyMatch(list2::contains);` seems okay. The actual problem is that using `Set` instead of `List` would be worthwhile _hugely more_. To the extend that creating a set filled by the list would be worth the costs. – Joop Eggen Apr 04 '18 at 08:31

3 Answers3

4

method 1 optimization:

you don't need 2 loops and you can return immediately when you have a match, so you stop the traversing of the list at that point - (e.g. sunshine case you get a match on first element - you have 1 iteration, worst case your match is the last element - you have to traverse the whole list to hit that match)

private boolean func01 (List<String> list1, List<String> list2) {
        for (String group : list1) {
            if (list2.contains(group)) return true;
        }

        return false;
    }

lambda equivalent optimization:

  • findAny().isPresent() - get an optional of an element matching the predicate and check if the Optional is present - this is equivalent of anyMatch() because both expressions return boolean

  • filter() will always traverse the whole list

  • anyMatch() has a short-circuit behavior - means it will stop on the first match

so you can re-write it as:

private boolean func02 (List<String> list1, List<String> list2) {
  return list1.stream().anyMatch(list2::contains);
}

To answer your question - there's no significant performance difference in the two approaches, but take into consideration:

  • creating stream for a collection has a slight overhead

  • stream operations can be run in parallel (list1.stream().parallel().anyMatch(list2::contains)). For example in this case anyMatch() running in parallel threads over the same stream will periodically check if the previous threads found a match and will stop traversing the collection and not continue traversing the whole collection. So in theory for significantly large input lists, you should get better results with parallel streams.

hovanessyan
  • 30,580
  • 6
  • 55
  • 83
  • that contains is still a for-each internally though with equals... things are more interesting if you would use something hash based, where contains would be preferred instead of the loop again... – Eugene Apr 04 '18 at 10:02
  • Yes, contains internally uses for-loop, but it's far more readable to use list.contains in comparison with in-lining the internals of contains in your own code, right? – hovanessyan Apr 04 '18 at 10:10
  • as I said if you are comparing something that is hash based, contains is better. In your specific case, I would agree – Eugene Apr 04 '18 at 10:12
4

To answer your direct question, if you want to know whether there's a performance gain by using lambda expressions, you should create a microbenchmark and measure.

However, I would like to point out that your 2nd solution not only uses a lambda expression (actually, a method reference), but also a Stream. In general, stream-based solutions take longer, due to all the infraestructure needed to run the stream pipeline. However, it also happens that, in most cases, these solutions scale much better.

Now, as to your specific problem, the best way to check whether there is a common element in two lists, is by using the Collections.disjoint method, available since Java 1.5:

return !Collections.disjoint(list1, list2);
fps
  • 33,623
  • 8
  • 55
  • 110
2

You can find a very related comparison here for example.

The performance is going to be irrelevant once the code is hot enough (the first would win though)

Considering which is easier to read - your opinion is the first one, mine is the second. Once you get around with streams in java-8 there is no turning back, literally.

Besides, there are multiple cases where doing things with stream-api is far more readable, being able to parallelize. Also, streams have internally different optimizations that make them very good. These optimizations would have to be done by you mainly

Eugene
  • 117,005
  • 15
  • 201
  • 306