1

I was playing with Java * Stream API and I had Following code in the Lagecy system:

Iterator itr = configMap.entrySet().iterator();
        List<String> resultList = new ArrayList<String>();
        while(itr.hasNext()){
            Entry e = (Entry)itr.next();
            Config rc = (Config)e.getValue();

            if (rc != null && rc.getVisibleTo() != null && (rc.getVisibleTo().equalsIgnoreCase("0010") || rc.getVisibleTo().equalsIgnoreCase("0111") || rc.getVisibleTo().equalsIgnoreCase("0011"))){        
                resultList .add(rc.getName());
            }
        }

I wrote the Stream Equivalent of above code as below:

List<String> resultList = configMap.entrySet()
                   .parallelStream()
                   .map(r -> r.getValue())
                   .filter(r -> r.getVisibleTo() != null)
                   .filter(r -> {return 
                              r.getVisibleTo().equalsIgnoreCase("0010")
                           || r.getVisibleTo().equalsIgnoreCase("0111")
                           || r.getVisibleTo().equalsIgnoreCase("0011");
                        })
                   .map(r -> r.getName())
                   .collect(Collectors.toList());

Both the way I am getting the desired results. My question is which performance wise is better way of writing in this case? Am I actually gaining any value if I chose either one over another? The map has around 1000 values in it.

Prerak Tiwari
  • 3,436
  • 4
  • 34
  • 64
  • 4
    This looks like opinion based question. Some will say a is good. Some will say b is good. It may create an argument between people and affect this site. Also it may get you downvotes(not from me). So try edit your question – Sagar V Mar 15 '17 at 14:45
  • I would for readability create a separate predicate from .filter(r -> r.getVisibleTo() != null) .filter(r -> {return r.getVisibleTo().equalsIgnoreCase("0010") || r.getVisibleTo().equalsIgnoreCase("0111") || r.getVisibleTo().equalsIgnoreCase("0011"); }) – Viktor Mellgren Mar 15 '17 at 14:46
  • 1
    I would go with the more readable approach (i.e. the stream approach) – Svetlin Zarev Mar 15 '17 at 15:24
  • I really like this use of `equalsIgnoreCase`. You never know, perhaps there are lowercase `0`s or uppercase `1`s somewhere out there… – Holger Mar 16 '17 at 11:14
  • 1
    You are better off focusing on writing code that is clear, obvious, and maintainable, rather than on micro-performance issues. In the vast majority of cases, the obvious code is also well fast enough. – Brian Goetz Mar 16 '17 at 13:19

3 Answers3

2

First of all, if performance is your concern, you should decide based on facts, so measure, don't guess (i.e. using JMH).

For simple cases, iteration or classic loops tend to be faster than Streams. On the other hand, streams can be much faster, and can often be further accelerated by parallelism (where the workload is easily parallelizable). In most cases it doesn't really matter (in my experience) as either the dataset is too small and/or the code where it's used is not performance cricital enough.

Regarding readability/maintainabilty I tend to prefer the stream/functional approach for better separation of concerns. But a for-each loop might improve your iterative code as well. So there is no simple yes/no answer, it all depends on the context.

Brian Goetz
  • 90,105
  • 23
  • 150
  • 161
Gerald Mücke
  • 10,724
  • 2
  • 50
  • 67
  • 1
    With enough data, streams are often faster than iteration (even in sequential execution). The iteration protocol used by streams (`Spliterator`) has far lower per-element overhead than `Iterator`. – Brian Goetz Mar 16 '17 at 13:17
2

Since you wrote this as a parallel stream, which attempts to multi thread and divide the work, you are likely to see some performance degradation from overhead managing the threads, given that your job is a bit small. Another trap here is that since you are using the CommonForkJoinPool (the default thread pool for this), this is not a great choice as blocking that up with tasks can degrade performance for the entire app (which, while not terribly relevant for the small size of your job, is a thing to keep in mind).

That being said, were you to use the single-threaded configMap.entrySet().stream() instead, I would say the performance difference might be very slightly slower, but the readability more than makes up for it. So the choice becomes a decision between slightly faster code or better readability, which is entirely up to you (but I'd pick the stream, personally).

Brandon McKenzie
  • 1,655
  • 11
  • 26
2

I would just slightly rewrite sream example:

List<String> resultList = configMap.entrySet()
        .parallelStream()
        .map(Map.Entry::getValue)
        .filter(Objects::nonNull) // value might be a null
        .filter(r ->
                ((Predicate<String>) "0010"::equals)
                        .or("0111"::equals) // null-save equivalent to  "0111".equals(value)
                        .or("0011"::equals)
                        .test(r.getVisibleTo())
        )
        .map(Config::getName)
        .collect(Collectors.toList());

and will use it. For detailed performance explanation please see these question Java 8: performance of Streams vs Collections and Iterator versus Stream of Java 8

as suggested in comments below by @4castle, we can make it even shorter by using configMap.values() instead of configMap.entrySet() :

List<String> resultList = configMap.values()
        .parallelStream()
        .filter(Objects::nonNull)
        .filter(r ->
                ((Predicate<String>) "0010"::equals)
                        .or("0111"::equals) // null-save equivalent to  "0111".equals(value)
                        .or("0011"::equals)
                        .test(r.getVisibleTo())
        )
        .map(Config::getName)
        .collect(Collectors.toList());
Community
  • 1
  • 1
Anton Balaniuc
  • 10,889
  • 1
  • 35
  • 53
  • 1
    The null checks also aren't being handled properly. Perhaps it would be simpler to put them in a separate `.filter`. – 4castle Mar 15 '17 at 15:51
  • 1
    @4castle, do you mean an additional filter to check if `map.value` is a `null`? – Anton Balaniuc Mar 15 '17 at 15:56
  • 1
    Yes, and also the null check for `.getVisibleTo()`, because it's using `or` instead of `and`. – 4castle Mar 15 '17 at 15:59
  • 1
    @4castle, yes. You are perfectly right. I will update my answer. Thanks a lot for spotting this. I guess we don't really need to check if `getVisibleTo` is `null`, since `"0010"::equals` is `null-safe` comparison. We only need to check if a value itself is not null. – Anton Balaniuc Mar 15 '17 at 16:06
  • It’s great to mention that you can make the code simpler by streaming over `configMap.values()` in the first place, but repeating the original code without any change doesn’t really illustrate the point. Besides that, there is no reason to insert a `.filter(Objects::nonNull)` into the chain that doesn’t exist in the original code. That’s misleading at best. – Holger Mar 16 '17 at 11:19
  • @Holger, Thanks a lot for you comment. There is `null` check in the `iterator` version: `if (rc != null && rc.getVisibleTo() != null` so `rc` which is a `map.value` might be null itself. However it is not included in the stream version in the original question: `.map(r -> r.getValue()) .filter(r -> r.getVisibleTo() != null)` – Anton Balaniuc Mar 16 '17 at 12:22
  • @Holger, I am sorry, I updated my answer, I probably copy-paste it by mistake. – Anton Balaniuc Mar 16 '17 at 13:41