It is perfectly safe to use parallelStream() to collect into a HashMap. However, it is not safe to combine parallelStream(), forEach and a consumer that adds entries to a HashMap.
HashMap is not a synchronized class, and trying to put elements into it concurrently will not work properly: updates can be silently lost and the map can be left in an inconsistent internal state. That is exactly what forEach does here: it invokes the given consumer, which puts elements into the HashMap, from multiple threads, possibly at the same time. If you want a simple piece of code demonstrating the issue:
List<Integer> list = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
Map<Integer, Integer> map = new HashMap<>();

// Unsafe: the consumer mutates the non-synchronized HashMap from several worker threads
list.parallelStream().forEach(i -> map.put(i, i));

System.out.println(list.size());
System.out.println(map.size());
Make sure to run it a couple of times. There's a very good chance (the joy of concurrency) that the printed map size after the operation is not 10000, which is the size of the list, but slightly less.
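If you want to watch that nondeterminism directly, here is a minimal sketch (not part of the original snippet, just an illustration reusing the same setup) that repeats the unsafe run a few times and prints the resulting sizes:

import java.util.*;
import java.util.stream.*;

// Repeat the unsafe experiment several times; the reported size typically varies between runs
for (int run = 1; run <= 5; run++) {
    List<Integer> list = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
    Map<Integer, Integer> map = new HashMap<>();
    list.parallelStream().forEach(i -> map.put(i, i)); // racy puts from the common fork-join pool
    System.out.println("run " + run + ": map size = " + map.size());
}

More often than not, at least one run reports a size below 10000, which is exactly the symptom described above.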
The solution here, as always, is not to use forEach, but to use a mutable reduction approach with the collect method and the built-in toMap collector:
Map<Integer, Integer> map = list.parallelStream().collect(Collectors.toMap(i -> i, i -> i));
Use that line of code in the sample code above, and you can rest assured that the map size will always be 10000. The Stream API ensures that it is safe to collect into a non-thread-safe container, even in parallel. This also means you don't need toConcurrentMap to be safe; that collector is only needed if you specifically want a ConcurrentMap as the result rather than a general Map. As far as the thread safety of collect is concerned, you can use either.
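For completeness, here is a sketch of what toConcurrentMap looks like if you do specifically want a ConcurrentMap back (it reuses the list from the sample above); the only reason to pick it is the result type, not thread safety:

import java.util.concurrent.ConcurrentMap;

// Only needed because the result type must be a ConcurrentMap; collect is already thread-safe here
ConcurrentMap<Integer, Integer> concurrentMap =
        list.parallelStream().collect(Collectors.toConcurrentMap(i -> i, i -> i));
System.out.println(concurrentMap.size()); // always 10000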