First things first, a relevant extract from the documentation of Collectors.toList():
[...]There are no guarantees on the type, mutability, serializability, or thread-safety of the List returned; if more control over the returned List is required, use toCollection(Supplier)
Now, let us look a little more deeply into a collector's characteristics; we find this:
public static final Collector.Characteristics CONCURRENT
Indicates that this collector is concurrent, meaning that the result container can support the accumulator function being called concurrently with the same result container from multiple threads.
If a CONCURRENT collector is not also UNORDERED, then it should only be evaluated concurrently if applied to an unordered data source.
Now, nothing guarantees that the collector returned by Collectors.toList() is CONCURRENT at all.
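To check rather than guess, you can inspect a collector's characteristics at runtime. The following is a minimal sketch; the class name CharacteristicsCheck and the helpers isConcurrent and lengthsByWord are made up for illustration, and what it reports for Collectors.toList() depends on the JDK you run it on, since the documentation promises nothing either way.

```java
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CharacteristicsCheck {
    // Hypothetical helper: true if the collector declares itself CONCURRENT.
    static boolean isConcurrent(Collector<?, ?, ?> collector) {
        return collector.characteristics()
                .contains(Collector.Characteristics.CONCURRENT);
    }

    // A collector that is documented as concurrent, used for comparison.
    static Collector<String, ?, ConcurrentMap<String, Integer>> lengthsByWord() {
        return Collectors.<String, String, Integer>toConcurrentMap(
                word -> word, String::length);
    }

    public static void main(String[] args) {
        Collector<String, ?, List<String>> toList = Collectors.toList();
        // The result for toList() is implementation-specific.
        System.out.println("toList(): " + toList.characteristics()
                + ", concurrent: " + isConcurrent(toList));
        System.out.println("toConcurrentMap(): "
                + lengthsByWord().characteristics()
                + ", concurrent: " + isConcurrent(lengthsByWord()));
    }
}
```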
Leaving aside the time it may take to instantiate your own class, the safe bet here is to assume that this collector is not concurrent. Fortunately, we can supply a concurrent collection ourselves, as the javadoc suggests. So, let's try:
    // The element type must match the stream being collected (MyClass below)
    .collect(
        Collector.of(CopyOnWriteArrayList::new,
            List::add,
            (o, o2) -> { o.addAll(o2); return o; },
            Function.<List<MyClass>>identity(),
            Collector.Characteristics.CONCURRENT,
            Collector.Characteristics.IDENTITY_FINISH
        )
    )
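This may speed things up, but measure it: as quoted above, a CONCURRENT collector that is not also UNORDERED is only evaluated concurrently on an unordered source, and CopyOnWriteArrayList copies its backing array on every add, which gets expensive for large inputs.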
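For reference, here is a self-contained sketch of the same idea that compiles on its own. The names ConcurrentListCollector, toConcurrentList and squares are invented for this example, and the source is a numeric range rather than a file. It calls unordered() on the stream because of the rule quoted earlier: a CONCURRENT collector that is not UNORDERED should only be evaluated concurrently on an unordered source.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.Collector;
import java.util.stream.IntStream;

class ConcurrentListCollector {
    // Hypothetical helper: a generic collector into a thread-safe list.
    // The three-function Collector.of overload implies IDENTITY_FINISH.
    static <T> Collector<T, List<T>, List<T>> toConcurrentList() {
        return Collector.of(
                CopyOnWriteArrayList::new,
                List::add,
                (left, right) -> { left.addAll(right); return left; },
                Collector.Characteristics.CONCURRENT);
    }

    // Collects the squares of 0..n-1 in parallel; element order is unspecified.
    static List<Integer> squares(int n) {
        return IntStream.range(0, n)
                .parallel()
                .unordered() // lets the stream use the concurrent path
                .boxed()
                .map(i -> i * i)
                .collect(toConcurrentList());
    }

    public static void main(String[] args) {
        System.out.println(squares(1000).size()); // prints 1000
    }
}
```

Because every thread adds to one shared list, the resulting order is not the encounter order. If the order of lines matters, this approach is not a drop-in replacement for toList().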
Now, you have another problem: you do not close your stream.
This is little known, but a Stream (of any element type, and likewise an {Int,Double,Long}Stream) implements AutoCloseable. You want to close streams which are I/O bound, and Files.lines() returns such a stream.
So, try this:
    final List<MyClass> list;
    try (
        final Stream<String> lines = Files.lines(...);
    ) {
        list = lines.parallel().map(MyClass::new)
            .collect(seeAbove);
    }
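If you want to convince yourself that the stream really gets closed, the sketch below may help. It makes several assumptions: Line is a hypothetical stand-in for MyClass, the input is a temporary file created by the demo itself, and plain Collectors.toList() is used so that the example stays independent of any custom collector. An onClose handler records that close() ran when the try block exited.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CloseLinesDemo {
    // Hypothetical stand-in for MyClass: wraps one line of text.
    static final class Line {
        final String text;
        Line(String text) { this.text = text; }
    }

    // Reads all lines; the flag is set by the stream's close handler.
    static List<Line> readAll(Path file, AtomicBoolean closed) throws IOException {
        try (Stream<String> lines =
                Files.lines(file).onClose(() -> closed.set(true))) {
            return lines.parallel().map(Line::new).collect(Collectors.toList());
        }
    }

    // Writes three lines to a temporary file, reads them back, and reports.
    static String runDemo() {
        try {
            Path tmp = Files.createTempFile("lines", ".txt");
            try {
                Files.write(tmp, Arrays.asList("a", "b", "c"));
                AtomicBoolean closed = new AtomicBoolean(false);
                List<Line> result = readAll(tmp, closed);
                return result.size() + " lines read, stream closed: " + closed.get();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints "3 lines read, stream closed: true"
    }
}
```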