Since I use streams a great deal, some of them dealing with a large amount of data, I thought it would be a good idea to pre-allocate my collection-based collectors with an approximate size to prevent expensive reallocation as the collection grows. So I came up with this, and similar ones for other collection types:
    public static <T> Collector<T, ?, Set<T>> toSetSized(int initialCapacity) {
        return Collectors.toCollection(() -> new HashSet<>(initialCapacity));
    }
Used like this:
    Set<Foo> fooSet = myFooStream.collect(toSetSized(100000));
My concern is that the implementation of Collectors.toSet() sets a Characteristics enum value that Collectors.toCollection() does not: Characteristics.UNORDERED. There is no convenient variation of Collectors.toCollection() to set the desired characteristics beyond the default, and I can't copy the implementation of Collectors.toSet() because of visibility issues. So, to set the UNORDERED characteristic I'm forced to do something like this:
    static <T> Collector<T, ?, Set<T>> toSetSized(int initialCapacity) {
        return Collector.of(
            () -> new HashSet<>(initialCapacity),
            Set::add,
            (c1, c2) -> {
                c1.addAll(c2);
                return c1;
            },
            new Collector.Characteristics[]{IDENTITY_FINISH, UNORDERED});
    }
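The difference in reported characteristics can be checked directly. The following is only a minimal sketch: the class name SizedCollectors and the two method names are made up for illustration so that both versions can sit side by side, and Integer elements stand in for Foo.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical holder class; the names are illustrative only.
final class SizedCollectors {

    // First version: delegates to toCollection(), which does not declare UNORDERED.
    static <T> Collector<T, ?, Set<T>> toSetSizedViaToCollection(int initialCapacity) {
        return Collectors.toCollection(() -> new HashSet<>(initialCapacity));
    }

    // Second version: built with Collector.of() so that UNORDERED can be declared.
    static <T> Collector<T, ?, Set<T>> toSetSizedUnordered(int initialCapacity) {
        return Collector.of(
                () -> new HashSet<>(initialCapacity),
                Set::add,
                (left, right) -> {
                    left.addAll(right); // merge partial results from parallel runs
                    return left;
                },
                Collector.Characteristics.IDENTITY_FINISH,
                Collector.Characteristics.UNORDERED);
    }

    public static void main(String[] args) {
        // Print what each collector reports about itself.
        System.out.println("toSet():            "
                + Collectors.toSet().characteristics());
        System.out.println("via toCollection(): "
                + toSetSizedViaToCollection(16).characteristics());
        System.out.println("via Collector.of(): "
                + toSetSizedUnordered(16).characteristics());

        // Both versions collect the same elements; only the metadata differs.
        Set<Integer> numbers = IntStream.range(0, 1000)
                .boxed()
                .collect(toSetSizedUnordered(1000));
        System.out.println("collected " + numbers.size() + " elements");
    }
}
```

Both versions produce the same set contents; the only thing this sketch compares is the metadata returned by characteristics().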
So here are my questions:
1. Is this my only option for creating an unordered collector for something as simple as a custom toSet()?
2. If I want this to perform as well as possible, is it necessary to apply the UNORDERED characteristic? I've read a question on this forum where I learned that the UNORDERED characteristic is no longer back-propagated into the Stream. Does it still serve a purpose?