4

My goal is to store the count of each item of a List inside a Map. This can be achieved by the groupingBy() and counting() methods.

The next constraint I have, is that for a value not in the List I would still need a mapping for that key to be 0. Thus all possible values must be defined.

Here's what I've come up with:

Map<String, Long> EMPTY = Map.of("a", 0L,
                                 "b", 0L,
                                 "c", 0L,
                                 "d", 0L);

List<String> list = List.of("a", "a", "d", "c", "d", "c", "a", "d");

Map<String, Long> count = list.stream()
                              .collect(groupingBy(s -> s,
                                                  () -> new HashMap<>(EMPTY),
                                                  counting()));

This code throws following exception:

Exception in thread "main" java.lang.ClassCastException: class java.lang.Long cannot be cast to class [J (java.lang.Long and [J are in module java.base of loader 'bootstrap')
    at java.base/java.util.stream.Collectors.lambda$groupingBy$53(Collectors.java:1129)
    at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
    at Test.main(Test.java:18)

But if I just replace new HashMap<>(EMPTY) with new HashMap<>() the code works fine.

Am I violating something by not using an empty Map for the collect process? How would I otherwise achieve my goal using streams?

Stefan Zobel
  • 3,182
  • 7
  • 28
  • 38
jippiee
  • 85
  • 2
  • 6
  • 1
    Your `mapFactory` is expected to produce an **empty** map. `() -> new HashMap<>(EMPTY)` is where the trouble is. Just use `() -> new HashMap<>()` (or `HashMap::new`) – ernest_k Jan 29 '19 at 17:05
  • @ernest_k where is that precised ? :) – azro Jan 29 '19 at 17:06
  • 3
    The javadocs of `java.util.stream.Collectors.groupingBy(Function super T, ? extends K>, Supplier, Collector super T, A, D>)` state: **`mapFactory`**: a supplier providing a new empty Map into which the results will be inserted. The stack trace leads to implementation details. Making the supplier produce an empty map resolves the problem. @azro – ernest_k Jan 29 '19 at 17:08

3 Answers3

3

It's a bit of a weird error. Specifically, the collector that you're using (by virtue of using Collectors.counting) is actually accumulating into single element arrays of primitive longs.

public static <T> Collector<T, ?, Long> summingLong(ToLongFunction<? super T> mapper) 
{
    return new CollectorImpl<>(
        () -> new long[1],
        (a, t) -> { a[0] += mapper.applyAsLong(t); },
        (a, b) -> { a[0] += b[0]; return a; },
        a -> a[0], CH_NOID);
}

When groupingBy does a computeIfAbsent, it's expecting to get a long[] but because you already have a key for "a", you get back a Long which doesn't match the type that's accepted by the accumulator. That's what throws the exception.

A container = m.computeIfAbsent(key, k -> downstreamSupplier.get());
downstreamAccumulator.accept(container, t);

Later on, they replace all of the map values:

intermediate.replaceAll((k, v) -> downstreamFinisher.apply(v));

using the 'finisher' defined above (a -> a[0]) to go from long[]s to Longs.


Yes, it's a little bit naughty but you've violated the contract

mapFactory: a supplier providing a new empty Map into which the results will be inserted

so it's also sort of fair enough. They're taking a HashMap which at compile-time was decided to be Map<String, Long> and they're putting long[]s into it. That's possible because generics are not reified. At runtime, it's simply a HashMap capable of storing any types of keys and values.

Michael
  • 41,989
  • 11
  • 82
  • 128
  • where did you get the part about *They're taking a HashMap which at compile-time was decided to be Map and they're putting long[]s into it*? There are some type arguments there, like `K` and `A`, but these are inferred to `String` and `long[]`, not entirely sure what you are referring to – Eugene Jan 29 '19 at 22:28
  • @Eugene the type of the map factory is `M extends Map>`. it's inferred to be `Map`. If you try to explicitly specify `long[]` to `groupingBy` then it won't compile – Michael Jan 29 '19 at 22:41
  • what will not compile? well anyway, the actual code that gathers elements is `BiConsumer, T> accumulator = (m, t) -> {...}` where `A` is inferred to `long[]`; there is a `finisher` that will map into the correct `Map` type – Eugene Jan 29 '19 at 23:10
  • @Eugene I don't think I can explain it any better in words. [Here is an example](https://ideone.com/Zk3vez). If you uncomment the third one then it won't compile. – Michael Jan 30 '19 at 10:06
  • 1
    @Eugene this has been discussed [here](https://stackoverflow.com/a/37434351/2711488); the downstream collector’s intermediate container type is actually incompatible with the resulting `Map`’s value type when the finisher function is not the identity function. But a type safe alternative would need to copy the entire map during finishing and would also need a second map supplier. – Holger Jan 30 '19 at 11:57
2

As the doc states for Collectors.groupingBy​(classifier, mapFactory, downstream)

Parameters:

  • classifier - a classifier function mapping input elements to keys
  • downstream - a Collector implementing the downstream reduction
  • mapFactory - a supplier providing a new empty Map into which the results will be inserted*
azro
  • 53,056
  • 7
  • 34
  • 70
1

Stream the list and use groupingBy and counting to convert it into Map<String,Long> and then use foreach update values in EMPTY map

list.stream().collect(Collectors.groupingBy(Function.identity(),Collectors.counting()))
    .forEach((k,v)->empty.put(k, v));   //java coding standards variable name should be in lower case 
Ryuzaki L
  • 37,302
  • 12
  • 68
  • 98
  • The EMPTY map should be a static final field and thus not mutated. Though I could use the foreach in this way: `Set MAPPINGS = Set.of("a", "b", "c", "d"); List list = List.of("a", "a", "d", "c", "d", "c", "a", "d"); Map count = list.stream().collect(groupingBy(s -> s, counting())); MAPPINGS.forEach(s -> count.putIfAbsent(s, 0L));` – jippiee Jan 29 '19 at 17:47
  • That should be fine, it depends on how you want it @jippiee – Ryuzaki L Jan 29 '19 at 17:51