8

While I was doing a little programming exercise i stumbled upon a ClassCastException. As background i'm giving a simplified version of the exercise to demonstrate the problem:

Given a string which contains only the characters A or B compute a map with the characters as keys and the number of occurrences as values. Additionally the map should always contain both characters as key (with value zero if a character is missing in the input string).

Examples:

  • "A" => {A=1, B=0}
  • "AAB" => {A=2, B=1}

My first solution was the following:

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

public Map<Character, Long> createResult(String input) {
    Map<Character, Long> map = input.chars()
        .mapToObj(c -> (char) c)
        .collect(groupingBy(c -> c, counting()));

    map.putIfAbsent('A', 0L);
    map.putIfAbsent('B', 0L);
    return map;
}

This solution worked but i wanted to try if it was possible to supply a map with default values to the groupingBy function:

public HashMap<Character, Long> createResult2(String input) {
    return input.chars()
        .mapToObj(c -> (char) c)
        .collect(groupingBy(c -> c, this::mapFactory, counting()));
}

private HashMap<Character, Long> mapFactory() {
    HashMap<Character, Long> map = new HashMap<>();
    map.put('A', 0L);
    map.put('B', 0L);
    return map;
}

When calling createResult2 with input A a ClassCastException is thrown at runtime:

java.lang.ClassCastException: java.lang.Long cannot be cast to [Ljava.lang.Object;
    at java.util.stream.Collectors.lambda$groupingBy$45(Collectors.java:909)
    at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.util.stream.IntPipeline$4$1.accept(IntPipeline.java:250)
    at java.lang.CharSequence$1CharIterator.forEachRemaining(CharSequence.java:149)
    at java.util.Spliterators$IntIteratorSpliterator.forEachRemaining(Spliterators.java:1908)
    at java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:693)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)

Can anyone explain why this is happening?

eee
  • 3,241
  • 1
  • 17
  • 34
  • Streams look like overkill for this simple problem. Even worse, the stacktrace contains nothing about where *your* code caused an error. That would be reason enough for me to use a simple loop. –  Nov 02 '16 at 17:39
  • The last line in the stacktrace corresponds to the `collect` call in my code snippet. – eee Nov 02 '16 at 17:42
  • 1
    @LutzHorn This question is more about why this is happening and less about how to solve the exercise. – eee Nov 02 '16 at 17:44
  • 1
    Well, it is happening because nobody really understands streams and stacktraces thrown when streams are involved are not helpful at all :) –  Nov 02 '16 at 17:46
  • 5
    The map supplier should never provide a non-empty map. Besides, [this answer](http://stackoverflow.com/a/37434351/2711488) may shed some light on the issue. – Holger Nov 02 '16 at 18:16
  • 1
    Note when using `input.chars().partitioningBy(c -> c=='A', counting())`, you’ll get a `Map` in which both keys are always present, mapping to zero if no matching characters were found, but the keys will be `Boolean.TRUE` for `'a'` and `Boolean.FALSE` for `'b'` (actually “not `'a'`”)… – Holger Nov 02 '16 at 18:34
  • @Holger Good point, the original exercise included more than two characters though. The link you provided is very interesting. – eee Nov 02 '16 at 22:01

2 Answers2

3

There is type inference magic involved, but if you want solution here it is:

Replace

map.put('A', 0L);
map.put('B', 0L);

by

map.put('A', new Object[]{0L});
map.put('B', new Object[]{0L});

But I strongly discourage you of using this solution in practice. Implementation details can be changed in any update and this hack stop works.

Here more explanation about "why" from groupingBy javadoc

mapFactory - a function which, when called, produces a new empty Map of the desired type

mapFactory is second parameter. Word "empty" is very important. Implementation use created map to store arrays of long while iterating and then convert them to long. It works because of massive amount of casting inside.

talex
  • 17,973
  • 3
  • 29
  • 66
  • 2
    Don’t do that thing with `new Object[]{0L}`. It will break as soon as you do either, use an alternative JRE or update to Java 9. In Java 9, it will use a `long[]` instead, avoiding the boxing overhead while counting, but other implementations may use even other temporary container types… – Holger Nov 02 '16 at 18:19
  • @Holger can't agree more. It is bad idea to use my solution in practice. – talex Nov 02 '16 at 18:22
  • It was purely played joke. Also there are lot of explicit castings. From runtime perspective there is no difference, but from quality of code is bad design. I saw your link and agree that it explains situation better then my answer. – talex Nov 02 '16 at 21:22
0

Why not a simple loop?

private static Map<Character, Integer> count(String input) {
    Map<Character, Integer> result = new HashMap<>();
    result.put('A', 0);
    result.put('B', 0);
    for (Character c : input.toCharArray()) {
        result.put(c, result.get(c) + 1);
    }
    return result;
}
  • It is slower. Take two access to map per character (`get` + `put`) – talex Nov 02 '16 at 18:24
  • 4
    @talex: just replace `result.put(c, result.get(c) + 1);` by `result.merge(c, 1, Integer::sum);` and the problem is solved. It would be even more efficient when not pre-initializing the map with zeros, but do a `map.putIfAbsent('A', 0L); map.putIfAbsent('B', 0L);` afterwards, just like in the question. – Holger Nov 02 '16 at 18:28