12

I have a map Map<K, V> and my goal is to remove the duplicated values and output the very same structure Map<K, V> again. In case the duplicated value is found, there must be selected one key (k) from the two keys (k1 and k2) which hold these values, for this reason, assume the BinaryOperator<K> giving k from k1 and k2 is available.

Example input and output:

// Input
Map<Integer, String> map = new HashMap<>();
map.put(1, "apple");
map.put(5, "apple");
map.put(4, "orange");
map.put(3, "apple");
map.put(2, "orange");

// Output: {5=apple, 4=orange} // the key is the largest possible

My attempt using Stream::collect(Supplier, BiConsumer, BiConsumer) is a bit very clumsy and contains mutable operations such as Map::put and Map::remove which I would like to avoid:

// // the key is the largest integer possible (following the example above)
final BinaryOperator<K> reducingKeysBinaryOperator = (k1, k2) -> k1 > k2 ? k1 : k2;

Map<K, V> distinctValuesMap = map.entrySet().stream().collect(
    HashMap::new,                                                              // A new map to return (supplier)
    (map, entry) -> {                                                          // Accumulator
        final K key = entry.getKey();
        final V value = entry.getValue();
        final Entry<K, V> editedEntry = Optional.of(map)                       // New edited Value
            .filter(HashMap::isEmpty)
            .map(m -> new SimpleEntry<>(key, value))                           // If a first entry, use it
            .orElseGet(() -> map.entrySet()                                    // otherwise check for a duplicate
                    .stream() 
                    .filter(e -> value.equals(e.getValue()))
                    .findFirst()
                    .map(e -> new SimpleEntry<>(                               // .. if found, replace
                            reducingKeysBinaryOperator.apply(e.getKey(), key), 
                            map.remove(e.getKey())))
                    .orElse(new SimpleEntry<>(key, value)));                   // .. or else leave
        map.put(editedEntry.getKey(), editedEntry.getValue());                 // put it to the map
    },
    (m1, m2) -> {}                                                             // Combiner
);

Is there a solution using an appropriate combination of Collectors within one Stream::collect call (e.g. without mutable operations)?

Nikolas Charalambidis
  • 40,893
  • 16
  • 117
  • 183

5 Answers5

11

You can use Collectors.toMap

private Map<Integer, String> deduplicateValues(Map<Integer, String> map) {
    Map<String, Integer> inverse = map.entrySet().stream().collect(toMap(
            Map.Entry::getValue,
            Map.Entry::getKey,
            Math::max) // take the highest key on duplicate values
    );

    return inverse.entrySet().stream().collect(toMap(Map.Entry::getValue, Map.Entry::getKey));
}
MikeFHay
  • 8,562
  • 4
  • 31
  • 52
8

Try this: Simple way is inverse the key and value then use toMap() collector with merge function.

map.entrySet().stream()
        .map(entry -> new AbstractMap.SimpleEntry<>(entry.getValue(), entry.getKey()))
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, reducingKeysBinaryOperator));

Map<K, V> output = map.entrySet().stream()
        .collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey, reducingKeysBinaryOperator))
        .entrySet().stream()
        .collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey));
Naman
  • 27,789
  • 26
  • 218
  • 353
Hadi J
  • 16,989
  • 4
  • 36
  • 62
  • 2
    I fail to see what the intermediate `map` operation buys. You seem to swap keys and values, that much is clear, but what is the point, you could do that at the collect step just the same ? – GPI Jan 06 '20 at 14:06
  • 3
    @GPI and Michael, this is because he has to merge the keys, so invert the pairs will merge the keys. What is missing is the second inversion then. – Jean-Baptiste Yunès Jan 06 '20 at 14:14
  • 2
    @HadiJ No! Inversion was correct! but a second one was necessary to get back. Merge is used to merge the keys, but merging is only possible for values... – Jean-Baptiste Yunès Jan 06 '20 at 14:14
  • @Jean-BaptisteYunès I understand the need to merge, but why I don't immediatly get is why you code `swap(); collect(key, value, binOp);`instead of `collect(value, key, binOp)`. Maybe I need to try this in a jshell for real ? – GPI Jan 06 '20 at 14:33
  • 2
    Took the liberty to use the local variable introduced in the question in the code-shared by you. Do revert in case it conflicts the intention while you were making the answer. – Naman Jan 06 '20 at 16:00
  • Simple and straightforward! Works like charm :) Is it possible to achieve the very same result with no process-collect-process pipeline (within one `Stream`)? – Nikolas Charalambidis Jan 06 '20 at 18:39
4

I find the non-streams solution more expressive:

BinaryOperator<K> reducingKeysBinaryOperator = (k1, k2) -> k1 > k2 ? k1 : k2;

Map<V, K> reverse = new LinkedHashMap<>(map.size());
map.forEach((k, v) -> reverse.merge(v, k, reducingKeysBinaryOperator));

Map<K, V> result = new LinkedHashMap<>(reverse.size());
reverse.forEach((v, k) -> result.put(k, v));

This uses Map.merge with your reducing bi-function and uses LinkedHashMap to preserve original entries order.

fps
  • 33,623
  • 8
  • 55
  • 110
1

I found a way of using only Collectors with no need for collecting and further processing the returned Map again. The idea is:

  1. Group the Map<K, V> to Map<V, List<K>.

    Map<K, V> distinctValuesMap = this.stream.collect(
        Collectors.collectingAndThen(
            Collectors.groupingBy(Entry::getValue),
            groupingDownstream 
        )
    );
    

    {apple=[1, 5, 3], orange=[4, 2]}

  2. Reduce the new keys (List<K>) to K using BinaryOperator<K>.

    Function<Entry<V, List<Entry<K, V>>>, K> keyMapFunction = e -> e.getValue().stream()
        .map(Entry::getKey)
        .collect(Collectors.collectingAndThen(
            Collectors.reducing(reducingKeysBinaryOperator),
            Optional::get
        )
    );
    

    {apple=5, orange=4}

  3. Inverse the Map<V, K> back to Map<K, V> structure again - which is safe since both keys and values are guaranteed as distinct.

    Function<Map<V, List<Entry<K,V>>>, Map<K, V>> groupingDownstream = m -> m.entrySet()
        .stream()
        .collect(Collectors.toMap(
            keyMapFunction,
            Entry::getKey
        )
    );
    

    {5=apple, 4=orange}

The final code:

final BinaryOperator<K> reducingKeysBinaryOperator = ...

final Map<K, V> distinctValuesMap = map.entrySet().stream().collect(
        Collectors.collectingAndThen(
            Collectors.groupingBy(Entry::getValue),
            m -> m.entrySet().stream().collect(
                Collectors.toMap(
                    e -> e.getValue().stream().map(Entry::getKey).collect(
                        Collectors.collectingAndThen(
                            Collectors.reducing(reducingKeysBinaryOperator),
                            Optional::get
                        )
                    ),
                    Entry::getKey
                )
            )
        )
    );
Nikolas Charalambidis
  • 40,893
  • 16
  • 117
  • 183
1

Another approch to get the desired result with "Stream and Collectors.groupingBy".

    map = map.entrySet().stream()
    .collect(Collectors.groupingBy(
            Entry::getValue,
            Collectors.maxBy(Comparator.comparing(Entry::getKey))
            )
    )
    .entrySet().stream()
    .collect(Collectors.toMap(
            k -> {
                return k.getValue().get().getKey();
            }, 
            Entry::getKey));
Vishesh Chandra
  • 6,951
  • 6
  • 35
  • 38