7

I have a list of int arrays. I want to group them by content and count how often each distinct array occurs.

int[] array1 = new int[]{1, 2, 3};
int[] array2 = new int[]{1, 2, 3}; //array1 = array2 
int[] array3 = new int[]{0, 2, 3};

List<int[]> test = new ArrayList<>();

test.add(array1);
test.add(array2);
test.add(array3);

test.stream().collect(Collectors.groupingBy(Function.identity(), Collectors.counting())); 

Unfortunately, it doesn't work. It groups as if any array was unique:

{1, 2, 3} - 1
{1, 2, 3} - 1 
{0, 2, 3} - 1

I expect:

{1, 2, 3} - 2
{0, 2, 3} - 1

What can I do?

Stefan Zobel
Aleksandr

4 Answers

5

It groups as if any array was unique:

That is indeed the case. Whichever approach you choose (built-in collectors such as groupingBy() and toMap(), or a plain loop), you will run into the same difficulty: two arrays with the same content are not equal in terms of equals(), and their hashCode() values are not based on content either, because arrays inherit the identity-based implementations from Object.
You should consider using List<Integer> instead of int[] for this use case.

For example:

    public static void main(String[] args) {
        int[] array1 = new int[] { 1, 2, 3 };
        int[] array2 = new int[] { 1, 2, 3 }; // array1 = array2
        int[] array3 = new int[] { 0, 2, 3 };

        List<List<Integer>> test = new ArrayList<>();

        test.add(Arrays.stream(array1)
                       .boxed()
                       .collect(Collectors.toList()));
        test.add(Arrays.stream(array2)
                       .boxed()
                       .collect(Collectors.toList()));
        test.add(Arrays.stream(array3)
                       .boxed()
                       .collect(Collectors.toList()));

        Map<List<Integer>, Long> map = test.stream()
                                           .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        System.out.println(map);    
    }

Output:

{[0, 2, 3]=1, [1, 2, 3]=2}
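
The root cause can be checked directly. The following is a minimal sketch (the class and method names are made up for illustration) that compares the identity-based equals() an int[] inherits from Object with the content-based comparisons of Arrays.equals() and List.equals():

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ArrayEqualityDemo {

    // Arrays inherit equals() from Object, so this compares references only.
    static boolean equalByIdentity(int[] a, int[] b) {
        return a.equals(b);
    }

    // Arrays.equals() compares the length and every element.
    static boolean equalByContent(int[] a, int[] b) {
        return Arrays.equals(a, b);
    }

    // Boxing into a List gives content-based equals() and hashCode().
    static List<Integer> toList(int[] a) {
        return Arrays.stream(a).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int[] array1 = {1, 2, 3};
        int[] array2 = {1, 2, 3};

        System.out.println(equalByIdentity(array1, array2));        // false
        System.out.println(equalByContent(array1, array2));         // true
        System.out.println(toList(array1).equals(toList(array2)));  // true
    }
}
```

Since groupingBy() collects into a HashMap by default, only the List version ends up with a single key for both arrays.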

davidxxx
1

Try the following:

Map<Integer, Long> res = test.stream().collect(Collectors.groupingBy(Arrays::hashCode, Collectors.counting()));

Note that the keys of this map are the arrays' hash codes, not the arrays themselves. If you want the actual array as the key, wrap it in a class whose equals() and hashCode() are based on the array content.
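
One way to follow the wrapper suggestion without writing a class is java.nio.IntBuffer, whose equals() and hashCode() are defined over the buffer's remaining elements. This is an alternative sketch rather than part of the answer above; the class name IntBufferGrouping and the method countByContent are made up for illustration.

```java
import java.nio.IntBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class IntBufferGrouping {

    // IntBuffer.wrap() does not copy; the buffer is a view of the given array.
    static Map<IntBuffer, Long> countByContent(List<int[]> arrays) {
        return arrays.stream()
                .collect(Collectors.groupingBy(IntBuffer::wrap, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<int[]> test = Arrays.asList(
                new int[]{1, 2, 3}, new int[]{1, 2, 3}, new int[]{0, 2, 3});

        // array() returns the backing int[] of a wrapped buffer.
        countByContent(test).forEach((key, count) ->
                System.out.println(Arrays.toString(key.array()) + " - " + count));
    }
}
```

Because the buffers share storage with the original arrays, the arrays should not be modified while they serve as map keys; otherwise the hash codes change and lookups can fail.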

Maciej
  • Grouping by hash code will probably work fine 99% of the time, but it will bite you horribly once two non-identical arrays produce the same hash code. – Felk Mar 12 '18 at 12:01
  • Correct. That is just an illustration of how the grouping works, not a definitive solution, especially since the resulting map is not very useful (it does not contain the original array); hence the suggestion of a wrapper. – Maciej Mar 12 '18 at 12:10
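
The collision risk raised in the comments is easy to reproduce. In this minimal sketch (class and method names are made up for illustration), the arrays {0, 31} and {1, 0} have different content, but Arrays.hashCode() computes 31 * (31 * 1 + 0) + 31 = 992 for the first and 31 * (31 * 1 + 1) + 0 = 992 for the second, so grouping by hash code puts them into the same group:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class HashCollisionDemo {

    // Groups by Arrays.hashCode(), exactly as in the answer above.
    static Map<Integer, Long> countByHash(List<int[]> arrays) {
        return arrays.stream()
                .collect(Collectors.groupingBy(Arrays::hashCode, Collectors.counting()));
    }

    public static void main(String[] args) {
        int[] first = {0, 31};
        int[] second = {1, 0};

        System.out.println(Arrays.equals(first, second)); // false
        System.out.println(Arrays.hashCode(first));       // 992
        System.out.println(Arrays.hashCode(second));      // 992

        // Two different arrays are counted as one group.
        System.out.println(countByHash(Arrays.asList(first, second))); // {992=2}
    }
}
```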
1

You can use a List and Java 8 streams.

Map<List<Integer>, Long> mapList = Stream.of(array1, array2, array3)
        .map(Arrays::stream)
        .map(IntStream::boxed)
        .map(stream -> stream.collect(Collectors.toList()))
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

If you want a Map<int[], Long>, you can continue after the collect() above like this:

        // ... collect ...
        .entrySet().stream().collect(Collectors.toMap(entry -> entry.getKey()
                   .stream().mapToInt(i -> i).toArray(), Entry::getValue));
        // returns Map<int[], Long>

Still, I think the question emphasizes the use of arrays. You can create a wrapper class for the int[] object.

This is just an example of a wrapper for int arrays. A more elaborate class could be used, for instance with a factory-style design that supports all primitive array types and even arrays of objects.

public class IntArray {
    private final int[] array;
    private IntArray(final int[] array) {
        this.array = array;
    }
    public static IntArray wrap(final int[] array) {
        return new IntArray(array);
    }
    public int[] unwrap() {
        return array;
    }
    @Override
    public boolean equals(final Object obj) {
        return obj instanceof IntArray
                && Arrays.equals(((IntArray) obj).array, array);
    }
    @Override
    public int hashCode() {
        return Arrays.hashCode(array);
    }
    @Override
    public String toString() {
        return Arrays.toString(array);
    }
}

Here the wrap(..) method is optional: if the constructor is made public, IntArray::new can be used instead. The toString() method is optional as well; it allows the internal array to be converted to a string without having to unwrap it.

The essential methods are equals(..) and hashCode(), because the map relies on them to work properly.

Here is more info about it.

Understanding the workings of equals and hashCode in a HashMap

Finally, it can be used like this:

Map<IntArray, Long> mapArray = Stream.of(array1, array2, array3)
        .map(IntArray::wrap)
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

And again, if you want the array as the key of the map (a Map<int[], Long>), you can do the following. Note that in this case no new array is created; the first array found for each distinct content is used.

        // ... collect ...
        .entrySet().stream().collect(Collectors.toMap(entry -> entry.getKey()
                   .unwrap(), Entry::getValue));
        // returns Map<int[], Long>
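
One caveat with a hash-based Map<int[], Long>: its keys are found by array identity, so a lookup with a fresh array of the same content returns null. If a map keyed directly by int[] is needed, one option is a TreeMap with a content-based comparator. This is a sketch under the assumption that Java 9 or later is available (for Arrays.compare); the class name ContentKeyedMap and the method countByContent are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class ContentKeyedMap {

    static Map<int[], Long> countByContent(List<int[]> arrays) {
        // Arrays.compare(int[], int[]) orders arrays lexicographically (Java 9+).
        Comparator<int[]> byContent = Arrays::compare;
        return arrays.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        () -> new TreeMap<int[], Long>(byContent),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<int[]> test = Arrays.asList(
                new int[]{1, 2, 3}, new int[]{1, 2, 3}, new int[]{0, 2, 3});

        Map<int[], Long> counts = countByContent(test);
        counts.forEach((key, count) ->
                System.out.println(Arrays.toString(key) + " - " + count));

        // The TreeMap uses the comparator, so a new array with equal content is found.
        System.out.println(counts.get(new int[]{1, 2, 3})); // 2
    }
}
```

A TreeMap decides key equality through its comparator rather than through equals() and hashCode(), which is why the plain arrays work as keys here.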
Jose Da Silva Gomes
0

If you are willing to convert your int arrays to Lists, you can put custom logic inside collect as follows:

test.parallelStream()
        .map(array -> Arrays.stream(array)
                .boxed()
                .collect(Collectors.toList()))
        .collect(HashMap::new,
                // accumulator: count one more occurrence of this list
                (HashMap<List<Integer>, Integer> map, List<Integer> list) -> {
                    if (map.containsKey(list)) {
                        map.put(list, map.get(list) + 1);
                    } else {
                        map.put(list, 1);
                    }
                },
                // combiner: add the counts from map2 to map1 (not just 1 per key)
                (HashMap<List<Integer>, Integer> map1, HashMap<List<Integer>, Integer> map2) -> {
                    map2.entrySet().forEach(entry -> {
                        if (map1.containsKey(entry.getKey())) {
                            map1.put(entry.getKey(), map1.get(entry.getKey()) + entry.getValue());
                        } else {
                            map1.put(entry.getKey(), entry.getValue());
                        }
                    });
                }
        )

To understand what is happening here, see the signature of the collect method below:

<R> R collect(Supplier<R> supplier,
                  BiConsumer<R, ? super T> accumulator,
                  BiConsumer<R, R> combiner)

Here,

supplier creates the result container; here it is a new HashMap, which is also the type that collect returns.

accumulator is the BiConsumer that folds one element into the container; here it increments the count for the given list.

combiner is only invoked during parallel execution. It merges two partial result containers into one, so that the output is the same as for sequential execution. You can think of it as divide and conquer.

This approach is very powerful when none of the available Collectors fits and you have to write a custom solution.
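
The same three-argument collect can be written more compactly with Map.merge(), which covers both the "key absent" and "key present" cases in one call. This is a sketch rather than the answer's own code; the class name CustomCollectDemo and the method countLists are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CustomCollectDemo {

    static Map<List<Integer>, Integer> countLists(List<int[]> arrays) {
        return arrays.parallelStream()
                .map(array -> Arrays.stream(array).boxed().collect(Collectors.toList()))
                .collect(
                        // supplier: a fresh container per partial result
                        HashMap::new,
                        // accumulator: add 1 to the count of this list
                        (map, list) -> map.merge(list, 1, Integer::sum),
                        // combiner: add every count of the right map to the left map
                        (left, right) -> right.forEach(
                                (key, count) -> left.merge(key, count, Integer::sum)));
    }

    public static void main(String[] args) {
        List<int[]> test = Arrays.asList(
                new int[]{1, 2, 3}, new int[]{1, 2, 3}, new int[]{0, 2, 3});
        System.out.println(countLists(test));
    }
}
```

The combiner has to add the partial counts, not just 1 per key; otherwise the totals would depend on how the parallel stream happened to split the input.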

Vinay Prajapati