
I want to ensure that all numbers in the list are grouped together. Let me explain with some examples:

{1, 1, 1, 2, 2}    // OK, two distinct groups
{1, 1, 2, 2, 1, 1} // Bad, two groups with "1"
{1, 2, 3, 4}       // OK, 4 distinct groups of size 1
{1, 1, 1, 1}       // OK, 1 group
{3, 4, 3}          // Bad, two groups with "3"
{99, -99, 99}      // Bad, two groups with "99"
{}                 // OK, no groups

Here's how I obtain the stream:

IntStream.of(numbers)
    ...

Now I need the check to pass (or return true) for the "OK" examples and to throw an AssertionError (or return false) for the "Bad" examples. How can I do that using the Stream API?

Here's my current solution, which creates an additional Set:

Set<Integer> previousNumbers = new HashSet<>();
IntStream.of(numbers)
        .boxed()
        .reduce(null, (previousNumber, currentNumber) -> {
                    if (!currentNumber.equals(previousNumber)) {
                        assertThat(previousNumbers).doesNotContain(currentNumber);
                        previousNumbers.add(currentNumber);
                    }
                    return currentNumber;
                }
        );
Michal Kordas
  • Your solution is not a correct one. It may work, given the current implementation (and obviously assuming sequential execution), but the function clearly violates the associativity requirement. Unfortunately, there is no simple solution without 3rd party help… – Holger Feb 05 '16 at 09:59
  • @Holger can you explain what is "associativity requirement"? – Michal Kordas Feb 05 '16 at 10:19
  • @MichalKordas, see the [documentation](https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#reduce-T-java.util.function.BinaryOperator-): the accumulator must be associative by the specification. – Tagir Valeev Feb 05 '16 at 11:28
  • [`(a op b) op c == a op (b op c)`](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#Associativity) – Holger Feb 05 '16 at 11:53

3 Answers


Using my free StreamEx library:

IntStreamEx.of(numbers).boxed().runLengths().toMap();

This code will throw IllegalStateException if there are repeating groups.

Here the runLengths() method is used. It collapses equal adjacent elements, replacing them with a Map.Entry whose key is the input element and whose value is the number of repeats. Finally toMap() is used, which is a shortcut for .collect(Collectors.toMap(Entry::getKey, Entry::getValue)). We are using the fact that .toMap() throws an IllegalStateException when keys repeat (unless a custom mergeFunction is supplied).

As a free bonus, on successful execution you will have a map where the keys are the input elements and the values are the lengths of the series.
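For illustration, here is a small sketch of how this could be run against the question's examples (it assumes StreamEx is on the classpath with its classes in the one.util.streamex package; the class name GroupCheckDemo is made up for this example):

import java.util.Map;
import one.util.streamex.IntStreamEx;

public class GroupCheckDemo {
    public static void main(String[] args) {
        int[] ok  = {1, 1, 1, 2, 2};    // OK: two distinct groups
        int[] bad = {1, 1, 2, 2, 1, 1}; // Bad: the "1" group is split in two

        // Succeeds and yields the group lengths, e.g. {1=3, 2=2}
        Map<Integer, Long> groups = IntStreamEx.of(ok).boxed().runLengths().toMap();
        System.out.println(groups);

        // Throws IllegalStateException because the key "1" occurs twice
        IntStreamEx.of(bad).boxed().runLengths().toMap();
    }
}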

Tagir Valeev

In my opinion this problem doesn't fit the Stream API at all, but I was curious how it could be implemented (in a reasonably performant way, at least).

The problem is that you have to keep track of already-seen elements and the whole test should short-circuit. So I came up with this solution (without Streams):

public static boolean hasUniqueGroups(int[] arr) {
    Objects.requireNonNull(arr);
    Set<Integer> seen = new HashSet<>();
    for (int i = 0; i < arr.length; i++) {
        if (i == 0 || arr[i] != arr[i - 1]) {
            if (!seen.add(arr[i])) {
                return false;
            }
        }
    }
    return true;
}

The next step is to introduce the Stream API and the solution looks like this:

public static boolean hasUniqueGroups(int[] arr) {
    Objects.requireNonNull(arr);
    Set<Integer> seen = new HashSet<>();
    return IntStream.range(0, arr.length)
            .filter(i -> i == 0 || arr[i] != arr[i - 1])
            .mapToObj(i -> arr[i])
            .allMatch(seen::add);
}

Note: In order to parallelize this Stream you should use a thread-safe Set.
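As a quick sanity check, either variant of hasUniqueGroups above can be run against the examples from the question, for instance (just an illustrative main method):

public static void main(String[] args) {
    System.out.println(hasUniqueGroups(new int[]{1, 1, 1, 2, 2}));    // true  (OK)
    System.out.println(hasUniqueGroups(new int[]{1, 1, 2, 2, 1, 1})); // false (Bad)
    System.out.println(hasUniqueGroups(new int[]{3, 4, 3}));          // false (Bad)
    System.out.println(hasUniqueGroups(new int[]{99, -99, 99}));      // false (Bad)
    System.out.println(hasUniqueGroups(new int[]{}));                 // true  (OK, no groups)
}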

Flown
  • Nice, +1. The key insight here is that the beginning of a group is detected by the predicate `arr[i] != arr[i-1]`. For more general problems I'd have used a collector to generate the result, but for this specific case using `allMatch(seen::add)` is quite clever. As an aside, the name `hasMultipleGroups` has the wrong sense; perhaps `hasUniqueGroups` would be better? – Stuart Marks Feb 06 '16 at 18:21
  • @StuartMarks using a `Collector` was my first attempt but it doesn't have a short-circuit behaviour. Therefore it is not applicable to this problem. – Flown Feb 07 '16 at 00:25

More of an addition to what has been said already: we could try to answer this question using the collect method. The problem with this approach (as others have indicated) is that reduction operations do not terminate early.

Generally, to short-circuit a long reduction operation, we can short-circuit the reduction function. This way, although we still iterate through all items in the stream, the amount of work required is minimal.

public static boolean hasUniqueGroups(int... arr) {
    return !IntStream
        .of(arr) 
        .collect(
                Container::new, // 1
                (container, current) -> {
                    if (container.skip) return; // 2
                    if (current != container.previous) {
                        container.previous = current;
                        if (!container.integers.add(current))
                            container.skip = true; // 3
                    }
                },
                (c1, c2) -> {
                    if (c1.skip != c2.skip) {
                        c1.skip = true;
                        c1.integers.addAll(c2.integers);
                    }
                }
        )
        .skip;
}

private static class Container {
    private int previous = MAX_VALUE; // 4
    private boolean skip = false;
    private Set<Integer> integers = new HashSet<>();
}
  1. We create a Supplier which creates a new Container for each computation. The Container (amongst other things) holds a flag telling us whether we should continue or skip the computation.
  2. If at some point we encountered a non-unique group, we skip the rest of the computation.
  3. If we are currently at the beginning of a new group, we check whether it is unique. If not, we decide to skip the rest of the stream.
  4. This is a poor hack to handle sequences such as {0, 1, 0}: with the default initial value of 0, the first group of zeros would never be registered in the set. Of course, this solution will not work for e.g. {MAX_VALUE, 0, MAX_VALUE}. I decided to leave this problem open for simplicity's sake.

We can check the performance (i.e. that the accumulator really short-circuits) by replacing

IntStream.of(arr)

with

IntStream.concat(IntStream.of(1, 2), IntStream.range(1, Integer.MAX_VALUE))

which returns false. This of course will not work for truly infinite streams, but checking for unique groups in an infinite stream does not really make sense anyway.
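For completeness, since the method in this answer takes varargs, the question's examples can also be passed directly; a small sketch (run with assertions enabled, -ea):

assert hasUniqueGroups(1, 1, 1, 2, 2);     // OK: two distinct groups
assert !hasUniqueGroups(1, 1, 2, 2, 1, 1); // Bad: two groups with "1"
assert !hasUniqueGroups(3, 4, 3);          // Bad: two groups with "3"
assert hasUniqueGroups();                  // OK: no groups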