6

I know how to remove duplicates with distinct(), or by copying a List into a Set, but my problem is slightly different. Is there a clean way to solve the problem below in Java 8 using streams, or maybe StreamEx?

Let's say we have these objects in a List:

A, A, A, B, B, A, A, A, C, C, C, A, A, B, B, A

Now I need

A, B, A, C, A, B, A

So a duplicate is removed only when it immediately follows an equal element; it stays if the element before it is different. I tried a few solutions, but they were ugly and unreadable.

Andrew Tobilko
Mbded
  • 2
    I may be wrong, but streams may not be the best tool, since you need to store state somewhere that records the previous value, and if I remember correctly stream operations are preferably stateless. Why not use a simple loop? – Pshemo Mar 10 '18 at 19:28
  • 3
    You can do it with a stateful filter, but you shouldn't do that, because it'll fail if the stream is parallel. – Andreas Mar 10 '18 at 19:28
  • 4
    Your best option is likely to create your own `Collector`, so the duplicates can be removed as they are added to the result `List`. A better option is to not use streams. – Andreas Mar 10 '18 at 19:29
  • This really sounds like a hammer-nail problem. Use a `Set` and be done with it. :) – Stefan Haberl Jan 09 '19 at 09:52

6 Answers

12

Option 1: Filter

You could write a stateful filter, but you should never do that, because it violates the contract of filter(Predicate<? super T> predicate):

predicate - a non-interfering, stateless predicate to apply to each element to determine if it should be included

public class NoRepeatFilter<T> implements Predicate<T> {
    private T prevValue;
    @Override
    public boolean test(T value) {
        if (value.equals(this.prevValue))
            return false;
        this.prevValue = value;
        return true;
    }
}

Test

List<String> result = Stream
        .of("A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A")
//      .parallel()
        .filter(new NoRepeatFilter<>())
        .collect(Collectors.toList());
System.out.println(result);

Output

[A, B, A, C, A, B, A]

The predicate must be stateless because a stateful one fails if the stream is parallel. For example, running the test again with .parallel() uncommented can produce output such as:

[A, A, B, B, A, C, C, C, A, B, B, A]


Option 2: Collector

A valid solution is to create your own Collector using of(...):

public class NoRepeatCollector {
    public static <E> Collector<E, ?, List<E>> get() {
        return Collector.of(ArrayList::new,
                            NoRepeatCollector::addNoRepeat,
                            NoRepeatCollector::combineNoRepeat);
    }
    private static <E> void addNoRepeat(List<E> list, E value) {
        if (list.isEmpty() || ! list.get(list.size() - 1).equals(value))
            list.add(value);
    }
    private static <E> List<E> combineNoRepeat(List<E> left, List<E> right) {
        if (left.isEmpty())
            return right;
        if (! right.isEmpty())
            left.addAll(left.get(left.size() - 1).equals(right.get(0))
                        ? right.subList(1, right.size()) : right);
        return left;
    }
}

Test

List<String> result = Stream
        .of("A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A")
//      .parallel()
        .collect(NoRepeatCollector.get());
System.out.println(result);

Output (with and without .parallel())

[A, B, A, C, A, B, A]


Option 3: Loop

If your input is a List (or other Iterable), you could remove repeating values using a simple loop:

public static <E> void removeRepeats(Iterable<E> iterable) {
    E prevValue = null;
    for (Iterator<E> iter = iterable.iterator(); iter.hasNext(); ) {
        E value = iter.next();
        if (value.equals(prevValue))
            iter.remove();
        else
            prevValue = value;
    }
}

Test

List<String> list = new ArrayList<>(Arrays.asList(
        "A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A"));
removeRepeats(list);
System.out.println(list);

Output

[A, B, A, C, A, B, A]
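
One caveat of the loop above: `value.equals(prevValue)` throws a NullPointerException if the iterable contains a null element. Below is a minimal null-safe sketch, assuming null elements should be treated like any other value. The class name `NullSafeRepeats`, the method name `removeRepeatsNullSafe`, and the `first` flag are illustrative and not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;

class NullSafeRepeats {
    // Hypothetical variant of removeRepeats that tolerates null elements.
    static <E> void removeRepeatsNullSafe(Iterable<E> iterable) {
        boolean first = true; // separates "no previous element yet" from "previous element was null"
        E prevValue = null;
        for (Iterator<E> iter = iterable.iterator(); iter.hasNext(); ) {
            E value = iter.next();
            if (!first && Objects.equals(value, prevValue)) {
                iter.remove(); // equal to the element just before it
            } else {
                prevValue = value;
                first = false;
            }
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList(null, null, "A", "A", null, "B", "B"));
        removeRepeatsNullSafe(list);
        System.out.println(list); // [null, A, null, B]
    }
}
```

As with the original loop, this relies on the iterator supporting remove(), so it works on an ArrayList but not on the fixed-size list returned by Arrays.asList.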

Andreas
  • 3
    The stateful predicate is, well... stateful. It works if the stream is sequential, but I think these kind of solutions shouldn't be encouraged. The loop solution fails if the list/iterable contains a null element in the first position, otherwise it's fine. Finally, I upvoted due to the collector-based solution, which should be common knowledge among java developers by 2018. – fps Mar 11 '18 at 00:43
  • @Andrew Option 3 performs best, so use that if you can, otherwise option 2. As I already said in the answer, don't use option 1, but I included it because it is a common suggestion for your kind of problem, even though it's a bad suggestion. – Andreas Mar 11 '18 at 05:33
  • 1
    Thank you. For me the most readable solution is the one that implements `Predicate`; it is also short and clean. Yes, it breaks a rule, as you wrote, but in this situation I accept that, because readability matters most to me. – Mbded Mar 11 '18 at 10:09
  • 1
    You only have to be aware that your stateful predicate may break with a parallel stream, as well as in a `flatMap` context and perhaps some more situations… – Holger Mar 12 '18 at 09:24
  • @Andreas - for the 1st object, we have prevValue = null. Why does Predicate.test(T t) not throw an exception for the 1st object since it has a null prevValue? – armani Mar 20 '20 at 01:32
  • 1
    @armani Because the documentation, i.e. the javadoc of [`equals()`](https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#equals-java.lang.Object-) say it's not allowed to throw NPE: *For any non-null reference value `x`, `x.equals(null)` should return `false`.* – Andreas Mar 20 '20 at 02:07
1

It's quite simple without streams. Something like this:

public static <T> List<T> noConsecutiveDuplicates(final List<T> input) {
    final List<T> output = new ArrayList<>();
    for (final T element : input) {
        // Keep the element only if it differs from the last one kept
        if (!element.equals(lastElement(output))) {
            output.add(element);
        }
    }
    return output;
}

// Returns the last element, or null if the list is empty
private static <T> T lastElement(final List<T> list) {
    if (list.isEmpty()) {
        return null;
    }
    return list.get(list.size() - 1);
}
Tobb
1

I would give StreamEx a shot and use StreamEx::collapse:

List<String> strings = Arrays.asList("A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A");

List<String> collect = StreamEx.of(strings)
        .collapse(Objects::equals)
        .collect(Collectors.toList());

It is also possible in vanilla Java, using the idea of "edge detection":

List<String> collect = IntStream.range(0, strings.size())
        .filter(i -> i == 0 || !Objects.equals(strings.get(i - 1), strings.get(i)))
        .mapToObj(strings::get)
        .collect(Collectors.toList());
Flown
1
List<String> lst = Arrays.asList("A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A");
List<String> result = IntStream.range(0, lst.size())
        .filter(index -> index == 0 || !lst.get(index).equals(lst.get(index - 1)))
        .mapToObj(i -> lst.get(i))
        .collect(Collectors.toList());

result.stream().forEach(System.out::print);

You can simply iterate over the indexes of the source list and keep only those elements that differ from the previous element.

Chota Bheem
0

This may not be the cleanest solution, but you could use a filter that remembers the previous stream value.

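import java.util.Objects;
import java.util.function.Predicate;

class NoDuplicateFilter<T> implements Predicate<T> {
    private T previous = null;

    @Override
    public boolean test(T input) {
        // Keep the element only if it differs from the previous one
        boolean distinct = !Objects.equals(input, previous);
        this.previous = input;
        return distinct;
    }
}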

Then use it inside of your stream.
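
A minimal usage sketch, assuming a sequential stream only (a stateful predicate gives unreliable results on a parallel stream). The filter is redeclared here as a nested class `NoDuplicateFilter` so the block compiles on its own; the wrapper class `NoDuplicateFilterDemo` and the helper `withoutRepeats` are illustrative names, not part of the answer.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NoDuplicateFilterDemo {
    // Same idea as the filter above, redeclared so this sketch is self-contained.
    static class NoDuplicateFilter<T> implements Predicate<T> {
        private T previous = null;

        @Override
        public boolean test(T input) {
            boolean distinct = !Objects.equals(input, previous);
            previous = input;
            return distinct;
        }
    }

    // Hypothetical helper: forces sequential execution, then applies a fresh filter instance.
    static <T> List<T> withoutRepeats(Stream<T> stream) {
        return stream.sequential()
                .filter(new NoDuplicateFilter<>())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> result = withoutRepeats(Stream.of(
                "A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A"));
        System.out.println(result); // [A, B, A, C, A, B, A]
    }
}
```

A new filter instance is needed for every stream, since the instance carries the previous value. Because `previous` starts as null, a null first element would be dropped.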

There is probably a cleaner solution in RxJava.

There are also some solutions here

Beri
0

I think the most concise way is to use the reduce method, as below:

import java.util.ArrayList; 
import java.util.Arrays;
import java.util.List;
import java.util.Stack;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

public class Main {
    public static void main(String[] args) {
        List<String> ss = Arrays.asList("A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A");
        BiFunction<ArrayList<String>, String, ArrayList<String>> acc = new BiFunction<ArrayList<String>, String, ArrayList<String>>() {
            @Override
            public ArrayList<String> apply(ArrayList<String> strings, String s) {
                if (strings.isEmpty() || !strings.get(strings.size() - 1).equals(s)) {
                    strings.add(s);
                }
                return strings;
            }
        };
        BinaryOperator<ArrayList<String>> combiner = new BinaryOperator<ArrayList<String>>() {
            @Override
            public ArrayList<String> apply(ArrayList<String> strings, ArrayList<String> strings2) {
                strings.addAll(strings2);
                return strings;
            }
        };
        ss.stream().reduce(new ArrayList<String>(), acc, combiner).forEach(System.out::println);
    }
}
dursun
  • 2
    The combiner is wrong. You should only combine the lists if the first element of the right list is not equal to the last element of the left list, else you should add the sublist of right that starts at its second element. (You should also check for emptiness of both lists). Besides, why aren't you using lambdas instead of anonymous inner classes? – fps Mar 10 '18 at 20:50
  • Yes, you are right on all counts. I just wanted to show another way, with casual coding. The point about the combiner is that it will never be used in this case; however, to keep the code clean, it should be fixed as you said. – dursun Mar 10 '18 at 21:10
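
The fix described in the first comment could look roughly like the sketch below. It is an assumption-laden rewrite, not the answerer's code: it swaps `reduce` for the three-argument `Stream.collect`, which is designed for mutable containers, uses lambdas, and drops the first element of the right list when it equals the last element of the left list. The names `CollapseWithCollect` and `collapse` and the `parallel` flag are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Stream;

class CollapseWithCollect {
    // Hypothetical helper: collapses runs of equal neighbours, sequentially or in parallel.
    static <T> List<T> collapse(List<T> input, boolean parallel) {
        Stream<T> stream = parallel ? input.parallelStream() : input.stream();
        return stream.collect(
                ArrayList::new,
                // Accumulator: append only if the value differs from the last one kept.
                (list, value) -> {
                    if (list.isEmpty() || !Objects.equals(list.get(list.size() - 1), value)) {
                        list.add(value);
                    }
                },
                // Combiner: merge two partial results, skipping a duplicate at the boundary.
                (left, right) -> {
                    if (right.isEmpty()) {
                        return;
                    }
                    boolean sameAtBoundary = !left.isEmpty()
                            && Objects.equals(left.get(left.size() - 1), right.get(0));
                    left.addAll(sameAtBoundary ? right.subList(1, right.size()) : right);
                });
    }

    public static void main(String[] args) {
        List<String> ss = Arrays.asList(
                "A", "A", "A", "B", "B", "A", "A", "A", "C", "C", "C", "A", "A", "B", "B", "A");
        System.out.println(collapse(ss, false)); // [A, B, A, C, A, B, A]
        System.out.println(collapse(ss, true));  // [A, B, A, C, A, B, A]
    }
}
```

Each partial list already has no equal neighbours, so checking the single boundary element in the combiner is enough.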