1

As I couldn't find anything related to this, I am wondering if streams even allow this.

In my answer to another question, I have following code to add elements to a result list, only if the result list doesn't already contain it:

List<Entry<List<Integer>, Integer>> list = new ArrayList<>(diffMap.entrySet());
list.sort(Entry.comparingByValue());
List<List<Integer>> resultList = new ArrayList<>();
for (Entry<List<Integer>, Integer> entry2 : list) {
    if (!checkResultContainsElement(resultList, entry2.getKey()))
        resultList.add(entry2.getKey());
}

checkResultContainsElement method:

private static boolean checkResultContainsElement(List<List<Integer>> resultList, List<Integer> key) {
    List<Integer> vals = resultList.stream().flatMap(e -> e.stream().map(e2 -> e2))
            .collect(Collectors.toList());
    return key.stream().map(e -> e).anyMatch(e -> vals.contains(e));
}

Now I am wondering, if this for-loop:

for (Entry<List<Integer>, Integer> entry2 : list) {
    if (!checkResultContainsElement(resultList, entry2.getKey()))
        resultList.add(entry2.getKey());
}

can be realized using streams. I don't think that .filter() method would work, as it would remove data from List<Entry<List<Integer>, Integer>> list while I don't even know if an element should be considered. I guess that a custom collector could work, but I also wouldn't know how to implement one, as the result is constantly changing with each newly added element.

I am looking for something like this (can be different if something else is better):

list.stream().sorted(Entry.comparingByValue()).collect(???);

where ??? would filter the data and return it as a list.


The values of one result list may not be contained in another one. So these lists are valid:

[1, 2, 3, 4]
[5, 6, 7, 8]
[12, 12, 12, 12]

but of these, only the first is valid:

[1, 2, 3, 4] <-- valid
[5, 3, 7, 8] <-- invalid: 3 already exists
[12, 12, 2, 12] <-- invalid: 2 already exists
XtremeBaumer
  • 6,275
  • 3
  • 19
  • 65
  • Does [the `distinct` method of `Stream`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/Stream.html#distinct()) give you what you want? – Ole V.V. May 16 '22 at 11:20
  • 1
    @OleV.V. it does not. I don't just want distinct elements. Assuming a `List>` where each nested list can have the same elements in a different order, I need to check if the resulting list would already contain any of the integers, no matter the position. For `[1,3,5]` and `[3, 7, 9]` the `distinct()´ method would return both, while only 1 of both is valid – XtremeBaumer May 16 '22 at 11:30
  • Please provide the question with the structure of the `Foo` class. It's not obvious from the first glance that it contains a field `value` of type `Integer`. – Alexander Ivanchenko May 16 '22 at 12:03
  • @AlexanderIvanchenko converted question from `Foo` to `Integer`, as the type doesn't matter to the question itself – XtremeBaumer May 16 '22 at 12:22
  • That makes a difference in how to establish the uniqueness of lists. Since there's already an answer posted by @AshishPatil, maybe it might make sense to rollback the version of the question? (*the statement of the problem is totally up to you, it's only a proposition*) – Alexander Ivanchenko May 16 '22 at 12:32
  • A result list is considered unique, only if none if its elements are already existing in any previously returned result list – XtremeBaumer May 16 '22 at 12:49

2 Answers2

1

May be something like this:-

 list.stream().
sorted(Entry.comparingByValue()).
collect(ArrayList<List<Foo>>::new,(x,y)->!checkResultContainsElement(x, y.getKey()),(x,y)->x.add(y.getKey()));
Ashish Patil
  • 4,428
  • 1
  • 15
  • 36
  • 1
    Had to make some adjustments for type errors, but ultimately it seems to work as needed: `return list.stream().sorted(Entry.comparingByValue()).collect(ArrayList::new, (x, y) -> { if (!checkResultContainsElement(x, y.getKey())) x.add(y.getKey()); }, (x, y) -> x.addAll(y));` – XtremeBaumer May 16 '22 at 11:44
  • glad to know, it worked for you! – Ashish Patil May 16 '22 at 12:04
1

If we put aside for a moment the details on whether implementation will be stream-based or not, the existing implementation of how uniqueness of the values of incoming lists is being checked can be improved.

We can gain a significant performance improvement by maintaining a Set of previously encountered values.

I.e. values from each list that was added to the resulting list would be stored in a set. And in order to ensure uniqueness of every incoming list, its values would be checked against the set.

Since operations of a stream pipeline should be stateless, as well as collector shouldn't hold a state (i.e. changes should happen only inside its mutable container). We can approach this problem by defining a container that will encompass a resulting list of lists of Foo and a set of foo-values.

I've implemented this container as a Java 16 record:

public record FooContainer(Set<Integer> fooValues, List<List<Foo>> foosList) {
    public void tryAdd(List<Foo> foos) {
        if (!hasValue(foos)) {
            foos.forEach(foo -> fooValues.add(foo.getValue()));
            foosList.add(foos);
        }
    }
    
    public boolean hasValue(List<Foo> foos) {
        return foos.stream().map(Foo::getValue).anyMatch(fooValues::contains);
    }
}

The record shown above would is used as a mutable container of a custom collector created with Colloctors.of(). Collector's accumulator make's use of tryAdd() method defined by the container. And the finisher extracts the resulting list from the container.

Note that this operation is not parallelizable, hence combiner of the collector throws an AssertionError.

public static void main(String[] args) {
    Map<List<Foo>, Integer> diffMap =
        Map.of(List.of(new Foo(1), new Foo(2), new Foo(3)), 1,
               List.of(new Foo(1), new Foo(4), new Foo(5)), 2,
               List.of(new Foo(7), new Foo(8), new Foo(9)), 3);
    
    List<List<Foo>> result = diffMap.entrySet().stream()
        .sorted(Map.Entry.comparingByValue())
        .map(Map.Entry::getKey)
        .collect(Collector.of(
            () -> new FooContainer(new HashSet<>(), new ArrayList<>()),
            FooContainer::tryAdd,
            (left, right) -> {throw new AssertionError("The operation isn't parallelizable");},
            FooContainer::foosList
        ));

    System.out.println(result);
}

Output:

[[Foo{1}, Foo{2}, Foo{3}], [Foo{7}, Foo{8}, Foo{9}]]
Alexander Ivanchenko
  • 25,667
  • 5
  • 22
  • 46