18

Is there a better, simpler approach to this problem?

@Test
public void testReduce() {
    Set<Integer> foo = ImmutableSet.of(1,2,3,4,8,9);
    Set<Integer> bar = ImmutableSet.of(1,3,8,5,11);

    //DO think about solution for 1..n sets, and not only two.
    Set<Integer> intersection = ImmutableList.of(foo,bar)
            .stream()
            .reduce( null, (a, b) -> {
                if ( a == null ) {
                    a = new HashSet<Integer>(b);
                }
                else {
                    a.retainAll(b);
                }
                return a;
            });
    assertThat( intersection, is( ImmutableSet.of( 1,3,8) ) );
}
Lii
  • 11,553
  • 8
  • 64
  • 88
Gábor Lipták
  • 9,646
  • 2
  • 59
  • 113

5 Answers5

19

reduce is the wrong method for this, as you are not allowed to modify the function’s argument this way. This is a mutable reduction, also known as collect:

List<Set<Integer>> listOfSets=…;

if (listOfSets.isEmpty()) {
  return new HashSet<>();
}

Set<Integer> intersection = listOfSets.stream().skip(1)
    .collect(()->new HashSet<>(listOfSets.get(0)), Set::retainAll, Set::retainAll);

Having to peek for the first set is a drawback here, but using null as identity value isn’t clean either (and wouldn’t work with collect as the accumulator can’t return a new set).

Toby Speight
  • 27,591
  • 48
  • 66
  • 103
Holger
  • 285,553
  • 42
  • 434
  • 765
  • Wow. Nice solution. One question: we could skip the first element from the stream. Is it possible? – Gábor Lipták Jul 08 '16 at 12:41
  • Since you're taking the intersection, `skip(1)` is not really necessary but slightly more performant. For a larger stream the difference becomes negligible unless the first set has many more elements than the other sets. – herman May 25 '18 at 08:41
3

The following will work if you use Eclipse Collections:

@Test
public void testReduce()
{
    ImmutableSet<Integer> foo = Sets.immutable.of(1, 2, 3, 4, 8, 9);
    ImmutableSet<Integer> bar = Sets.immutable.of(1, 3, 8, 5, 11);

    // Works with Eclipse Collections 7.0 or above
    ImmutableSet<Integer> intersection1 = Lists.mutable.of(foo, bar)
            .stream()
            .reduce(ImmutableSet::intersect).get();
    Assert.assertEquals(intersection1, Sets.immutable.of(1, 3, 8));

    // Works with Eclipse Collections 8.0.0-M1 or above
    ImmutableSet<Integer> intersection2 = Lists.immutable.of(foo, bar)
            .reduce(ImmutableSet::intersect).get();
    Assert.assertEquals(intersection2, Sets.immutable.of(1, 3, 8));
}

This can also work with MutableSet.

@Test
public void testReduce()
{
    MutableSet<Integer> foo = Sets.mutable.of(1, 2, 3, 4, 8, 9);
    MutableSet<Integer> bar = Sets.mutable.of(1, 3, 8, 5, 11);

    // Works with Eclipse Collections 7.0 or above
    MutableSet<Integer> intersection1 = Lists.mutable.of(foo, bar)
            .stream()
            .reduce(MutableSet::intersect).get();
    Assert.assertEquals(intersection1, Sets.immutable.of(1, 3, 8));

    // Works with Eclipse Collections 8.0.0-M1 or above
    MutableSet<Integer> intersection2 = Lists.immutable.of(foo, bar)
            .reduce(MutableSet::intersect).get();
    Assert.assertEquals(intersection2, Sets.immutable.of(1, 3, 8));
}

In Eclipse Collections, ImmutableSet does not extend java.util.Set, as Set is a mutable interface. MutableSet does extend java.util.Set. This design choice is explained in the answer to this question.

Note: I am a committer for Eclipse Collections.

Community
  • 1
  • 1
Donald Raab
  • 6,458
  • 2
  • 36
  • 44
1

This answer is only applicable for sets of Integers, not generic sets. But for someone looking for speed, sometimes lists of integers are a good case for compressed bitmaps. You should check if your integers group nicely and in some cases you can win couple of orders of magnitude on speed if you do this (using com.googlecode.javaewah32, Apache 2.0 license):

    Set<Integer> foo = ImmutableSet.of(1,2,3,4,8,9);
    Set<Integer> bar = ImmutableSet.of(1,3,8,5,11);

    EWAHCompressedBitmap32 fooBitmap = new EWAHCompressedBitmap32();
    EWAHCompressedBitmap32 barBitmap = new EWAHCompressedBitmap32();

    //fill bitmaps
    foo.stream().forEach(fooBitmap::set);
    bar.stream().forEach(barBitmap::set);

    //fooBitmap.and(barBitmap) returns intersection of sets now. fast!
    ImmutableSet<Integer> intersection = ImmutableSet.<Integer>builder()
                                    .addAll(fooBitmap.and(barBitmap))
                                    .build();

    System.out.println(intersection);

This code is just an example. You might/should use a different approach to convert to resulting set. EWAHCompressedBitmap32 is Iterable<Integer> so, there is no limit to imagination.

Now, the code above just intersects 2 sets. To intersect all sets in the list, you can do the usual reduce:

    Set<Integer> foo = ImmutableSet.of(1,2,3,4,8,9);
    Set<Integer> bar = ImmutableSet.of(1,3,8,5,11);

    List<Set<Integer>> sets = ImmutableList.of(foo,bar);

    EWAHCompressedBitmap32 res = sets.stream().map(l -> {
        EWAHCompressedBitmap32 b = new EWAHCompressedBitmap32();
        l.stream().forEach(b::set);
        return b;
    }).reduce(null, (l, r) -> l == null ? r : l.and(r));

    System.out.println(res);

Yet another alternative is to use reducing collector:

EWAHCompressedBitmap32 res = sets.stream().collect(Collectors.reducing(
      //identity
      null, 
      //mapper set -> compressedBitmap
      l -> { 
          EWAHCompressedBitmap32 b = new EWAHCompressedBitmap32();
          l.stream().forEach(b::set);
          return b;
      },
      //and-reducer 
      (l, r) -> l == null ? r : l.and(r) 
 ));
Alex Pakka
  • 9,466
  • 3
  • 45
  • 69
0

You can filter the elements of foo by checking if they are in the bar or otherBar set:

Set<Integer> set =
    foo.stream().filter(e -> Stream.of(bar, otherBar).allMatch(s -> s.contains(e))).collect(toSet());

For instance if you receive a Collection<Set<T>> as you stated:

public static <T> Set<T> intersection(Collection<Set<T>> input) {
    if(input.isEmpty()) {
        return Collections.emptySet();
    } else {
        Set<T> first = input.iterator().next();
        //if the collection allows to remove element, you can remove first before
        return first.stream().filter(e -> input.stream().allMatch(s -> s.contains(e))).collect(toSet());
    }
}
Alexis C.
  • 91,686
  • 21
  • 171
  • 177
  • In the example I have 2 sets. The solution should work for any number of elements, and should avoid creating a new Set each time the reducer runs. – Gábor Lipták Jul 08 '16 at 12:13
0

There is a shorter way to achieve that using Google Guava and Apache Commons:

Set<Integer> intersection = Stream.of(foo,bar)
                                  .reduce(Sets::intersection)
                                  .orElse(SetUtils.emptySet());