1

Let's say I have two sets:

Set<String> set1 = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e") );
Set<String> set2 = new HashSet<>(Arrays.asList("b", "c", "d", "e", "f") );

What is the easiest and best way, performance wise, to compare the two and get a List of the differences? Meaning I should get a list that contains "a" and "f". The tricky thing here is that the differences can occur in either of the list.

I could only make it work with for loops, but there has to be an easier way...

TheStranger
  • 1,387
  • 1
  • 13
  • 35
  • Well, the easier way is probably to use a library, e.g. Apache Commons Collection's `CollectionUtils.disjunction()`. But doing it yourself shouldn't be that hard or complex anyway, e.g. if performance isn't that important try creating a copy of each set, call `removeAll()` on each and pass the other original set and finally combine the 2, e.g. `copyOfS1.removeAll(set2); copyOfS2.removeAll(set1); copyOfS1.addAll(copyOfS2);`. – Thomas Feb 03 '22 at 07:34
  • Does this answer your question? [Union or intersection of Java Sets](https://stackoverflow.com/questions/51113134/union-or-intersection-of-java-sets) – fantaghirocco Feb 03 '22 at 07:35
  • @fantaghirocco I'm not trying to do a union or an intersection, I'm actually trying to do the opposite of an intersection. An intersection returns all the elements that are contained in two lists, that's not what I'm asking for. – TheStranger Feb 03 '22 at 07:52
  • @Thomas Performance is crucial, that's the problem. – TheStranger Feb 03 '22 at 07:56
  • Well, you should state so from the get-go then. You just asked for "easier". – Thomas Feb 03 '22 at 07:57
  • @Thomas Okay, my bad. I've edited. – TheStranger Feb 03 '22 at 07:59

5 Answers5

1

Using streams it can be done with Collectors.groupingBy and Collectors.counting() to determine the count of occurrences and then filter any string that appears more than once:

List<String> differences = Stream.concat(set1.stream(), set2.stream())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))  // Map<String, Long> -> key -> element, value -> count of occurrences
                .entrySet().stream()
                .filter(e -> e.getValue() == 1) // filter not unique elements from 2 sets
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
Georgii Lvov
  • 1,771
  • 1
  • 3
  • 17
0

something like this?

    public static void main(String[] args) {
        
        Set<String> set1 = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e") );
        Set<String> set2 = new HashSet<>(Arrays.asList("b", "c", "d", "e", "f") );
        //get intersection array
        Set<String> listSame =  new HashSet<>(set2);
        listSame.retainAll(set1);
        System.out.println(listSame);
        //get union array
        Set<String> listDiff =  new HashSet<>(set1);
        listDiff.addAll(set2);
        // get difference
        listDiff.removeAll(listSame);
        System.out.println(listDiff);
    }

The best appreach is probably ro use unions and intersections, then getting the difference.

The Blind Hawk
  • 1,199
  • 8
  • 24
  • 1
    You do realize that due to `Set listSame = set2;` the operation `listSame.retainAll(set1);` will change `set2` as well, don't you? Hence ` listDiff.addAll(set2);` and `listDiff.removeAll(listSame);` don't make much sense - you first add all the elements in `set2` and then remove them (which would also remove those that were already present in `set1`. – Thomas Feb 03 '22 at 08:11
  • my bad, i only checked the output . I will adjust it now. – The Blind Hawk Feb 03 '22 at 08:44
0

One approach to do this is to create a set of one collection, iterate over the other and check if the element is contained while removing it:

Collection<String> c1 = Arrays.asList("a", "b", "c", "d", "e");
Collection<String> c2 = Arrays.asList("b", "c", "d", "e", "f");

Set<String> set = new HashSet<>(c1); //one loop over c1 to create the set - O(n)
List<String> differences = new LinkedList<>();
    
//one loop over c2 - O(m)
for( String e : c2 ) {
   boolean removed = set.remove(e); //should be O(1) due to hashset
   //if the element wasn't removed it was not in the set and thus not in c1
   if( !removed ) { 
       differences.add(e); //should be O(1) due to linked list
   }
}
    
//at this point set only contains elements not in c2 as those have been removed by the loop
differences.addAll(set); //at most O(n) if nothing got removed

As you can see, we have 2 O(n) and 1 O(m) operation so the total time complexity is O(n + m).

Thomas
  • 87,414
  • 12
  • 119
  • 157
0

I respond late but think the code is clearer to understand.

Here you can see how to both do union and intersection with Java Sets: https://stackoverflow.com/a/51113135/4222206

Coming back to your question for a difference in both sets:

  • create the union and the intersection
  • subtract the intersection from the union

In code that means:

  Set union = new Set(set1);
  union.addAll(set2);
  Set intersection = new Set(set1);
  intersection.retainAll(set2);
  
  Set result = new Set(union);
  result.removeAll(intersection);

Now in result you have everything that the two sets did not have in common and it should contain [a, f].

Queeg
  • 7,748
  • 1
  • 16
  • 42
-4

enter image description here

Please follow the screenshot i have sent I think this is the easiest way using java 8 streams

  • 1
    Don't post images of code. Although this refers to questions it also applies to answers: https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question – Thomas Feb 03 '22 at 08:00
  • Team I will take care – Upendra Joshi Feb 03 '22 at 08:05