How can I get a list containing the differences between two sets?

Question

Let's say I have two sets:

Set<String> set1 = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e") );
Set<String> set2 = new HashSet<>(Arrays.asList("b", "c", "d", "e", "f") );

What is the easiest and best way, performance wise, to compare the two and get a List of the differences? Meaning I should get a list that contains "a" and "f". The tricky thing here is that the differences can occur in either of the list.

I could only make it work with for loops, but there has to be an easier way...

Well, the easier way is probably to use a library, e.g. Apache Commons Collection's `CollectionUtils.disjunction()`. But doing it yourself shouldn't be that hard or complex anyway, e.g. if performance isn't that important try creating a copy of each set, call `removeAll()` on each and pass the other original set and finally combine the 2, e.g. `copyOfS1.removeAll(set2); copyOfS2.removeAll(set1); copyOfS1.addAll(copyOfS2);`. — Thomas, Feb 03 '22 at 07:34
Does this answer your question? [Union or intersection of Java Sets](https://stackoverflow.com/questions/51113134/union-or-intersection-of-java-sets) — fantaghirocco, Feb 03 '22 at 07:35
@fantaghirocco I'm not trying to do a union or an intersection, I'm actually trying to do the opposite of an intersection. An intersection returns all the elements that are contained in two lists, that's not what I'm asking for. — TheStranger, Feb 03 '22 at 07:52
Well, you should state so from the get-go then. You just asked for "easier". — Thomas, Feb 03 '22 at 07:57

score 1 · Accepted Answer · answered Feb 03 '22 at 08:17

1

Using streams it can be done with Collectors.groupingBy and Collectors.counting() to determine the count of occurrences and then filter any string that appears more than once:

List<String> differences = Stream.concat(set1.stream(), set2.stream())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))  // Map<String, Long> -> key -> element, value -> count of occurrences
                .entrySet().stream()
                .filter(e -> e.getValue() == 1) // filter not unique elements from 2 sets
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());

answered Feb 03 '22 at 08:17

Georgii Lvov

1,771
1
3
17

Okay that makes sense. will this also work if I make it a `Set` instead of a `List`, since the Set can never contain duplicates? – TheStranger Feb 03 '22 at 08:30
1

@Ben yes, it will work, use then `Collectors.toSet()` – Georgii Lvov Feb 03 '22 at 08:35

The Blind Hawk · Answer 2 · 2022-02-03T08:44:49.923

0

something like this?

    public static void main(String[] args) {
        
        Set<String> set1 = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e") );
        Set<String> set2 = new HashSet<>(Arrays.asList("b", "c", "d", "e", "f") );
        //get intersection array
        Set<String> listSame =  new HashSet<>(set2);
        listSame.retainAll(set1);
        System.out.println(listSame);
        //get union array
        Set<String> listDiff =  new HashSet<>(set1);
        listDiff.addAll(set2);
        // get difference
        listDiff.removeAll(listSame);
        System.out.println(listDiff);
    }

The best appreach is probably ro use unions and intersections, then getting the difference.

edited Feb 03 '22 at 08:44

answered Feb 03 '22 at 08:02

The Blind Hawk

1,199
8
24

1

You do realize that due to `Set listSame = set2;` the operation `listSame.retainAll(set1);` will change `set2` as well, don't you? Hence ` listDiff.addAll(set2);` and `listDiff.removeAll(listSame);` don't make much sense - you first add all the elements in `set2` and then remove them (which would also remove those that were already present in `set1`. – Thomas Feb 03 '22 at 08:11
my bad, i only checked the output . I will adjust it now. – The Blind Hawk Feb 03 '22 at 08:44

Thomas · Answer 3 · 2022-02-03T08:15:19.517

One approach to do this is to create a set of one collection, iterate over the other and check if the element is contained while removing it:

Collection<String> c1 = Arrays.asList("a", "b", "c", "d", "e");
Collection<String> c2 = Arrays.asList("b", "c", "d", "e", "f");

Set<String> set = new HashSet<>(c1); //one loop over c1 to create the set - O(n)
List<String> differences = new LinkedList<>();
    
//one loop over c2 - O(m)
for( String e : c2 ) {
   boolean removed = set.remove(e); //should be O(1) due to hashset
   //if the element wasn't removed it was not in the set and thus not in c1
   if( !removed ) { 
       differences.add(e); //should be O(1) due to linked list
   }
}
    
//at this point set only contains elements not in c2 as those have been removed by the loop
differences.addAll(set); //at most O(n) if nothing got removed

As you can see, we have 2 O(n) and 1 O(m) operation so the total time complexity is O(n + m).

Queeg · Answer 4 · 2022-02-03T10:54:36.323

I respond late but think the code is clearer to understand.

Here you can see how to both do union and intersection with Java Sets: https://stackoverflow.com/a/51113135/4222206

Coming back to your question for a difference in both sets:

create the union and the intersection
subtract the intersection from the union

In code that means:

  Set union = new Set(set1);
  union.addAll(set2);
  Set intersection = new Set(set1);
  intersection.retainAll(set2);
  
  Set result = new Set(union);
  result.removeAll(intersection);

Now in result you have everything that the two sets did not have in common and it should contain [a, f].

score -4 · Answer 5 · answered Feb 03 '22 at 07:55

-4

enter image description here

Please follow the screenshot i have sent I think this is the easiest way using java 8 streams

answered Feb 03 '22 at 07:55

Upendra Joshi

19
4

1

Don't post images of code. Although this refers to questions it also applies to answers: https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question – Thomas Feb 03 '22 at 08:00
Team I will take care – Upendra Joshi Feb 03 '22 at 08:05

How can I get a list containing the differences between two sets?

5 Answers5