81

Question is simple:

I have two List

List<String> columnsOld = DBUtils.GetColumns(db, TableName);
List<String> columnsNew = DBUtils.GetColumns(db, TableName);

And I need to get the intersection of these. Is there a quick way to achieve this?

Pentium10
  • 204,586
  • 122
  • 423
  • 502

9 Answers9

127

You can use retainAll method:

columnsOld.retainAll (columnsNew);
Paolo Fulgoni
  • 5,208
  • 3
  • 39
  • 55
Roman
  • 64,384
  • 92
  • 238
  • 332
  • 14
    Note: for this to work with other objects than `String`, you need of course to implement `equals` and `hashCode`. – Benoit Duffez Aug 16 '15 at 08:11
  • 1
    The code is simple but the algorithmic complexity is poor: O(n×m), versus O(n+m) for the [set version](https://stackoverflow.com/a/15684435/68587). With two million-item lists it's the difference between *trillions* of operations and *millions* of operations. – John Kugelman May 08 '20 at 11:34
  • if you use `retainAll` on Lists, it runs O(n^2) – razor Feb 12 '23 at 10:19
27

Using Google's Guava library:

Sets.intersection(Sets.newHashSet(setA), Sets.newHashSet(setB))

Note: This is much more efficient than naively doing the intersection with two lists: it's O(n+m), versus O(n×m) for the list version. With two million-item lists it's the difference between millions of operations and trillions of operations.

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
Sergii Shevchyk
  • 38,716
  • 12
  • 50
  • 61
20

Since retainAll won't touch the argument collection, this would be faster:

List<String> columnsOld = DBUtils.GetColumns(db, TableName); 
List<String> columnsNew = DBUtils.GetColumns(db, TableName); 

for(int i = columnsNew.size() - 1; i > -1; --i){
    String str = columnsNew.get(i);
    if(!columnsOld.remove(str))
        columnsNew.remove(str);
}

The intersection will be the values left in columnsNew. Removing already compared values fom columnsOld will reduce the number of comparisons needed.

bjornhol
  • 550
  • 2
  • 7
  • But your code definitly should be extracted to a new separate method because it's absolutely unclear from this code what does it do. And I also wouln't have refused an additional unit test for this code. – Roman Mar 08 '10 at 12:43
  • Agree, good method separation, naming and unit tests is always rule number one. – bjornhol Mar 08 '10 at 13:02
  • Shouldn't this method add the elements that can not be found in the columnsOld to the columnsNew? It looks like those elements will be missing in the result. – Calon Feb 18 '14 at 09:29
  • The optimization of removing columns from columnsOld might actually make no difference (the remove has itself a cost) or even be slower in cases like ArrayList where a remove shifts the elements. – Bogdan Calmac Feb 18 '14 at 21:17
8

How about

private List<String> intersect(List<String> A, List<String> B) {
    List<String> rtnList = new LinkedList<>();
    for(String dto : A) {
        if(B.contains(dto)) {
            rtnList.add(dto);
        }
    }
    return rtnList;
}
Gigas
  • 97
  • 1
  • 1
  • 7
    If B contains elements which are not contained in A, there is no need to iterate over those elements because we are trying to find all elements in both A and B. – juan2raid Feb 16 '15 at 17:10
  • This is O(n^2) ! you should use `contains` on a Set – razor Feb 12 '23 at 10:20
4

using retainAll if don't care occurrences, otherwise using N.intersection

a = N.asList(12, 16, 16, 17, 19);
b = N.asList(16, 19, 107);
a.retainAll(b); // [16, 16, 19]
N.println(a);

a = N.asList(12, 16, 16, 17, 19);
b = N.asList(16, 19, 107);
a = N.intersect(a, b);
N.println(a); // [16, 19]

N is an utility class in abacus-common

user_3380739
  • 1
  • 14
  • 14
3

There is a nice way with streams which can do this in one line of code and you can two lists which are not from the same type which is not possible with the containsAll method afaik:

columnsOld.stream().filter(c -> columnsNew.contains(c)).collect(Collectors.toList());

An example for lists with different types. If you have a realtion between foo and bar and you can get a bar-object from foo than you can modify your stream:

List<foo> fooList = new ArrayList<>(Arrays.asList(new foo(), new foo()));
List<bar> barList = new ArrayList<>(Arrays.asList(new bar(), new bar()));

fooList.stream().filter(f -> barList.contains(f.getBar()).collect(Collectors.toList());
Deutro
  • 3,113
  • 4
  • 18
  • 26
3

If you put the second list in a set say HashSet. And just iterate over the first list checking for presence on the set and removing if not present, your first list will eventually have the intersection you need. It will be way faster than retainAll or contains on a list. The emphasis here is to use a set instead of list. Lookups are O(1). firstList.retainAll (new HashSet (secondList)) will also work.

Ravi Sanwal
  • 584
  • 5
  • 14
1

use org.apache.commons.collections4.ListUtils#intersection

Dheeraj Sachan
  • 3,965
  • 2
  • 17
  • 18
0

With Java 8 Stream API (and Java 9 List.of()) you can do following:

List<Integer> list1 = List.of(1, 1, 2, 2);
List<Integer> list2 = List.of(2, 2, 3, 3);

List<Integer> intersection = list1.stream()
    .filter(list2::contains)
    .distinct()
    .collect(Collectors.toList()); 
Mišo Stankay
  • 339
  • 1
  • 8