How do I capture only the duplicate elements in an ArrayList?

Question

I need to get all duplicates in my ArrayList. I don't need to remove duplicates, I need to add them to another ArrayList. Here is an example:

ArrayList<String> var = new ArrayList<>();
var.add("a");
var.add("b");
var.add("b");
var.add("c");

So, as you can see, there are 2 duplicate elements (b, and b). I need to add them to another ArrayList.

The resulting ArrayList in this case should be [b,b]. How can I do this?

To me, this question seems slightly different. The way I read it, the OP would like the example case to return `[b, b]`, whereas that question wants just `[b]`. — Charlie Armstrong, Dec 27 '20 at 21:16

dreamcrash · Answer 1 · 2021-02-21T14:39:51.833

Approach 1:

Something like this is enough:

    List<String> duplicates = new ArrayList<>();
    for (String s : var) {
        if (Collections.frequency(var, s) > 1) {
            duplicates.add(s);
        }
    }

and with streams:

var.stream().filter(s -> Collections.frequency(var, s) > 1).collect(Collectors.toList());

a running example:

// Needs: import java.util.*; import java.util.stream.Collectors;
public static void main(String[] args) {
    List<String> var = Arrays.asList("a", "b", "b", "c");
    List<String> dup = var.stream()
            .filter(s -> Collections.frequency(var, s) > 1)
            .collect(Collectors.toList());
    System.out.println(dup);
}

Output:

[b, b]

The idea is as follows: go through the list and, for each element, check its frequency in the list; if it appears more than once, add it to the list of duplicates.

Approach 2:

A less clean solution, but one with better time complexity (O(N) instead of O(N²), because Collections.frequency scans the whole list for every element), is to build a Map from each string to its frequency, and then build the list of duplicates based on that Map:

List<String> dup = new ArrayList<>();
Map<String, Long> frequencies = var.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

for (Map.Entry<String, Long> entry : frequencies.entrySet()) {
    if (entry.getValue() > 1) {
        for (long i = 0; i < entry.getValue(); i++) {
            dup.add(entry.getKey());
        }
    }
}

Approach 3:

With linear time complexity O(N):

    Set<String> set = new HashSet<>();
    List<String> duplicates = new ArrayList<>();
    Set<String> is_duplicated = new HashSet<>();
    var.forEach(s -> {
        if (set.contains(s)) {
            // Second or later occurrence of s.
            is_duplicated.add(s);
            duplicates.add(s);
        } else {
            set.add(s);
        }
    });
    // Add each duplicated value once more to account for its first occurrence.
    duplicates.addAll(is_duplicated);
    System.out.println(duplicates);

One can take advantage of the semantics of the Set.add method, namely:

If this set already contains the element, the call leaves the set unchanged and returns false.

This shortens the above code to:

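    Set<String> set = new HashSet<>();
    List<String> duplicates = new ArrayList<>();
    Set<String> to_add = new HashSet<>();
    var.forEach(s -> {
        // add returns false when s has already been seen.
        if (!set.add(s)) {
            to_add.add(s);
            duplicates.add(s);
        }
    });
    // Add each duplicated value once more to account for its first occurrence.
    duplicates.addAll(to_add);
    System.out.println(duplicates);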

Majed Badawi · Accepted Answer · 2020-12-27T21:27:07.637


You can use a HashMap to store the number of occurrences of each String while iterating over the list.

Then, iterate over the original list a second time and add every element that occurs more than once to the new List, which also preserves the original order:

List<String> var = new ArrayList<>();
var.add("a");
var.add("b");
var.add("b");
var.add("c");
        
Map<String, Integer> map = new HashMap<>();
for(String str : var) {
    if(map.containsKey(str))
        map.put(str, map.get(str)+1);
    else
        map.put(str, 1);
}
        
List<String> duplicates = new ArrayList<>();
for (String str : var) {
    int count = map.get(str);
    if(count > 1) {
        duplicates.add(str);
    }
}
        
System.out.println(duplicates);

Output:

[b, b]
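
As a possible variation on the counting step above, Map.merge can replace the containsKey/put branches. This is only a sketch: the class name CountingDuplicates and the method name findDuplicates are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CountingDuplicates {

    // Hypothetical helper: returns all elements that occur more than once,
    // in the order in which they appear in the input.
    static List<String> findDuplicates(List<String> input) {
        // First pass: count occurrences. merge stores 1 for a new key,
        // otherwise it adds 1 to the existing count.
        Map<String, Integer> counts = new HashMap<>();
        for (String s : input) {
            counts.merge(s, 1, Integer::sum);
        }

        // Second pass over the list (not the map) keeps the original order.
        List<String> duplicates = new ArrayList<>();
        for (String s : input) {
            if (counts.get(s) > 1) {
                duplicates.add(s);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "b", "c");
        System.out.println(findDuplicates(input)); // prints [b, b]
    }
}
```

Both passes are linear in the size of the list, so the overall cost stays O(N).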

edited Dec 27 '20 at 21:27

answered Dec 27 '20 at 21:12


Uh, i need to return ["b", "b"] – FruthyzGang Dec 27 '20 at 21:19
One problem I see here is that maps do not maintain order of elements, probably not an issue for OP.. – Bahij.Mik Dec 27 '20 at 21:25

@Bahij.Mik good note, I modified it to maintain order – Majed Badawi Dec 27 '20 at 21:28

score 1 · Answer 3 · answered Dec 27 '20 at 21:11


You could loop through the ArrayList and compare the indexOf and lastIndexOf values. If these are the same, then there is only one instance of the Object in the list. Otherwise, it is a duplicate.
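
A minimal sketch of this idea, assuming a list of strings; the class name IndexOfDuplicates and the method name findDuplicates are hypothetical and only used for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndexOfDuplicates {

    // Hypothetical helper: returns every element that occurs more than once,
    // in the order in which it appears in the input.
    static List<String> findDuplicates(List<String> list) {
        List<String> duplicates = new ArrayList<>();
        for (String s : list) {
            // Different first and last positions mean s occurs at least twice.
            if (list.indexOf(s) != list.lastIndexOf(s)) {
                duplicates.add(s);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "b", "c");
        System.out.println(findDuplicates(input)); // prints [b, b]
    }
}
```

Since indexOf and lastIndexOf each scan the list, this runs in quadratic time, which is usually fine for small lists.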


Adrian Russo


score 1 · Answer 4 · answered Dec 27 '20 at 21:36

Here's my solution to this problem:

ArrayList<String> duplicates = var.stream()
        .filter(e -> var.stream()
                .filter(e::equals)
                .count() > 1)
        // toCollection guarantees an ArrayList; Collectors.toList() does not, so a cast could fail.
        .collect(Collectors.toCollection(ArrayList::new));

This creates a stream from the original ArrayList (var) and filters it. For each element of the outer stream, the filter predicate creates a second stream over the same list and keeps only the elements equal to the current one. In your example, when the outer stream reaches the first "b", the inner stream contains only the "b"s. The predicate then counts the elements of the inner stream and checks that there is more than one; in other words, it verifies that the current element is a duplicate. Elements that fail the predicate are dropped from the outer stream, so when the stream is collected back into an ArrayList, only the duplicates remain.

fps · Answer 5 · 2020-12-27T21:59:53.960

You could create a map that groups all equal elements, then remove the entries whose group has size 1, and finally flatten the remaining lists into a single list. Also, I wouldn't use var as an identifier, as it is a reserved type name in newer versions of Java (Java 10 and later).

In code:

Map<String, List<String>> map = yourInitialList.stream()
    .collect(Collectors.groupingBy(Function.identity()));

map.values().removeIf(list -> list.size() == 1);

List<String> result = map.values().stream()
    .flatMap(list -> list.stream())
    .collect(Collectors.toList());

If you also need to preserve insertion order, you could adjust the creation of the map with an overload of Collectors.groupingBy:

Map<String, List<String>> map = yourInitialList.stream()
    .collect(Collectors.groupingBy(
             Function.identity(),
             LinkedHashMap::new,
             Collectors.toList()));
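
Putting the steps together, a self-contained sketch might look like the following. The class name GroupingDuplicates and the method name findDuplicates are made up for illustration, and the sample input is an assumption chosen to show the resulting order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupingDuplicates {

    // Hypothetical helper combining the three steps described above.
    static List<String> findDuplicates(List<String> input) {
        // Group equal elements; LinkedHashMap keeps keys in first-seen order.
        Map<String, List<String>> groups = input.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.toList()));

        // Drop groups holding a single element, i.e. the non-duplicates.
        groups.values().removeIf(group -> group.size() == 1);

        // Flatten the remaining groups into one list.
        return groups.values().stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("c", "b", "a", "b", "c", "d");
        System.out.println(findDuplicates(input)); // prints [c, c, b, b]
    }
}
```

Note that equal elements end up next to each other: the result is ordered by the first appearance of each value, not by the position of every individual occurrence.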

score 0 · Answer 6 · answered Dec 27 '20 at 22:03


The right answer to this question is to use a Set from the Collections framework, implementing proper equals and hashCode methods for the element type. A Set doesn't allow duplicates.

https://docs.oracle.com/javase/tutorial/collections/interfaces/set.html


Perimosh

