4

I need to get all duplicates in my ArrayList. I don't need to remove duplicates, I need to add them to another ArrayList. Here is an example:

ArrayList<String> var = new ArrayList<>();
var.add("a");
var.add("b");
var.add("b");
var.add("c");

So, as you can see, there are 2 duplicate elements (b, and b). I need to add them to another ArrayList.

The resulting ArrayList in this case should be [b,b]. How can I do this?

dreamcrash
FruthyzGang

6 Answers

7

Approach 1:

Something like this is enough:

    List<String> duplicates = new ArrayList<>();
    for (String s : var)
        if (Collections.frequency(var, s) > 1)
            duplicates.add(s);

and with streams:

var.stream().filter(s -> Collections.frequency(var, s) > 1).collect(Collectors.toList());

a running example:

public static void main(String[] args) {

    List<String> var = Arrays.asList("a", "b", "b", "c");
    List<String> dup = var.stream().filter(s -> Collections.frequency(var, s) > 1).collect(Collectors.toList());         
    System.out.println(dup);
}

Output:

[b, b]

The idea is as follows: go through the list and, for each element, check its frequency in the list; if it appears more than once, add it to the list of duplicates. Because Collections.frequency scans the whole list on every call, this takes quadratic time.

Approach 2:

A less clean solution, but one with better time complexity, is to build a Map from each string to its frequency, and then build the duplicate list based on that Map:

List<String> dup = new ArrayList<>();
Map<String, Long> frequencies =
        var.stream()
           .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

for (Map.Entry<String, Long> entry : frequencies.entrySet()) {
    // Only strings that occur more than once are duplicates
    if (entry.getValue() > 1) {
        for (int i = 0; i < entry.getValue(); i++)
            dup.add(entry.getKey());
    }
}

Approach 3:

With linear time complexity O(N):

    Set<String> set = new HashSet <>();
    List<String> duplicates = new ArrayList<>();
    Set<String> is_duplicated = new HashSet <>();
    var.forEach(s -> {
        if(set.contains(s)) {
            is_duplicated.add(s);
            duplicates.add(s);
        }
        else
          set.add(s);
    });
    duplicates.addAll(is_duplicated);
    System.out.println(duplicates);

One can take advantage of the Set.add method semantics, namely:

If this set already contains the element, the call leaves the set
     unchanged and returns false.

To shorten the above code to only:

    Set<String> set = new HashSet <>();
    List<String> duplicates = new ArrayList<>();
    Set<String> to_add = new HashSet<>();
    var.forEach(s -> {
        if(!set.add(s)) {
            to_add.add(s);
            duplicates.add(s);
        }
    });
    duplicates.addAll(to_add);
    System.out.println(duplicates);  
dreamcrash
4

You can use a HashMap to store the number of occurrences of the String at each iteration.

Then go through the list again and add every element that occurs more than once to the new List:

List<String> var = new ArrayList<>();
var.add("a");
var.add("b");
var.add("b");
var.add("c");
        
Map<String, Integer> map = new HashMap<>();
for(String str : var) {
    if(map.containsKey(str))
        map.put(str, map.get(str)+1);
    else
        map.put(str, 1);
}
        
List<String> duplicates = new ArrayList<>();
for (String str : var) {
    int count = map.get(str);
    if(count > 1) {
        duplicates.add(str);
    }
}
        
System.out.println(duplicates);

Output:

[b, b]
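
As a possible variation, the counting loop could be written with Map.merge, which covers both the containsKey branch and the else branch in one call. This is only a sketch: the class name MergeCountDuplicates and the helper findDuplicates are made up for illustration and are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeCountDuplicates {

    // Hypothetical helper: same two-pass idea, counting with Map.merge.
    static List<String> findDuplicates(List<String> input) {
        Map<String, Integer> counts = new HashMap<>();
        for (String s : input) {
            // Stores 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(s, 1, Integer::sum);
        }

        List<String> duplicates = new ArrayList<>();
        for (String s : input) {
            // Iterating the original list keeps the original order.
            if (counts.get(s) > 1) {
                duplicates.add(s);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "b", "c");
        System.out.println(findDuplicates(values)); // prints [b, b]
    }
}
```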
Majed Badawi
1

You could loop through the ArrayList and compare the indexOf and lastIndexOf values. If these are the same, then there is only one instance of the Object in the list. Otherwise, it is a duplicate.
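
A minimal sketch of that idea, assuming a list of strings like the one in the question. The class name IndexDuplicates and the helper findDuplicates are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexDuplicates {

    // Hypothetical helper: collects every element that occurs more than once.
    static <T> List<T> findDuplicates(List<T> list) {
        List<T> duplicates = new ArrayList<>();
        for (T element : list) {
            // A different first and last index means at least two occurrences.
            if (list.indexOf(element) != list.lastIndexOf(element)) {
                duplicates.add(element);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "b", "c");
        System.out.println(findDuplicates(values)); // prints [b, b]
    }
}
```

Both indexOf and lastIndexOf scan the list, so this takes quadratic time overall. That is fine for short lists, but a map-based count is likely a better fit for large ones.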

Adrian Russo
1

Here's my solution to this problem:

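// Collectors.toList() does not guarantee an ArrayList, so request one explicitly
// instead of casting the result.
ArrayList<String> duplicates = var.stream()
        .filter((e) ->
                var.stream()
                        .filter(e::equals)
                        .count() > 1
        ).collect(Collectors.toCollection(ArrayList::new));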

This creates a stream with the original ArrayList (var) as the source and filters it. For each element of the outer stream, the filter predicate creates a second stream over the same list, keeping only the elements equal to the current outer element. In your example, when the outer stream is on the first "b", the inner stream looks through the ArrayList and keeps only the "b"s. It then counts the remaining elements and checks that there is more than one, which means the current element is a duplicate. If the predicate fails, the element is dropped from the outer stream, so when you collect the stream into a list, only duplicate elements are left.

Charlie Armstrong
1

You could create a map that groups all equal elements, then remove the entries whose group has size 1, and finally flatten the remaining groups into a single list. Also, I wouldn't use var as an identifier, as it has been a reserved type name since Java 10.

In code:

Map<String, List<String>> map = yourInitialList.stream()
    .collect(Collectors.groupingBy(Function.identity()));

map.values().removeIf(list -> list.size() == 1);

List<String> result = map.values().stream()
    .flatMap(list -> list.stream())
    .collect(Collectors.toList());

If you also need to preserve insertion order, you could adjust the creation of the map with an overload of Collectors.groupingBy:

Map<String, List<String>> map = yourInitialList.stream()
    .collect(Collectors.groupingBy(
             Function.identity(),
             LinkedHashMap::new,
             Collectors.toList()));
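
Put together, the steps above might look like the following sketch. The class name GroupingDuplicates, the helper findDuplicates, and the sample input are assumptions made for the example. With a LinkedHashMap, the groups come out in the order in which each value first appears.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupingDuplicates {

    // Hypothetical helper: group equal elements, drop singletons, flatten.
    static List<String> findDuplicates(List<String> input) {
        Map<String, List<String>> groups = input.stream()
            .collect(Collectors.groupingBy(
                     Function.identity(),
                     LinkedHashMap::new,   // keeps first-appearance order
                     Collectors.toList()));

        // Removing from the values view also removes the map entries.
        groups.values().removeIf(list -> list.size() == 1);

        return groups.values().stream()
            .flatMap(List::stream)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(findDuplicates(values)); // prints [b, b, a, a]
    }
}
```

Note that the result is grouped by value, so it does not keep the original interleaving of the input.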
fps
0

The right answer to this question is to use a Set from the Collections Framework, with proper equals and hashCode methods implemented on the element type. A Set doesn't allow duplicates.

https://docs.oracle.com/javase/tutorial/collections/interfaces/set.html

Perimosh