153

I have a List of type Integer, e.g.:

[1, 1, 2, 3, 3, 3]

I would like a method to return all the duplicates, e.g.:

[1, 3]

What is the best way to do this?

Ashkan Aryan
freshest

33 Answers

213

The add method of Set returns a boolean indicating whether the value already existed (true if it did not exist and was added, false if it already existed; see the Set documentation).

So just iterate through all the values:

public Set<Integer> findDuplicates(List<Integer> listContainingDuplicates) { 
    final Set<Integer> setToReturn = new HashSet<>(); 
    final Set<Integer> set1 = new HashSet<>();
         
    for (Integer yourInt : listContainingDuplicates) {
        if (!set1.add(yourInt)) {
            setToReturn.add(yourInt);
        }
    }
    return setToReturn;
}
catch23
leifg
  • Why do you have setToReturn? Can you not just use set1.add(yourInt) and return set1? – Phil Sep 14 '11 at 13:59
  • No you can't because set1 contains all values (including the ones that are not duplicates) – leifg Sep 14 '11 at 14:04
  • The documentation says add() "Adds the specified element to this set if it is not already present," which makes it sound like when add returns false, nothing was actually added. – Phil Sep 14 '11 at 14:11
  • yes exactly. But when an element is only present once in the specified list, the element is added as well. Look at the example in the question: my solution will return [1, 3], as the number 2 is inserted into set1 but not into setToReturn. Your solution would return [1, 2, 3] (which is not the requirement) – leifg Sep 14 '11 at 14:19
  • I suggest that you use `for (Integer yourInt`, to avoid unnecessary boxing and unboxing, especially since your input already contains `Integer`s. – Hosam Aly Sep 21 '11 at 08:00
  • @leifg I think it actually returns `[1,3,3]` because there is no check like `if (!set1.add(yourInt) && !setToReturn.contains(yourInt))`. – dargmuesli Jun 07 '16 at 18:10
  • @JonasThelemann the whole idea of a set is that it can NOT contain duplicates; therefore, no matter how often you add 3, it will always end up in the set only once. – leifg Jun 08 '16 at 11:58
  • If you have so much input data that you want to use this optimal solution (instead of sorting the input) then you'll also want to pre-allocate the size of the HashSet() objects. Without the preallocation, you don't get O(1) insert time when inserting into the HashSet(). This is because the internal hash array gets repeatedly resized. The inserts then average to something like O(log N) time. This means processing all N items becomes O(N log N) when it could have been O(N). – johnstosh Aug 16 '17 at 19:24
  • @johnstosh resizing is not as expensive as it was in older versions. Also, this can happen at most 26 times, but isn’t applied to all elements but only those that are already in the set, so after adding 2³⁰ elements, only less than 0,0000015% of all elements actually witnessed 26 resize operations. Like with appending to an `ArrayList`, the amortized cost of adding N elements still is `O(N)`. Of course, specifying in initial capacity still makes the operation faster, even if it’s not changing the time complexity. – Holger Jan 25 '18 at 11:11
  • @Holger You have an interesting way of doing math. The last resizing operation will affect half the elements. So that's a bit more than what one might think. But also totally unnecessary work. So if you have 100 or 1000 elements, then probably not a big deal, but for 1M elements probably something that should be on your radar, right? – johnstosh Jan 25 '18 at 23:03
  • @johnstosh it’s not my way of doing the math; it’s even within the documentation of `ArrayList`. I already agreed that this is unnecessary work and providing an initial capacity is more efficient. Still, the time complexity doesn’t change, as constant factors are not considered in the Big O notation. That notation doesn’t tell you whether the operation is slow or fast, only how it *scales*. – Holger Jan 26 '18 at 07:32
  • By the way, in case of `HashSet` you also have to consider the load factor, e.g. when you specify an initial capacity of `100`, because you want to add that number of elements, it gets rounded to the next power of 2 (`128`), which implies that with the default load factor of `0.75f`, the resize threshold will be `96`, so there will be a resize before you have added `100` elements. Thankfully, resizing is not that expensive anymore. With up to date JREs, resizing is not rehashing anymore, the elements just get distributed between their two possible result locations based on the relevant bit. – Holger Jan 26 '18 at 07:37
63

I needed a solution to this as well. I used leifg's solution and made it generic.

private <T> Set<T> findDuplicates(Collection<T> collection) {

    Set<T> duplicates = new LinkedHashSet<>();
    Set<T> uniques = new HashSet<>();

    for(T t : collection) {
        if(!uniques.add(t)) {
            duplicates.add(t);
        }
    }

    return duplicates;
}
Zarremgregarrok
John Strickler
62

I took John Strickler's solution and remade it to use the streams API introduced in JDK8:

private <T> Set<T> findDuplicates(Collection<T> collection) {
    Set<T> uniques = new HashSet<>();
    return collection.stream()
        .filter(e -> !uniques.add(e))
        .collect(Collectors.toSet());
}
Jens Nyman
Sebastian
  • this is kind of hard to read, no? You're doing a side-effect in stream operation, making it kind of hard to reason about. But that's just me thinking functional style. It is concise and probably the shortest way though ;). – froginvasion May 04 '17 at 09:10
  • @froginvasion the built in ```distinct()``` method is also stateful. Can't think of an efficient (O(n)) distinct operation which isn't stateful. – wilmol Sep 29 '19 at 21:55
45

Java 8 based solution:

List<Integer> duplicates = list.stream()
    .collect(Collectors.groupingBy(Function.identity()))
    .entrySet()
    .stream()
    .filter(e -> e.getValue().size() > 1)
    .map(Map.Entry::getKey)
    .collect(Collectors.toList());
Nathan Hughes
  • the input list is transformed into a map (grouped by equal values), then entries of the map whose value list has only one element are dropped, then the remaining keys are mapped out, and the result is collected into a List – بلال المصمودي Sep 12 '18 at 14:05
  • beautiful and fast solution, directly modifiable into filtering on specific getters of item – arberg Feb 05 '19 at 12:22
  • how about using counting? `stream() .collect(Collectors.groupingBy(Function.identity(), Collectors.counting())) .entrySet().stream().filter(e -> e.getValue() > 1) .map(Map.Entry::getKey).collect(Collectors.toList())` – Rezero Dec 23 '20 at 08:18
  • @Rezero your answer is best one - I suggest you standalone answer rather than comment – michaldo Sep 16 '22 at 12:56
38

Here is a solution using Streams with Java 8:

// let's assume the original list is filled with {1, 1, 2, 3, 6, 3, 8, 7}
List<Integer> original = Arrays.asList(1, 1, 2, 3, 6, 3, 8, 7);
List<Integer> result = new ArrayList<>();

You just check whether the frequency of each object in your list is greater than one, then call .distinct() so that each such element appears only once in your result:

result = original.stream()
    .filter(e -> Collections.frequency(original, e) > 1)
    .distinct()
    .collect(Collectors.toList());
// returns {1,3}
// returns only numbers which occur more than once

result = original.stream()
    .filter(e -> Collections.frequency(original, e) == 1)
    .collect(Collectors.toList());
// returns {2,6,8,7}
// returns numbers which occur only once

result = original.stream()
    .distinct()
    .collect(Collectors.toList());
// returns {1,2,3,6,8,7}
// returns the list without duplicates
snowman
  • This is good in terms of readability, but it's **REALLY** bad for performance. `Collections::frequency` is O(n). It needs to go through the whole collection to find the frequency of an item. And we're calling this once for every item in the collection, which makes these snippets `O(n^2)`. You'll notice the difference in any collection of more than a handful of elements. I'd never use this in actual code. – Pawan Jul 09 '20 at 11:28
15
int[] nums = new int[] {1, 1, 2, 3, 3, 3};
Arrays.sort(nums);
for (int i = 0; i < nums.length - 1; i++) {
    if (nums[i] == nums[i + 1]) {
        System.out.println("duplicate item " + nums[i + 1] + " at location " + (i + 1));
    }
}

Obviously you can do whatever you want with them (i.e. put them in a Set to get a unique list of duplicate values) instead of printing... This also has the benefit of recording the location of the duplicate items.

Ashkan Aryan
8

Using Guava on Java 8

private Set<Integer> findDuplicates(List<Integer> input) {
    // Linked* preserves insertion order, so the returned Set's iteration order roughly matches the original list
    LinkedHashMultiset<Integer> duplicates = LinkedHashMultiset.create(input);

    // Remove all entries with a count of 1
    duplicates.entrySet().removeIf(entry -> entry.getCount() == 1);

    return duplicates.elementSet();
}
Philipp Paland
7

This also works:

public static Set<Integer> findDuplicates(List<Integer> input) {
    List<Integer> copy = new ArrayList<Integer>(input);
    for (Integer value : new HashSet<Integer>(input)) {
        copy.remove(value);
    }
    return new HashSet<Integer>(copy);
}
Adriaan Koster
6

You can use something like this:

List<Integer> newList = new ArrayList<Integer>();
List<Integer> copy = new ArrayList<Integer>(yourOldList);
for(Integer i : yourOldList)
{
    // mutate a copy: removing from the list you are iterating would throw
    // ConcurrentModificationException, and remove(int) would remove by index
    copy.remove(i); // the Integer overload removes the first occurrence
    if(copy.contains(i) && !newList.contains(i)) newList.add(i);
}
Eng.Fouad
5

Lambdas might be a solution:

Integer[] nums =  new Integer[] {1, 1, 2, 3, 3, 3};
List<Integer> list = Arrays.asList(nums);

List<Integer> dps = list.stream()
        .distinct()
        .filter(entry -> Collections.frequency(list, entry) > 1)
        .collect(Collectors.toList());
APA
  • 164
  • 1
  • 5
4

Similar to some answers here, but if you want to find duplicates based on some property:

  public static <T, R> Set<R> findDuplicates(Collection<? extends T> collection, Function<? super T, ? extends R> mapper) {
    Set<R> uniques = new HashSet<>();
    return collection.stream()
        .map(mapper)
        .filter(e -> !uniques.add(e))
        .collect(Collectors.toSet());
  }
wilmol
4

Use a MultiMap to store each value as a key / value set. Then iterate through the keys and find the ones with multiple values.
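
A minimal sketch of this idea, using a plain `Map<Integer, List<Integer>>` as the multimap (Guava's `Multimap` would work the same way; the class and method names here are just illustrative):

```java
import java.util.*;

public class MultimapDuplicates {
    public static Set<Integer> findDuplicates(List<Integer> list) {
        // group every occurrence under its value, multimap-style
        Map<Integer, List<Integer>> multimap = new HashMap<>();
        for (Integer value : list) {
            multimap.computeIfAbsent(value, k -> new ArrayList<>()).add(value);
        }
        // keys that map to more than one stored value are the duplicates
        Set<Integer> duplicates = new HashSet<>();
        for (Map.Entry<Integer, List<Integer>> entry : multimap.entrySet()) {
            if (entry.getValue().size() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(Arrays.asList(1, 1, 2, 3, 3, 3)));
    }
}
```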

John B
3

If you use Eclipse Collections, this will work:

MutableList<Integer> list = Lists.mutable.with(1, 1, 2, 3, 3, 3);
Set<Integer> dupes = list.toBag().selectByOccurrences(i -> i > 1).toSet();
Assert.assertEquals(Sets.mutable.with(1, 3), dupes);

Update: As of Eclipse Collections 9.2 you can now use selectDuplicates

MutableList<Integer> list = Lists.mutable.with(1, 1, 2, 3, 3, 3);
Set<Integer> dupes = list.toBag().selectDuplicates().toSet();
Assert.assertEquals(Sets.mutable.with(1, 3), dupes);

You can also use primitive collections to accomplish this:

IntList list = IntLists.mutable.with(1, 1, 2, 3, 3, 3);
IntSet dupes = list.toBag().selectDuplicates().toSet();
Assert.assertEquals(IntSets.mutable.with(1, 3), dupes);

Note: I am a committer for Eclipse Collections.

Donald Raab
3

And a version which uses the commons-collections CollectionUtils.getCardinalityMap method:

final List<Integer> values = Arrays.asList(1, 1, 2, 3, 3, 3);
final Map<Integer, Integer> cardinalityMap = CollectionUtils.getCardinalityMap(values);
System.out.println(cardinalityMap
        .entrySet()
        .stream()
        .filter(e -> e.getValue() > 1)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList()));
aristotll
panurg
2
public class practicese {
    public static void main(String[] args) {

        List<Integer> listOf = new ArrayList<Integer>();
        listOf.add(3);
        listOf.add(1);
        listOf.add(2);
        listOf.add(3);
        listOf.add(3);
        listOf.add(2);
        listOf.add(1);

        // collect each value the first time it is seen;
        // every later occurrence is a duplicate
        List<Integer> tempList = new ArrayList<Integer>();
        List<Integer> duplicates = new ArrayList<Integer>();
        for (Integer obj : listOf) {
            if (!tempList.contains(obj)) {
                tempList.add(obj);
            } else if (!duplicates.contains(obj)) {
                duplicates.add(obj);
            }
        }
        System.out.println(duplicates);
    }
}
1

A compact generified version of the top answer, with an added empty check and a preallocated Set size:

public static final <T> Set<T> findDuplicates(final List<T> listWhichMayHaveDuplicates) {
    final Set<T> duplicates = new HashSet<>();
    final int listSize = listWhichMayHaveDuplicates.size();
    if (listSize > 0) {
        final Set<T> tempSet = new HashSet<>(listSize);
        for (final T element : listWhichMayHaveDuplicates) {
            if (!tempSet.add(element)) {
                duplicates.add(element);
            }
        }
    }
    return duplicates;
}
Christophe Roussy
  • Do you need the zero check? Will new HashSet<>(0) return the sensible empty set? – johnstosh Aug 16 '17 at 19:10
  • @johnstosh this code could be simplified, but checking for zero allows to only init the `tempSet` with `listSize` when necessary. This is a minor optimization but I like it. – Christophe Roussy Aug 17 '17 at 09:52
1

I took Sebastian's answer and added a keyExtractor to it:

    private <U, T> Set<T> findDuplicates(Collection<T> collection, Function<? super T,? extends U> keyExtractor) {
        Map<U, T> uniques = new HashMap<>(); // maps unique keys to corresponding values
        return collection.stream()
            .filter(e -> uniques.put(keyExtractor.apply(e), e) != null)
            .collect(Collectors.toSet());
    }
Ashish Tyagi
1

A thread-safe alternative is this:

/**
 * Returns all duplicates in the list as a new {@link Set}, in a thread-safe way.
 * <p>
 * Usually the Set will contain only the last duplicate; however, which elements
 * are considered equal depends on the implementation of the {@link List}. An
 * exotic implementation of {@link List} might decide two elements are "equal",
 * in which case multiple duplicates might be returned.
 *
 * @param <X>  The type of element to compare.
 * @param list The list that contains the elements, never <code>null</code>.
 * @return A set of all duplicates in the list. Returns only the last duplicate.
 */
public <X extends Object> Set<X> findDuplicates(List<X> list) {
    Set<X> dups = new LinkedHashSet<>(list.size());
    synchronized (list) {
        for (X x : list) {
            if (list.indexOf(x) != list.lastIndexOf(x)) {
                dups.add(x);
            }
        }
    }
    return dups;
}
Grim
1
List.of(1, 1, 3, 4, 5, 5, 6).stream()
    .collect(Collectors.collectingAndThen(
        Collectors.groupingBy(Function.identity()),
        map -> map.entrySet()
                  .stream()
                  .filter(e -> e.getValue().size() > 1)
                  .map(Map.Entry::getKey)
                  .collect(Collectors.toList())));
1

Create a Map<Integer, Integer> and iterate the list: if an element is already in the map, increase its count, otherwise put it into the map with a count of 1.
Then iterate the map and add to the result list all elements with a count >= 2.

public static void main(String[] args) {
    List<Integer> list = new LinkedList<Integer>();
    list.add(1);
    list.add(1);
    list.add(1);
    list.add(2);
    list.add(3);
    list.add(3);
    Map<Integer, Integer> map = new HashMap<Integer, Integer>();
    for (Integer x : list) {
        Integer val = map.get(x);
        if (val == null) {
            map.put(x, 1);
        } else {
            map.remove(x);
            map.put(x, val + 1);
        }
    }
    List<Integer> result = new LinkedList<Integer>();
    for (Entry<Integer, Integer> entry : map.entrySet()) {
        if (entry.getValue() > 1) {
            result.add(entry.getKey());
        }
    }
    for (Integer x : result) {
        System.out.println(x);
    }
}
amit
  • This is pretty good. It is the best solution if you need to know how many duplicates there were. Some notes: (1) you don't need to call remove() before doing put(). (2) You can set up the LinkedList from an array rather than using repeated add() calls. (3) When val != null, you could immediately add x to result. The result could be a set or a list depending upon whether you wanted to keep count of the number of duplicates. – johnstosh Aug 16 '17 at 19:08
1

This works for both sorted and unsorted lists.

public void testFindDuplicates() {

    List<Integer> list = new ArrayList<Integer>();
    list.add(1);
    list.add(1);
    list.add(2);
    list.add(3);
    list.add(3);
    list.add(3);

    Set<Integer> result = new HashSet<Integer>();
    int currentIndex = 0;
    for (Integer i : list) {
        if (!result.contains(i) && list.subList(currentIndex + 1, list.size()).contains(i)) {
            result.add(i);
        }
        currentIndex++;
    }
    assertEquals(2, result.size());
    assertTrue(result.contains(1));
    assertTrue(result.contains(3));
}
James DW
  • Calling contains() on the ArrayList's subList is expensive because it is a linear search. So this is okay for 10 items, but not for 10 million. – johnstosh Aug 16 '17 at 19:14
0
public class DuplicatesWithOutCollection {

    public static void main(String[] args) {

        int[] arr = new int[] { 2, 3, 4, 6, 6, 8, 10, 10, 10, 11, 12, 12 };

        boolean flag = false;
        int k = 1;
        while (k == 1) {

            arr = removeDuplicate(arr);
            flag = checkDuplicate(arr, flag);
            if (flag) {
                k = 1;
            } else {
                k = 0;
            }

        }

    }

    private static boolean checkDuplicate(int[] arr, boolean flag) {
        int i = 0;

        while (i < arr.length - 1) {
            if (arr[i] == arr[i + 1]) {
                // return as soon as any adjacent duplicate is found;
                // otherwise only the last pair would decide the result
                return true;
            }
            i++;
        }

        return false;
    }

    private static int[] removeDuplicate(int[] arr) {

        int i = 0, j = 0;
        int[] temp = new int[arr.length];
        while (i < arr.length - 1) {

            if (arr[i] == arr[i + 1]) {

                temp[j] = arr[i + 1];
                i = i + 2;

            } else {

                temp[j] = arr[i];
                i = i + 1;

                if (i == arr.length - 1) {
                    temp[j + 1] = arr[i + 1];
                    break;
                }

            }
            j++;

        }
        System.out.println();
        return temp;
    }

}
Slava.K
Samrat Roy
  • Implementing without using Collection classes. But it needs a little improvement in the loop. Volunteered help is appreciated. Output for the above looks like --> 2 3 4 6 8 10 11 12 – Samrat Roy Feb 10 '17 at 06:43
  • To make this execute in a smaller amount of time, you need to use a hash based data structure to keep track of duplicates. That is why you see the other solutions using HashSet() -- it is built into Java. – johnstosh Aug 16 '17 at 19:03
  • @johnstosh Yes, I know about that, but I was trying to do it without using Collections; that's why I mentioned it in my comment. As you see, I commented way before Feb 2017; [there are techniques where you don't have to use collections at all, with lesser time complexity](http://www.geeksforgeeks.org/find-duplicates-in-on-time-and-constant-extra-space/). I tried that program without knowing the DS & Algo practices. You don't have to downvote me for that... anyways, thanks. – Samrat Roy Aug 17 '17 at 06:21
0
import java.util.Scanner;

public class OnlyDuplicates {
    public static void main(String[] args) {
        System.out.print(" Enter a set of 10 numbers: ");
        int[] numbers = new int[10];
        Scanner input = new Scanner(System.in);
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = input.nextInt();
        }
        numbers = onlyDuplicates(numbers);
        System.out.print(" The numbers are: ");
        for (int i = 0; i < numbers.length; i++) {
            System.out.print(numbers[i] + " ");
        }
    }

    public static int[] onlyDuplicates(int[] list) {
        int[] seen = new int[0];
        int[] duplicates = new int[0];
        for (int i = 0; i < list.length; i++) {
            if (contains(seen, list[i])) {
                // second or later occurrence: record it once
                if (!contains(duplicates, list[i])) {
                    duplicates = add2Array(duplicates, list[i]);
                }
            } else {
                seen = add2Array(seen, list[i]);
            }
        }
        return duplicates;
    }

    // Returns true if value occurs in array
    public static boolean contains(int[] array, int value) {
        for (int i = 0; i < array.length; i++) {
            if (array[i] == value) {
                return true;
            }
        }
        return false;
    }
    // Copy source into dest
    // If dest is shorter than source, return false
    public static boolean copyArray(int[] source, int[] dest) {
        if (source.length > dest.length) {
            return false;
        }

        for (int i = 0; i < source.length; i++) {
            dest[i] = source[i];
        }
        return true;
    }
    // Increase array size by one and add integer to the end of the array
    public static int[] add2Array(int[] source, int data) {
        int[] dest = new int[source.length + 1];
        copyArray(source, dest);
        dest[source.length] = data;
        return dest;
    }
}
0

This would be a good method to find duplicate values without using a Set.

public static <T> List<T> findDuplicates(List<T> list) {

    List<T> nonDistinctElements = new ArrayList<>();

    for (T s : list)
        if (list.indexOf(s) != list.lastIndexOf(s))
            if (!nonDistinctElements.contains(s))
                nonDistinctElements.add(s);

    return nonDistinctElements;
}

And say that you want a method that returns a distinct list, i.e. if you pass a list where elements occur more than once, you will get a list with distinct elements.

public static <T> void distinctList(List<T> list) {

    List<T> nonDistinctElements = new ArrayList<>();
    for (T s : list)
        if (list.indexOf(s) != list.lastIndexOf(s))
            nonDistinctElements.add(s);

    for (T nonDistinctElement : nonDistinctElements)
        if (list.indexOf(nonDistinctElement) != list.lastIndexOf(nonDistinctElement))
            list.remove(nonDistinctElement);
}
aayoustic
0

How about this code:

public static void main(String[] args) {

    // Let's say we have these elements in an array
    int[] a = {13, 65, 13, 67, 88, 65, 88, 23, 65, 88, 92};

    List<Integer> ls1 = new ArrayList<>();
    List<Integer> ls2 = new ArrayList<>();
    Set<Integer> ls3 = new TreeSet<>();

    // Adding each element of the array to the list
    for (int i = 0; i < a.length; i++) {
        ls1.add(a[i]);
    }

    // Iterating over each element in the array
    for (Integer eachInt : ls1) {
        // If ls2 already contains the iterated element, add it to the set
        // (as this is a duplicate element)
        if (ls2.contains(eachInt)) {
            ls3.add(eachInt);
        } else {
            ls2.add(eachInt);
        }
    }

    System.out.println("Elements in array or ls1: " + ls1);
    System.out.println("Duplicate elements in set ls3: " + ls3);
}
Karthik Deepan
0

Just in case you also want both the duplicates and the non-duplicates: basically the answer is similar to the accepted one, but instead of returning the set filled in the if branch, you return the one filled in the else branch.

Use this code (change to the type that you need):

public Set<String> findDup(List<String> duplicates) {
    Set<String> returning = new HashSet<>();
    Set<String> nonreturning = new HashSet<>();
    Set<String> setup = new HashSet<>();
    for (String i : duplicates) {
        if (!setup.add(i)) {
            returning.add(i);
        } else {
            nonreturning.add(i);
        }
    }
    Toast.makeText(context, "hello set" + returning + nonreturning + " size" + nonreturning.size(), Toast.LENGTH_SHORT).show();
    return nonreturning;
}
0

A more generic method, as a variant of https://stackoverflow.com/a/52296246:

    /**
     * Returns the duplicated values found in the given collection, based on fieldClassifier.
     *
     * @param collection given collection of elements
     * @param fieldClassifier classifier which specifies the field to check for duplicates (useful for complex objects).
     * @param <T> Type of element in the collection
     * @param <K> Type of the key extracted by fieldClassifier and returned from this method.
     * @return list of values that are duplicated.
     */
    public static <T, K> List<K> lookForDuplicates(List<T> collection, Function<? super T, ? extends K> fieldClassifier) {

        return collection.stream().collect(Collectors.groupingBy(fieldClassifier))
                         .entrySet()
                         .stream()
                         .filter(e -> e.getValue().size() > 1)
                         .map(Map.Entry::getKey)
                         .collect(Collectors.toList());
    }
0

Try this to find duplicate items in a list:

ArrayList<String> arrayList1 = new ArrayList<String>();

arrayList1.add("A");
arrayList1.add("A");
arrayList1.add("B");
arrayList1.add("B");
arrayList1.add("B");
arrayList1.add("C");

for (int x = 0; x < arrayList1.size(); x++) {
    System.out.println("arrayList1 : " + arrayList1.get(x));
}

Set<String> seen = new TreeSet<String>();
Set<String> duplicates = new TreeSet<String>();
for (String s : arrayList1) {
    // add returns false when the element is already in the set
    if (!seen.add(s)) {
        duplicates.add(s);
    }
}
System.out.println("Duplicates : " + duplicates);
Tushar Ahirrao
0

This is a problem where functional techniques shine. For example, the following F# solution is both clearer and less bug prone than the best imperative Java solution (and I work daily with both Java and F#).

[1;1;2;3;3;3] 
|> Seq.countBy id 
|> Seq.choose (fun (key,count) -> if count > 1 then Some(key) else None)

Of course, this question is about Java. So my suggestion is to adopt a library which brings functional features to Java. For example, it could be solved using my own library as follows (and there are several others out there worth looking at too):

Seq.of(1,1,2,3,3,3)
.groupBy(new Func1<Integer,Integer>() {
    public Integer call(Integer key) {
        return key;
    }
}).filter(new Predicate<Grouping<Integer,Integer>>() {
   public Boolean call(Grouping<Integer, Integer> grouping) {
        return grouping.getGrouping().count() > 1;
   }
}).map(new Func1<Grouping<Integer,Integer>,Integer>() {
    public Integer call(Grouping<Integer, Integer> grouping) {
        return grouping.getKey();
    }
});
Stephen Swensen
  • And then you see how much Java actually still sucks at functional programming. Such a simple problem, so hard to express what you want in Java. – froginvasion May 04 '17 at 09:12
0

So, this is how I solved it. It may have a bit more overhead, but it returns exactly what the OP wanted:

 public static List<Something> findDuplicatesInList(List<Something> somethingList) {

    List<Something> temp = somethingList
            .stream()
            .filter(alpha -> somethingList
                             .stream()
                             .filter(beta -> beta.equals(alpha))
                             .count() > 1)
            .collect(Collectors.toList());

    List<Something> duplicateSomethings = new ArrayList<>();

    for(Something something: temp){
        if(!duplicateSomethings.contains(something)){
            duplicateSomethings.add(something);
        }
    }

    return duplicateSomethings;
}
crichton
-1

Just try this:

For example, if the list values are [1, 2, 3, 4, 5, 6, 4, 3, 7, 8], the duplicate items are [3, 4].

Collections.sort(list);
List<Integer> dup = new ArrayList<>();
for (int i = 0; i < list.size() - 1; i++) {
    // use equals(): == compares Integer references and is unreliable outside the small-value cache
    if (list.get(i).equals(list.get(i + 1))) {
        if (!dup.contains(list.get(i + 1))) {
            dup.add(list.get(i + 1));
        }
    }
}
System.out.println("duplicate item " + dup);
User Learning
  • Invoking contains() on an ArrayList() is an expensive operation, so you should consider using a Set instead. You'll see other solutions using HashSet() or linked versions of it. – johnstosh Aug 16 '17 at 19:18
-1

If you know the maximum value (for example < 10000), you could sacrifice space for speed. I can't remember the exact name of this technique.

pseudo code:

// does not handle the case when memory allocation fails
// can probably be extended to unknown / larger values, maybe by sorting first
public List<int> GetDuplicates(int max)
{
    // allocate memory and clear it to 0/false
    bit[] buckets = new bit[max];
    memset(buckets, 0, max);
    // find duplicates
    List<int> result = new List<int>();
    foreach (int val in List)
    {
        if (buckets[val])
        {
            result.add(val);
        }
        else
        {
            buckets[val] = 1;
        }
    }
    return result;
}
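
The same idea in runnable Java, using a boolean[] as the bucket array (the class and method names are just illustrative; it assumes all values are non-negative and below max):

```java
import java.util.*;

public class BucketDuplicates {
    public static List<Integer> findDuplicates(List<Integer> list, int max) {
        boolean[] buckets = new boolean[max];    // one flag per possible value, initially false
        Set<Integer> reported = new HashSet<>(); // so each duplicate is reported only once
        List<Integer> result = new ArrayList<>();
        for (int val : list) {
            if (buckets[val]) {
                // value was seen before: it is a duplicate
                if (reported.add(val)) {
                    result.add(val);
                }
            } else {
                buckets[val] = true;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates(Arrays.asList(1, 1, 2, 3, 3, 3), 10000)); // [1, 3]
    }
}
```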
Alex
  • I think you want "boolean" instead of "bit"? Did you execute your code before posting it? This is a good start. If you take a look at HashSet() then you'll see that it is the 'buckets' implementation that you desire. – johnstosh Aug 16 '17 at 19:15
-3

Put the list into a set (this keeps only the unique items), remove one occurrence of each set item from a copy of the original list (so the copy then contains only items that occur more than once), and put that into a new set (which again keeps each duplicate only once):

List<Item> list = ...;
List<Item> copy = new ArrayList<Item>(list);
for (Item unique : new HashSet<Item>(list)) {
    copy.remove(unique); // removes only the first occurrence
}
return new HashSet<Item>(copy);
BegemoT