How to determine duplicates in a collection of ints?

Question

Let's say I've got the following values in a collection of integers:

{1,3,4,5,5,6,7,7}

The result I'd expect is {5,7}.

How can I do that? Maybe using LINQ?

EDIT: The input collection is unsorted, so the algorithm should not rely on duplicates being consecutive. Also, whether the resulting duplicate collection is sorted or not doesn't matter.

If the array is unsorted, go for a HashSet; otherwise you can just compare each element with the next. – Peter, Jun 27 '11 at 17:59
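The adjacent-comparison idea from the comment above only applies when the input is already sorted, which the question's EDIT says is not guaranteed. As a minimal sketch under that assumption (the class and method names `SortedDuplicates` and `FindInSorted` are hypothetical, not from the thread), it could look like this:

```csharp
using System.Collections.Generic;

public static class SortedDuplicates
{
    // Assumes `sorted` is already ordered, so equal values are adjacent.
    public static List<int> FindInSorted(IReadOnlyList<int> sorted)
    {
        var dups = new List<int>();
        for (int i = 1; i < sorted.Count; i++)
        {
            bool sameAsPrevious = sorted[i] == sorted[i - 1];
            // Skip values already reported, so a run like 5,5,5 yields 5 once.
            bool alreadyReported = dups.Count > 0 && dups[dups.Count - 1] == sorted[i];
            if (sameAsPrevious && !alreadyReported)
                dups.Add(sorted[i]);
        }
        return dups;
    }
}
```

Calling `SortedDuplicates.FindInSorted(new[] { 1, 3, 4, 5, 5, 6, 7, 7 })` on the question's input gives a list containing 5 and 7. For unsorted input, the set-based or grouping approaches in the answers are the safer choice.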

score 7 · Accepted Answer · answered Jun 27 '11 at 17:57

You can do this with built-in LINQ operators, and the same query also works with LINQ providers such as LINQ to SQL, EF, and NHibernate:

var dups = collection.GroupBy(x => x)
                     .Where(g => g.Count() > 1)
                     .Select(g => g.Key);

score 5 · Answer 2 · answered Jun 27 '11 at 17:56

How about:

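using System.Collections.Generic;

// Extension methods must be declared in a static class; the class name here is arbitrary.
public static class EnumerableExtensions
{
    public static IEnumerable<T> OnlyDupes<T>(this IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        foreach (var item in source)
        {
            // HashSet<T>.Add returns true if the element is added to the
            // HashSet<T> object; false if the element is already present.
            // Note: a value that occurs n times is yielded n - 1 times.
            if (!seen.Add(item)) yield return item;
        }
    }
}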

answered Jun 27 '11 at 17:56

user7116

It doesn't play nice with LINQ providers like LINQ to SQL. – jason Jun 27 '11 at 17:58
@Jason: good to know. We don't use L2S in our shop so I'm not terribly familiar with its limitations. – user7116 Jun 27 '11 at 18:00
That's fine. It won't play nice with ANY LINQ providers that work on a database; it will execute in memory instead of on the database. – jason Jun 27 '11 at 18:02
@Jason: obvious once pointed out :) Based on some other comments I'm now wondering if he wants them in order found or even "back to back dupes only". – user7116 Jun 27 '11 at 18:03
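One detail of the `HashSet<T>` approach above: a value that appears three or more times is yielded once for every repeat after the first. If each duplicated value should be reported only once, a second set can track what has already been returned. This is a sketch, not the answerer's code; `DuplicateExtensions` and `DistinctDupes` are hypothetical names, and like the original it runs in memory rather than on a database provider.

```csharp
using System.Collections.Generic;

public static class DuplicateExtensions
{
    // Yields each duplicated value once, in the order its second
    // occurrence is encountered in the source.
    public static IEnumerable<T> DistinctDupes<T>(this IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        var reported = new HashSet<T>();
        foreach (var item in source)
        {
            // seen.Add is false when the item was seen before;
            // reported.Add is true only the first time it is reported.
            if (!seen.Add(item) && reported.Add(item))
                yield return item;
        }
    }
}
```

For example, `new[] { 7, 5, 7, 1, 5, 7 }.DistinctDupes()` yields 7 and then 5, even though 7 occurs three times.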

score 2 · Answer 3 · answered Jun 27 '11 at 17:56

var list = new List<int>() { 1, 3, 4, 5, 5, 6, 7, 7 };
var duplicates =  list.GroupBy(x => x)
                      .Where(g => g.Count() > 1)
                      .Select(g => g.Key)
                      .ToList();

answered Jun 27 '11 at 17:56

BrokenGlass

score 2 · Answer 4 · answered Jun 27 '11 at 17:57

Try something like this:

int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };
var duplicates = listOfItems
    .GroupBy(i => i)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key);
foreach (var d in duplicates)
    Console.WriteLine(d);

Source: the article "Finding Duplicates Using LINQ".

score 2 · Answer 5 · answered Jun 27 '11 at 17:59

var list = new List<int>(){1,3,4,5,5,6,7,7};

var query = ( from i in list
               group i by i into g
               where g.Count() > 1
               select g.Key).ToList();

answered Jun 27 '11 at 17:59

ingo

The `Distinct` is superfluous. – user7116 Jun 27 '11 at 18:01

score 1 · Answer 6 · answered Jun 27 '11 at 17:55

Use HashSet<T> if you are on .NET 3.5 or later; otherwise use Iesi.Collections (which NHibernate uses).

answered Jun 27 '11 at 17:55

Denis Biondic

It doesn't play nice with LINQ providers like LINQ to SQL. – jason Jun 27 '11 at 17:59

score 1 · Answer 7 · edited May 23 '17 at 11:55

The question "Linq with group by having count" shows how to write a LINQ group by with a count > 1 condition.

In your specific instance:

var x = from i in new int[] { 1, 2, 2, 3 }
    group i by i into grp
    where grp.Count() > 1
    select grp.Key;

answered Jun 27 '11 at 17:56

The Evil Greebo
