0

I have a collection of elements, and some of these elements are duplicates. I need to extract all records but only the first record if the record is one of a duplicate set.

I was able to group the elements and find all elements that have duplicates, but how do I remove the first element of each group?

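var records = dbContext.Competitors
              .GroupBy(x => x.Email)
              // x.Count() counts the records in the group;
              // x.Key.Count() would count the characters of the email string
              .Select(x => new { Properties = x, Count = x.Count() })
              .Where(x => x.Count > 1)
              .ToList();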

EDIT: It seems this is impossible to accomplish with EF, because it fails to translate the desired LINQ expression to SQL. I'd be happy if someone could offer a different approach.

Santosh Panda
vortex
  • Could you try fixing your query so that you do not get duplicate elements into your collection? It's easier to solve problems at the root cause than later. – zam664 Oct 21 '14 at 18:52
  • Possible duplicate of http://stackoverflow.com/questions/998066/linq-distinct-values and http://stackoverflow.com/questions/19406242/select-distinct-using-linq – trnelson Oct 21 '14 at 18:54

2 Answers

4

To exclude the first record from each email-address group with more than one entry, you could do this:

var records = dbContext.Competitors
              .GroupBy(x => x.Email)
              // Id is an assumed key column; order by any stable property
              .SelectMany(x => (x.Count() == 1) ? x : x.OrderBy(t => t.Id).Skip(1))
              .ToList();
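
For reference, here is a minimal in-memory sketch of the same idea, runnable outside EF. The `Competitor` class, its `Id` property, and the `DuplicateFilter` helper are assumptions made for illustration; they are not from the question. It keeps single-entry groups whole and drops the lowest-`Id` record of every duplicated email.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity; Id is an assumed unique key.
public class Competitor
{
    public int Id { get; set; }
    public string Email { get; set; }
}

public static class DuplicateFilter
{
    // Keeps single-entry groups whole; for duplicated emails,
    // drops the record with the lowest Id and returns the rest.
    public static List<Competitor> SkipFirstOfEachDuplicateSet(IEnumerable<Competitor> source)
    {
        return source
            .GroupBy(c => c.Email)
            .SelectMany(g => g.Count() == 1
                ? g
                : g.OrderBy(c => c.Id).Skip(1))
            .ToList();
    }
}
```

If the EF provider cannot translate the grouped query, one possible workaround is to pass `dbContext.Competitors.AsEnumerable()` to this helper so the grouping runs in memory. That loads the whole table, so it is probably only reasonable for small data sets.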
Blorgbeard
  • Yes, but the condition is to get all records, except the first of a duplicate set. With your solution I get just the first record of each group and I want to exclude it. – vortex Oct 21 '14 at 20:12
  • OK, I misunderstood your question ("I need to extract all records but only the first record if the record is one of a duplicate set"). Try my edited code. – Blorgbeard Oct 21 '14 at 20:38
  • Now I get `NotSupported Exception - The method 'Skip' is only supported for sorted input in LINQ to Entities. The method 'OrderBy' must be called before the method 'Skip'.` – vortex Oct 22 '14 at 16:08
  • Oh, ok; I didn't know about that. I assume you can just add an OrderBy before the skip, then. See edit. – Blorgbeard Oct 22 '14 at 18:45
  • It seems EF fails to translate this expression into SQL. I've found similar issues in StackOverflow. I'll search for another solution. Thanks for your help anyways! +1 from me. – vortex Oct 25 '14 at 15:19
2

This is the logic:

Group by a property > Select every Group > (Possibly) Sort that > Skip first one

This can be turned into some LINQ code like this:

// use SelectMany to flatten the groups into a single sequence
var x = list.GroupBy(g => g.Key).Select(grp => grp.Skip(1)).SelectMany(i => i);
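
As a hedged alternative that avoids `GroupBy` and `Skip` entirely, the same result (every record except the first of each group, with single-entry groups dropped too) can be expressed as "keep a record only if an earlier record with the same key exists". The sketch below assumes a hypothetical `Competitor` type with an `Id` that defines which record counts as first. A correlated `Any` like this is often translated to an SQL `EXISTS` subquery by LINQ providers, but whether a particular EF version handles it is an assumption to verify.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity; Id is an assumed unique key.
public class Competitor
{
    public int Id { get; set; }
    public string Email { get; set; }
}

public static class LaterDuplicates
{
    // Returns records for which another record with the same email
    // and a smaller Id exists, i.e. everything but the first of each
    // duplicate set. Records with a unique email are not returned.
    public static IQueryable<Competitor> Find(IQueryable<Competitor> source)
    {
        return source.Where(c =>
            source.Any(o => o.Email == c.Email && o.Id < c.Id));
    }
}
```

With an in-memory list this can be tried as `LaterDuplicates.Find(list.AsQueryable())`; with EF, the `DbSet` itself would be passed as `source`.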
Amirhossein Mehrvarzi