I have a list of objects, each with 2 relevant properties: "ID" and "Name". Let's call the list "lstOutcomes". I need to check the list for duplicates (meaning object1.ID = object2.ID, etc.) and set a flag (valid = false, or something) if there is at least one duplicate. Also, it would be nice to send a message to the user mentioning the "Name" of the object when the check fails.

I am sure I will need to use the Group By operator to do this, but I am not used to doing that in LINQ, and the examples out there are just not helping me. This article seems to be close to what I need, but not quite, and it's in C#.

Here is a starting stab at it...

Dim duplist = _
    (From o As objectType In lstOutcomes _
    Group o By o.ID Into g = Group _
    Let dups = g.Where(Function(h) g.Count > 1) _
    Order By dups Descending).ToArray

If duplist.Count > 0 Then
    valid = False
End If

help?

Watki02

5 Answers

I'll write it in C#, but I hope you can convert it to VB. It does not use a join, runs in O(n log n), and assumes you have a List<T>:

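// O(n log n) part. Assumes the element type implements IComparable and orders by ID.
lst.Sort();

// Skip(1) shifts the sequence by one, so lst[index] is the item just before x.
var duplicatedItems = lst.Skip(1).Where((x, index) => x.ID == lst[index].ID);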
TonySalimi
Saeed Amiri
  • The Sort method will only work if the class implements IComparable. – Meta-Knight Oct 12 '11 at 12:51
  • Also "duplicatedItems" will not contain the first duplicate item in the list, only the next ones which have the same ID (although that might be OK). – Meta-Knight Oct 12 '11 at 12:53
  • @Meta-Knight, yes, but I'd assume the first item is valid and any later item with the same ID is invalid, and the question only asked whether a duplicate exists. If I could use an indexed `Any`, I would. – Saeed Amiri Oct 12 '11 at 14:10
This is late, but I thought it could help others.

You can achieve this with a pair of very clean one-liners:

Dim lstOutcomes As IList(Of T)

Dim FoundDuplicates As Boolean
FoundDuplicates = lstOutcomes.Any(Function(p) lstOutcomes.Count(Function(q) p.ID = q.ID AndAlso p.Name = q.Name) > 1)

' Where returns an IEnumerable(Of T), so materialize it with ToList before assigning.
Dim ListOfDuplicates As IList(Of T)
ListOfDuplicates = lstOutcomes.Where(Function(p) lstOutcomes.Count(Function(q) p.ID = q.ID AndAlso p.Name = q.Name) > 1).ToList()

Then you can clean the list of duplicates so that it contains the duplicate only once:

' The list must be created before items can be added to it.
Dim CleanList As IList(Of T) = New List(Of T)
For Each MyDuplicate As T In ListOfDuplicates
    If Not CleanList.Any(Function(p) p.ID = MyDuplicate.ID AndAlso p.Name = MyDuplicate.Name) Then
        CleanList.Add(MyDuplicate)
    End If
Next

Or as a one-liner, although it does not read as nicely (ForEach is defined on List(Of T), not IList(Of T), hence the ToList call):

ListOfDuplicates.ToList().ForEach(Sub(p) If Not CleanList.Any(Function(q) p.ID = q.ID AndAlso p.Name = q.Name) Then CleanList.Add(p))

Finally, in anticipation of future requirements, you should define what counts as a duplicate in one separate place. A delegate is quite convenient for this:

Dim AreDuplicates As Func(Of T, T, Boolean) = Function(a, b) a.ID = b.ID AndAlso a.Name = b.Name
FoundDuplicates = lstOutcomes.Any(Function(p) lstOutcomes.Count(Function(q) AreDuplicates(p, q)) > 1)
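
The nested Count calls above compare every item with every other item, which gets slow on long lists. As an alternative technique (not the approach of this answer), a single pass with a HashSet can flag repeated IDs. The sketch below is minimal and makes assumptions: the `Outcome` class is a hypothetical stand-in for the question's object type, `FindRepeatedIds` is an invented helper name, and a duplicate is defined by ID only, as in the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DuplicateCheckSketch
    ' Hypothetical stand-in for the question's object type.
    Public Class Outcome
        Public Property ID As Integer
        Public Property Name As String
    End Class

    ' Returns every item whose ID already appeared earlier in the sequence.
    Public Function FindRepeatedIds(items As IEnumerable(Of Outcome)) As List(Of Outcome)
        Dim seen As New HashSet(Of Integer)
        ' HashSet.Add returns False when the value is already in the set.
        Return items.Where(Function(o) Not seen.Add(o.ID)).ToList()
    End Function

    Sub Main()
        Dim lstOutcomes As New List(Of Outcome) From {
            New Outcome With {.ID = 1, .Name = "Pass"},
            New Outcome With {.ID = 2, .Name = "Fail"},
            New Outcome With {.ID = 1, .Name = "Retry"}
        }

        Dim repeats As List(Of Outcome) = FindRepeatedIds(lstOutcomes)
        Dim valid As Boolean = (repeats.Count = 0)

        Console.WriteLine("valid = " & valid.ToString())
        For Each o As Outcome In repeats
            Console.WriteLine("Duplicate ID " & o.ID & ": " & o.Name)
        Next
    End Sub
End Module
```

Like the sorted approach in another answer, this reports only the second and later occurrences of an ID, not the first one.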
Ama
Dim itemsGroupedByID = lstOutcomes.GroupBy(Function(x) x.ID)
Dim duplicateItems = itemsGroupedByID.Where(Function(x) x.Count > 1) _
                                     .SelectMany(Function(x) x) _
                                     .ToList()

If duplicateItems.Count > 0
    valid = False
    Dim errorMessage = "The following items have a duplicate ID: " & _
                       String.Join(", ", duplicateItems.Select(Function(x) x.Name))
End If
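
For readers who want to try the GroupBy approach end to end, here is a self-contained sketch. It assumes a hypothetical `Outcome` class standing in for `objectType` (with an Integer `ID`, as the error message quoted in the comments suggests) and an invented helper, `DescribeDuplicates`, that reports the names per duplicated ID. It passes an array to `String.Join`, since the overload taking an IEnumerable(Of String) only exists from .NET 4 onward.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module GroupByDuplicatesSketch
    ' Hypothetical stand-in for the question's objectType.
    Public Class Outcome
        Public Property ID As Integer
        Public Property Name As String
    End Class

    ' Returns one description per ID that occurs more than once.
    Public Function DescribeDuplicates(items As IEnumerable(Of Outcome)) As List(Of String)
        Return items.GroupBy(Function(o) o.ID) _
                    .Where(Function(g) g.Count() > 1) _
                    .Select(Function(g) "ID " & g.Key & ": " & _
                            String.Join(", ", g.Select(Function(o) o.Name).ToArray())) _
                    .ToList()
    End Function

    Sub Main()
        Dim lstOutcomes As New List(Of Outcome) From {
            New Outcome With {.ID = 1, .Name = "Pass"},
            New Outcome With {.ID = 2, .Name = "Fail"},
            New Outcome With {.ID = 1, .Name = "Retry"}
        }

        Dim problems As List(Of String) = DescribeDuplicates(lstOutcomes)
        Dim valid As Boolean = (problems.Count = 0)

        If Not valid Then
            Console.WriteLine("Duplicate IDs found. " & String.Join("; ", problems.ToArray()))
        End If
    End Sub
End Module
```

Grouping keeps all items that share an ID together, so the message can name the first occurrence as well as the later ones.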
Meta-Knight
  • you can't do item.Valid = False in foreach loop – Saeed Amiri Oct 11 '11 at 20:51
  • Why can't I do this? The goal is to set a Valid flag on the object to false, assuming such a flag exists. – Meta-Knight Oct 11 '11 at 21:01
  • In C#, you can't update the value of an iterator, and I think it should be the same in VB. Try it, it should throw an exception. – Saeed Amiri Oct 11 '11 at 21:09
  • This would only be a problem if I deleted an item from a list being iterated; setting a property shouldn't cause any problem. This code has been tested with LinqPad. – Meta-Knight Oct 11 '11 at 21:14
  • Anyway, I changed the code to use a local "valid" variable instead. – Meta-Knight Oct 11 '11 at 21:57
  • Thank you, I believe this is the right direction, but it is throwing the following error: `Overload resolution failed because no accessible 'Where' accepts this number of arguments`. – Watki02 Oct 12 '11 at 15:23
  • That's strange, a Where clause with one argument should be accessible. Could you double-check to make sure you have the same code? I have tested this code in VB 2010 and there is no error. – Meta-Knight Oct 12 '11 at 16:02
  • I'm sorry, I skipped a step. It was actually throwing a runtime error: `Public member 'Where' on type 'GroupedEnumerable(Of objectType,Integer,objectType)' not found.`, so I added `As Queryable` to the end of `itemsGroupedByID` to attempt to alleviate that, then that error above happened. – Watki02 Oct 12 '11 at 16:34
  • also, your `string.join` stuff isn't working... I gave up on that. – Watki02 Oct 12 '11 at 16:40
  • The code should work as-is. Are you sure that you're calling the Linq GroupBy method? Try with: `lstOutcomes.AsEnumerable.GroupBy(Function(x) x.ID)` – Meta-Knight Oct 12 '11 at 16:46
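
A possible explanation for the runtime error quoted above, offered as an assumption since the asker's project settings are not shown: with Option Strict Off and Option Infer Off, `Dim itemsGroupedByID = ...` declares an Object, so `.Where` is resolved by late binding at runtime, and late binding does not consider extension methods such as the LINQ operators. Declaring the variable's type explicitly (or turning Option Infer On) lets the compiler bind `Where` at compile time. The sketch below uses a hypothetical `Outcome` class and an invented helper name, `CountDuplicatedIds`.

```vbnet
Option Strict On
Option Infer Off

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ExplicitTypingSketch
    ' Hypothetical stand-in for the question's objectType.
    Public Class Outcome
        Public Property ID As Integer
        Public Property Name As String
    End Class

    ' Counts how many distinct IDs occur more than once.
    Public Function CountDuplicatedIds(items As List(Of Outcome)) As Integer
        ' The explicit type means Where is bound at compile time,
        ' even though local type inference is switched off.
        Dim grouped As IEnumerable(Of IGrouping(Of Integer, Outcome)) = _
            items.GroupBy(Function(o) o.ID)
        Return grouped.Where(Function(g) g.Count() > 1).Count()
    End Function

    Sub Main()
        Dim items As New List(Of Outcome) From {
            New Outcome With {.ID = 1, .Name = "Pass"},
            New Outcome With {.ID = 1, .Name = "Retry"},
            New Outcome With {.ID = 2, .Name = "Fail"}
        }
        Console.WriteLine(CountDuplicatedIds(items))
    End Sub
End Module
```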
I'll take what Saeed Amiri wrote in C# and complete it in VB.

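    ' Sort requires objectType to implement IComparable and order by ID,
    ' so that items with the same ID end up next to each other.
    lst.Sort()
    Dim valid As Boolean = True
    ' Skip(1) shifts the sequence by one, so lst(index) is the item just before x.
    Dim duplicatedItems = lst.Skip(1) _
        .Where(Function(x, index) x.ID = lst(index).ID)

    For Each item As objectType In duplicatedItems
        valid = False
        Console.WriteLine("ID: " & item.ID & ", Name: " & item.Name)
    Next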
GianT971
  • Ok, I edited the post. But I don't really understand why it is necessary – GianT971 Oct 12 '11 at 09:13
  • This algorithm relies on the fact that items with the same ID follow each other, therefore it is necessary to sort by ID. Same comment as for Saeed's code: the Sort method won't work if IComparable is not implemented. Using `OrderBy(Function(x) x.ID)` would be simpler. – Meta-Knight Oct 12 '11 at 13:37
The project is behind schedule, so I just hacked it together like this:

    ' For each outcome, if it is in the list of valid outcomes more than once, and it is not in the list of 
    ' duplicates, add it to the duplicates list.
    Dim lstDuplicates As New List(Of objectType)
    For Each outcome As objectType In lstOutcomes
        'declare a stable outcome variable
        Dim loutcome As objectType = outcome
        If lstOutcomes.Where(Function(o) o.ID = loutcome.ID).Count > 1 _
        AndAlso Not lstDuplicates.Where(Function(d) d.ID = loutcome.ID).Count > 0 Then
            lstDuplicates.Add(outcome)
        End If
    Next
    If lstDuplicates.Count > 0 Then
        valid = False
        sbErrors.Append("There cannot be multiple outcomes of any kind. The following " & lstDuplicates.Count & _
                        " outcomes are duplicates: ")
        For Each dup As objectType In lstDuplicates
            sbErrors.Append("""" & dup.Name & """" & " ")
        Next
        sbErrors.Append("." & vbNewLine)
    End If
Watki02