0

I am trying to summarize some data in a data table, what I am attempting to do is to sum the total quantity of duplicate rows found.

My data table looks like this.

|ForeName|SurName|Quantity|
|Dave    |Smith  | 10000  |
|Dave    |Smith  | 20000  |
|Dave    |Smith  | 30000  |
|John    |Peacock| 10000  |

I want to summarize this data to look like this.

|ForeName|SurName|Quantity|
|Dave    |Smith  | 60000  |
|John    |Peacock| 10000  |

At the moment I am searching for duplicates in the data table

Dim duplicates = From rows In dt.AsEnumerable().GroupBy(Function(r) New With {Key .a = r("ForeName"), Key .b = r("SurName")}).Where(Function(gr) gr.Count() > 1).ToList()

However where to proceed from here I am unsure, has anyone ever came across a scenario like this and able to point me in the right direction.

Grey Walker
  • 125
  • 16
  • Are you wanting to change the datatable contents or just report on it? – Ňɏssa Pøngjǣrdenlarp Aug 16 '16 at 16:35
  • Hi Plutonix thanks for replying. I just want to report on it, I was thinking maybe I could build a new datatable with the summary data, it was my intent to try and find any duplicates then sum the total of them, but I'm not sure the best way to do this. – Grey Walker Aug 16 '16 at 17:04

2 Answers2

1

Group the rows by name(s) or ID; use the name/ID as the identifier in a new anon type, and a Count/Total or Quantity property to collect the Sum of the Quantity:

Dim duplicates = myDT.AsEnumerable().
                 GroupBy(Function(r) New With {Key .Item = r.Field(Of String)("Name")}).
                 Select(Function(g) New With {Key .Name = g.Key.Item,
                                              Key .Count = g.Count,
                           Key .Total = g.Sum(Function(s) s.Field(Of Int32)("Quantity"))}).
                                      OrderByDescending(Function(j) j.Total).
                                      ToList()

For "fun" it also counts the number of source/group rows and orders them by the total.

enter image description here

Ňɏssa Pøngjǣrdenlarp
  • 38,411
  • 12
  • 59
  • 178
1

If you don't mind that the output is anonymous types instead of DataRows, you could use a LINQ query like this:

Dim summary = 
    From r In dt.AsEnumerable()
    Group By ForeName = r.Field(Of String)("ForeName"), SurName = r.Field(Of String)("SurName") Into Group
    Select ForeName, SurName, Quantity = Group.Sum(Function(x) x.Field(Of Integer)("Quantity"))
ForeName | SurName | Quantity
---------+---------+---------
Dave     | Smith   |    60000
John     | Peacock |    10000

You could use the answers to this question to convert it back into a DataTable, if needed.

Adding the count, like in Plutonix's answer would simply mean adding Group.Count to the Select clause:

Dim summary = 
    From r In dt.AsEnumerable()
    Group By ForeName = r.Field(Of String)("ForeName"), SurName = r.Field(Of String)("SurName") Into Group
    Select ForeName, SurName, Group.Count, Quantity = Group.Sum(Function(x) x.Field(Of Integer)("Quantity"))
ForeName | SurName | Count | Quantity
---------+---------+-------+---------
Dave     | Smith   |     3 |    60000
John     | Peacock |     1 |    10000
Community
  • 1
  • 1
Mark
  • 8,140
  • 1
  • 14
  • 29