0

i have DataTable which can have semi-duplicate rows. In the picture example two highlighted rows have all the same values but the amounts in 'Amount' columns. I would need to identify those rows and sum the amount $. Data comes from text file and there is no key to uniquely identify rows.

enter image description here

I looked at some answers like this one Best way to remove duplicate entries from a data table but in my case would need to match on not just one column but 10.

Also I tried different LINQ queries but was not successful in getting far.

This is the SQL query which does the job:

SELECT [Date1],[Date2],[Date3]
      ,SUM([Amount1]) as Summary1
      ,SUM([Amount2]) as Summary2
      ,SUM([Amount3]) as Summary3
      ,[col1],[col2],[Rate1],[Rate2],[Rate3],[product],[comment]
  FROM [Table]
  group by [Date1],[Date2],[Date3],[col1],[col2],[Rate1],[Rate2],[Rate3],[product],[comment]

EDIT: Just to clarify, SQL query was an example of how I could get a successful results if I was querying SQL table.

Community
  • 1
  • 1
AidaM
  • 233
  • 3
  • 16
  • You need to do `GroupBy` some features like Date 1, col 1, etc. – AD.Net May 08 '15 at 18:20
  • so what's the problem and or question here @AidaM does the query yield the proper results..? also if you are loading / reading the data from a textfile.. this is actually quite easy to do .. are you familiar with something called `Control Break Processing` google how to use `GroupBy` if you are running from Sql Query – MethodMan May 08 '15 at 18:21
  • I have just tested the query in SQL but my application data nor comes nor goes to SQL. I process input data by saving it to datatable and after processing will be exported as Excel file. I need to do what the SQL query example would do with either LINQ or some other way. @MethodMan – AidaM May 08 '15 at 18:27
  • I think that creating a class that mimics the file layout would be a great start.. then Load all of the text data into a string[] loop through the file checking on the 2 data fields .. if they are the same then capture / calculate the amounts.. when the values change then you have your totals and use the next fields date values as your new value to check in a loop if they are equal to the previous.. that's just one way you can do this.. control break processing.. – MethodMan May 08 '15 at 18:32
  • yes, creating a class file to match input fields would be a way to do it. There are already few corrections ('break') processes that fix other column data before this process. Also the 'file checking' would need to be on 10 column fields not 2 as you mention. The two would have to be summed if others are the same. @MethodMan – AidaM May 08 '15 at 21:06

1 Answers1

0

This is how you would do it with EF:

var result=db.Table
  .GroupBy(r=>new {
      r.Date1,
      r.Date2,
      r.Date3,
      r.col1,
      r.col2,
      r.Rate1,
      r.Rate2,
      r.Rate3,
      r.product,
      r.comment},
  p=>new {
      p.Amount1,
      p.Amount2,
      p.Amount3},
  (key,vals)=>new {
      key.Date1,
      key.Date2,
      key.Date3,
      Amount1=vals.Sum(v=>v.Amount1),
      Amount2=vals.Sum(v=>v.Amount2),
      Amount3=vals.Sum(v=>v.Amount3),
      key.col1,
      key.col2,
      key.Rate1,
      key.Rate2,
      key.Rate3,
      key.product,
      key.comments}
      );

Slight modification if you are actually doing it from a dataTable:

var result=dt.AsEnumerable()
  .GroupBy(d=>new {
      Date1=d.Field<datetime>("Date1"),
      Date2=d.Field<datetime>("Date2"),
      Date3=d.Field<datetime>("Date3"),
      col1=d.Field<string>("col1"),
      col2=d.Field<string>("col2"),
      Rate1=d.Field<decimal>("Rate1"),
      Rate2=d.Field<decimal>("Rate2"),
      Rate3=d.Field<decimal>("Rate3"),
      product=d.Field<string>("product"),
      comments=d.Field<string>("comments")},
  p=>new {
      Amount1=p.Field<decimal>("Amount1"),
      Amount2=p.Field<decimal>("Amount2"),
      Amount3=p.Field<decimal>("Amount3")},
  (key,vals)=>new {
      key.Date1,
      key.Date2,
      key.Date3,
      Amount1=vals.Sum(v=>v.Amount1),
      Amount2=vals.Sum(v=>v.Amount2),
      Amount3=vals.Sum(v=>v.Amount3),
      key.col1,
      key.col2,
      key.Rate1,
      key.Rate2,
      key.Rate3,
      key.product,
      key.comments}
      );

If you need the result as a datatable, you can use one of the many List To Datatable extension methods out there. Just add ".ToList().AsDataTable()" at the end of the above query to get it back into a datatable.

Robert McKee
  • 21,305
  • 1
  • 43
  • 57
  • 1
    There are some example extension methods in this thread: http://stackoverflow.com/questions/9937573/convert-select-new-to-datatable – Robert McKee May 08 '15 at 18:59
  • Thank you for the help, that worked great! I looked into converting it back to DataTable but didn't have to; instead I looped through each result row and generated my final output Excel file. @Robert-McKee – AidaM May 08 '15 at 20:48