How can I find duplicate rows in a DataTable when the columns that define a duplicate are dynamic? For example, one instance has 3 columns and the next has 4. My current code for both cases is below.

Case 1: data grouped by Color, Material, product_id
Case 2: data grouped by Color, Material, Size, product_id

Case 1:

var duplicates = (from row in dtImportedData.AsEnumerable()
                 let id = row.Field<string>("product_id")
                 let Color = row.Field<object>("Color")
                 let Material = row.Field<object>("Material")
                 group row by new { id, Color, Material } into grp
                 where grp.Count() > 1
                 select grp).ToList();

Case 2:

var duplicates = (from row in dtImportedData.AsEnumerable()
                 let id = row.Field<string>("product_id")
                 let Color = row.Field<object>("Color")
                 let Material = row.Field<object>("Material")
                 let Size = row.Field<object>("Size")
                 group row by new { id, Color, Material, Size } into grp
                 where grp.Count() > 1
                 select grp).ToList();
  • Try this one: https://stackoverflow.com/questions/15161180/use-linq-to-find-duplicated-rows-with-list-of-specified-columns – Vikas Gupta Apr 18 '19 at 05:44
  • It's up to the business definition of a duplicate. Use the columns that define the product as unique and match only on those. – Prateek Shrivastava Apr 18 '19 at 05:44
  • Since you are using `group`...`by`, each `group` will be unique by the grouping value. If you only want one row from each group, use `grp.First()`. You could also use a LINQ extension method (either from MoreLINQ or your own) `DistinctBy`. – NetMage Apr 18 '19 at 18:01
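
Following the suggestions in the comments, one way to avoid writing a separate query per column set is to pass the key columns as a list and group on an array of their values. The block below is a minimal sketch under that assumption, not a definitive implementation: `DuplicateFinder`, `FindDuplicates`, and `KeyComparer` are hypothetical names introduced here, and the key column names are assumed to exist in the table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class DuplicateFinder
{
    // Groups rows by the values of the given key columns and returns
    // only the groups that contain more than one row.
    public static List<List<DataRow>> FindDuplicates(
        DataTable table, IReadOnlyList<string> keyColumns)
    {
        return table.AsEnumerable()
            // Build the grouping key at run time from the requested columns.
            .GroupBy(row => keyColumns.Select(c => row[c]).ToArray(),
                     new KeyComparer())
            .Where(grp => grp.Count() > 1)
            .Select(grp => grp.ToList())
            .ToList();
    }

    // Arrays compare by reference by default, so the grouping needs a
    // comparer that compares the key values element by element.
    private sealed class KeyComparer : IEqualityComparer<object[]>
    {
        public bool Equals(object[] x, object[] y)
        {
            if (ReferenceEquals(x, y)) return true;
            if (x == null || y == null) return false;
            return x.SequenceEqual(y);
        }

        public int GetHashCode(object[] key)
        {
            unchecked
            {
                int hash = 17;
                foreach (var value in key)
                {
                    hash = hash * 31 + (value?.GetHashCode() ?? 0);
                }
                return hash;
            }
        }
    }
}
```

With this sketch, case 1 would be `DuplicateFinder.FindDuplicates(dtImportedData, new[] { "product_id", "Color", "Material" })`, and case 2 only adds `"Size"` to the array. Note that `row[c]` yields `DBNull.Value` for missing values, so rows with empty cells in the same key columns are grouped together, much as `null` values are in the original queries.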

0 Answers