How to properly use LINQ Query on List<KeyValuePair<string, string>> to create Dictionary<KeyValuePair<string, string>, int>?

Question

I'm trying to create a LINQ query on a List<KeyValuePair<string, string>> that tells me how many (if any) KeyValue duplicates are in the list. To do this, I tried to create a Dictionary<KeyValuePair<string, string>, int> where the dictionary's key is the KeyValuePair from the list and the dictionary's value is the number of times that pair occurs in the original list. My code compiles, but it currently tells me that every KeyValuePair in the list is duplicated.

To provide some context, this method is called in a .NET/DNN web form to validate a file that is uploaded by one of our clients; the form MAY have duplicate invoice numbers, but may not have duplicate invoice AND part numbers; consequently, the List<KeyValuePair<string, string>> represents a pairing of invoice numbers and part numbers. The dictionary's int value ought to report the number of times each invoice-and-part number pair appears in the master list. The data is being pulled from a GridView control that contains the data from the file upload.

I've worked through a bunch of LINQ articles on here over the past couple hours to try and replicate the code for creating a dictionary with such a list, but I've been unsuccessful. This article provided a lot of useful information:

C# LINQ find duplicates in List

Also, note that this code is being implemented in DNN, which makes it VERY difficult to debug. I'm unable to step through my code to identify the issue, so please have some patience with me.

private void CheckBranchNumbersFile(GridView Upload)
{
    List<KeyValuePair<string, string>> invoiceAndPart = new List<KeyValuePair<string, string>>();

    for (int i = 0; i < Upload.Rows.Count; ++i)
    {
        KeyValuePair<string, string> pair = new KeyValuePair<string, string>(Upload.Rows[i].Cells[2].ToString(), Upload.Rows[i].Cells[5].ToString());
        invoiceAndPart.Add(pair);
    }

    List<KeyValuePair<string, string>> invoiceAndPartUnsorted = new List<KeyValuePair<string, string>>(invoiceAndPart);

    var query = invoiceAndPart.GroupBy(x => x).Where(g => g.Count() > 1).ToDictionary(x => x.Key, y => y.Count());
    foreach (KeyValuePair<KeyValuePair<string, string>, int> invPartCount in query)
    {
        int count = invPartCount.Value;
        if (count > 1)
        {
            IsNotValid = true;
            for (int i = 0; i < invoiceAndPartUnsorted.Count; ++i)
            {
                if (invoiceAndPartUnsorted[i].Key.Equals(invPartCount.Key.Key) && invoiceAndPartUnsorted[i].Value.Equals(invPartCount.Key.Value))
                {
                    // This block highlights the cells on the review screen for the client to see erroneous data
                    Upload.Rows[i].Cells[2].BackColor = Color.Red;
                    Upload.Rows[i].Cells[5].BackColor = Color.Red;
                    Upload.Rows[i].Cells[2].ToolTip = "Cannot have duplicate invoice AND part numbers";
                    Upload.Rows[i].Cells[5].ToolTip = "Cannot have duplicate invoice AND part numbers";
                }
            }
        }
    }
}

See the following reproducible example:

// Populate list with sample invoice/part numbers, including some duplicates

List<KeyValuePair<string, string>> data = new List<string, string>();
KeyValuePair<string, string> sample1 = new KeyValuePair<string, string>("ABC", "100");
KeyValuePair<string, string> sample2 = new KeyValuePair<string, string>("FFF", "250");
KeyValuePair<string, string> sample3 = new KeyValuePair<string, string>("XYZ", "100");
KeyValuePair<string, string> sample4 = new KeyValuePair<string, string>("ABC", "100");
KeyValuePair<string, string> sample5 = new KeyValuePair<string, string>("ABC", "100");

data.Add(sample1);
data.Add(sample2);
data.Add(sample3);
data.Add(sample4);
data.Add(sample5);

// Create copy of data before data is grouped by LINQ query

List<KeyValuePair<string, string>> data2 = new List<string, string>(data);

// Perform LINQ Query to create Dictionary<KeyValuePair<string, string>, int> that reports number of occurrences of each KeyValuePair<string, string> in @variable data

var query = data.GroupBy(x => x).Where(g => g.Count() > 1).ToDictionary(x => x.Key, y => y.Count());

// Using foreach loop, identify the indices in @variable data2 that contain duplicated entries
foreach (KeyValuePair<KeyValuePair<string, string>, int> pair in query)
{
   int count = pair.Value;

   // This pair represents a duplicate because its value > 1
   if (count > 1)
   {
      // Find the entry in data2 that matches this pair
      for (int i = 0; i < data2.Count; ++i)
      {
         if (data2[i].Equals(pair.Key))
         {
            Console.WriteLine("Match in list data2 found at index: " + i);
         }
      }
   }
}

// The console should write:
// Match in list data2 found at index: 0
// Match in list data2 found at index: 3
// Match in list data2 found at index: 4

// Thank you! :)

I'm expecting the review screen to only mark the cells for rows with duplicate invoice-and-part numbers as errors, but it's marking those cells for each row of data in the file. For example, if there are 10 total rows in the input Excel file, and 3 of them include duplicate invoice AND part numbers, then those cells must be colored red and marked with the tool tip, NOT all 10 rows. Cells 2 and 5 of each row contain invoice number and part number respectively. Here is a screenshot of what it's doing now: https://gyazo.com/a2c8203627fe81f763c48008d0ba9e33

In this example, only the last 3 rows should have cells 2 and 5 highlighted red; the other cells highlighted in previous rows are OK (other validators for empty fields).

EDIT: Included a reproducible example. This is my first post on here so pls critique my etiquette! Thank you!

A [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) would be of great help to both you and anyone trying to help solve the problem. Please try pulling this code out of your solution and reworking it to focus as tightly as possible on the problem, in a way that it can be run independently of anything else. — nlawalker, Jun 18 '19 at 16:08
Your code doesn't compile, but making the obvious changes so that it does, I get the output that your comment indicates should result. Now that you have a reproducible example, have you tried pasting it into a new application and running it in the debugger to see where things aren't working as expected? — nlawalker, Jun 18 '19 at 16:34
@nlawalker One of my coworkers is walking me through the LINQ query I wrote, I'll debug it now — jokacherski, Jun 18 '19 at 16:39
@nlawalker Was not expecting it to work! Must be an issue in my last for loop. — jokacherski, Jun 18 '19 at 16:44

score 0 · Accepted Answer · answered Jun 18 '19 at 16:37

This works just fine:

         var data = new List<KeyValuePair<string, string>>
         {
            new KeyValuePair<string, string>("ABC", "100"),
            new KeyValuePair<string, string>("FFF", "250"),
            new KeyValuePair<string, string>("XYZ", "100"),
            new KeyValuePair<string, string>("ABC", "100"),
            new KeyValuePair<string, string>("ABC", "100")
         };

         // Create copy of data before data is grouped by LINQ query

         var data2 = data.ToList();

         // Perform LINQ Query to create Dictionary<KeyValuePair<string, string>, int> that reports number of occurrences of each KeyValuePair<string, string> in @variable data

         var query = data.GroupBy(x => x).Where(g => g.Count() > 1).ToDictionary(x => x.Key, y => y.Count());

         // Using foreach loop, identify the indices in @variable data2 that contain duplicated entries
         foreach (var pair in query)
         {
            int count = pair.Value;

            // This pair represents a duplicate because its value > 1
            if (count > 1)
            {
               // Find the entry in data2 that matches this pair
               for (int i = 0; i < data2.Count; ++i)
               {
                  if (data2[i].Equals(pair.Key))
                  {
                     Console.WriteLine("Match in list data2 found at index: " + i);
                  }
               }
            }
         }
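
The nested loop over data2 can also be folded into the query by carrying each item's index through the grouping. This is a minimal sketch using the same sample data; the names DuplicateFinder and FindDuplicateIndices are illustrative and do not come from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFinder
{
    // Hypothetical helper: maps every pair that occurs more than once
    // to the list of indices at which it occurs in the input.
    public static Dictionary<KeyValuePair<string, string>, List<int>> FindDuplicateIndices(
        IList<KeyValuePair<string, string>> data)
    {
        return data
            .Select((pair, index) => new { pair, index }) // remember each item's position
            .GroupBy(x => x.pair, x => x.index)           // pairs are equal when key and value match
            .Where(g => g.Count() > 1)                    // keep only pairs seen more than once
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var data = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("ABC", "100"),
            new KeyValuePair<string, string>("FFF", "250"),
            new KeyValuePair<string, string>("XYZ", "100"),
            new KeyValuePair<string, string>("ABC", "100"),
            new KeyValuePair<string, string>("ABC", "100")
        };

        // Prints indices 0, 3 and 4, one per line.
        foreach (var entry in FindDuplicateIndices(data))
        {
            foreach (int i in entry.Value)
            {
                Console.WriteLine("Match in list data found at index: " + i);
            }
        }
    }
}
```

Because the indices come straight out of the grouping, no second copy of the list is needed, and each duplicated row is visited once rather than once per duplicate group.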

That's very strange, but good to know! I didn't think my logic worked. Cheers. — jokacherski, Jun 18 '19 at 16:43
Couple of minor kinks. Look at "var data2 = data.ToList();" to clone your list and you need to be a little more careful at initialising the right types, but otherwise it's good. — Steve Todd, Jun 18 '19 at 16:50
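
Since the LINQ query behaves correctly on the sample data, the symptom in the question (every row flagged) probably originates elsewhere. One candidate, not confirmed anywhere in this thread, is how the pairs are built: `Upload.Rows[i].Cells[2].ToString()` falls through to `Object.ToString()`, which `TableCell` does not override, so it returns the type name instead of the cell text and every pair compares equal. The sketch below reproduces that effect without ASP.NET; `FakeCell` and `CellTextDemo` are hypothetical stand-ins, not real framework types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for System.Web.UI.WebControls.TableCell:
// it exposes a Text property and does not override ToString().
public class FakeCell
{
    public string Text { get; set; }
}

public static class CellTextDemo
{
    // Builds one pair per row with the supplied reader and counts
    // how many distinct pairs occur more than once.
    public static int CountDuplicateGroups(FakeCell[][] rows, Func<FakeCell, string> read)
    {
        return rows
            .Select(r => new KeyValuePair<string, string>(read(r[0]), read(r[1])))
            .GroupBy(p => p)
            .Count(g => g.Count() > 1);
    }

    public static void Main()
    {
        var rows = new[]
        {
            new[] { new FakeCell { Text = "ABC" }, new FakeCell { Text = "100" } },
            new[] { new FakeCell { Text = "FFF" }, new FakeCell { Text = "250" } }
        };

        // ToString() yields the type name for every cell, so both rows look identical.
        Console.WriteLine(CountDuplicateGroups(rows, c => c.ToString())); // prints 1

        // Text yields the actual values, so no duplicates are reported.
        Console.WriteLine(CountDuplicateGroups(rows, c => c.Text));       // prints 0
    }
}
```

If this is indeed the cause, reading `Upload.Rows[i].Cells[2].Text` (and likewise for cell 5) would be the change to try. Note that `Text` is only populated for bound fields; template fields keep their content in child controls.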
