I'm trying to write a LINQ query against a List<KeyValuePair<string, string>> that tells me how many (if any) duplicate key/value pairs are in the list. To do this, I tried to build a Dictionary<KeyValuePair<string, string>, int> where the dictionary's key is the KeyValuePair from the list and the dictionary's value is the number of times that pair occurs in the original list. My code compiles, but it currently reports every KeyValuePair in the list as a duplicate.
To provide some context, this method is called in a .NET/DNN web form to validate a file uploaded by one of our clients. The form MAY have duplicate invoice numbers, but it may not have the same invoice number AND part number combination more than once; consequently, the List<KeyValuePair<string, string>> represents a pairing of invoice numbers and part numbers. The dictionary's int value ought to report the number of times each invoice-and-part-number pair appears in the master list. The data is pulled from a GridView control that holds the contents of the uploaded file.
I've worked through a bunch of LINQ articles on here over the past couple of hours, trying to replicate the code for creating a dictionary from such a list, but I've been unsuccessful. This article provided a lot of useful information:
C# LINQ find duplicates in List
Also, note that this code runs inside DNN, which makes it VERY difficult to debug. I'm unable to step through my code to identify the issue, so please have some patience with me.
    private void CheckBranchNumbersFile(GridView Upload)
    {
        List<KeyValuePair<string, string>> invoiceAndPart = new List<KeyValuePair<string, string>>();
        for (int i = 0; i < Upload.Rows.Count; ++i)
        {
            KeyValuePair<string, string> pair = new KeyValuePair<string, string>(Upload.Rows[i].Cells[2].ToString(), Upload.Rows[i].Cells[5].ToString());
            invoiceAndPart.Add(pair);
        }
        List<KeyValuePair<string, string>> invoiceAndPartUnsorted = new List<KeyValuePair<string, string>>(invoiceAndPart);
        var query = invoiceAndPart.GroupBy(x => x).Where(g => g.Count() > 1).ToDictionary(x => x.Key, y => y.Count());
        foreach (KeyValuePair<KeyValuePair<string, string>, int> invPartCount in query)
        {
            int count = invPartCount.Value;
            if (count > 1)
            {
                IsNotValid = true;
                for (int i = 0; i < invoiceAndPartUnsorted.Count; ++i)
                {
                    if (invoiceAndPartUnsorted[i].Key.Equals(invPartCount.Key.Key) && invoiceAndPartUnsorted[i].Value.Equals(invPartCount.Key.Value))
                    {
                        // This block highlights the cells on the review screen for the client to see erroneous data
                        Upload.Rows[i].Cells[2].BackColor = Color.Red;
                        Upload.Rows[i].Cells[5].BackColor = Color.Red;
                        Upload.Rows[i].Cells[2].ToolTip = "Cannot have duplicate invoice AND part numbers";
                        Upload.Rows[i].Cells[5].ToolTip = "Cannot have duplicate invoice AND part numbers";
                    }
                }
            }
        }
    }
See the following reproducible example:
    // Populate list with sample invoice/part numbers, including some duplicates
    List<KeyValuePair<string, string>> data = new List<KeyValuePair<string, string>>();
    KeyValuePair<string, string> sample1 = new KeyValuePair<string, string>("ABC", "100");
    KeyValuePair<string, string> sample2 = new KeyValuePair<string, string>("FFF", "250");
    KeyValuePair<string, string> sample3 = new KeyValuePair<string, string>("XYZ", "100");
    KeyValuePair<string, string> sample4 = new KeyValuePair<string, string>("ABC", "100");
    KeyValuePair<string, string> sample5 = new KeyValuePair<string, string>("ABC", "100");
    data.Add(sample1);
    data.Add(sample2);
    data.Add(sample3);
    data.Add(sample4);
    data.Add(sample5);
    // Create a copy of data before it is grouped by the LINQ query
    List<KeyValuePair<string, string>> data2 = new List<KeyValuePair<string, string>>(data);
    // Perform LINQ query to create a Dictionary<KeyValuePair<string, string>, int> that reports the number of occurrences of each KeyValuePair<string, string> in data
    var query = data.GroupBy(x => x).Where(g => g.Count() > 1).ToDictionary(x => x.Key, y => y.Count());
    // Using a foreach loop, identify the indices in data2 that contain duplicated entries
    foreach (KeyValuePair<KeyValuePair<string, string>, int> pair in query)
    {
        int count = pair.Value;
        // This pair represents a duplicate because its value > 1
        if (count > 1)
        {
            // Find the entries in data2 that match this pair
            for (int i = 0; i < data2.Count; ++i)
            {
                if (data2[i].Equals(pair.Key))
                {
                    Console.WriteLine("Match in list data2 found at index: " + i);
                }
            }
        }
    }
    // The console should write:
    // Match in list data2 found at index: 0
    // Match in list data2 found at index: 3
    // Match in list data2 found at index: 4
    // Thank you! :)
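For comparison, here is a minimal sketch of the same duplicate check that carries each entry's index through the grouping, so no second pass over a copied list is needed. The class name DuplicateFinder and the method FindDuplicateIndices are hypothetical names introduced only for this illustration; the sample data is the same as in the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original code.
public static class DuplicateFinder
{
    // Returns the zero-based indices of every entry whose (key, value) pair
    // occurs more than once in the list, in ascending order.
    public static List<int> FindDuplicateIndices(List<KeyValuePair<string, string>> data)
    {
        return data
            .Select((pair, index) => new { pair, index }) // remember each entry's position
            .GroupBy(x => x.pair, x => x.index)           // group positions by pair
            .Where(g => g.Count() > 1)                    // keep only duplicated pairs
            .SelectMany(g => g)                           // flatten groups back to indices
            .OrderBy(i => i)
            .ToList();
    }

    public static void Main()
    {
        var data = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("ABC", "100"),
            new KeyValuePair<string, string>("FFF", "250"),
            new KeyValuePair<string, string>("XYZ", "100"),
            new KeyValuePair<string, string>("ABC", "100"),
            new KeyValuePair<string, string>("ABC", "100"),
        };

        // Prints indices 0, 3 and 4, one per line.
        foreach (int i in FindDuplicateIndices(data))
        {
            Console.WriteLine("Match in list data found at index: " + i);
        }
    }
}
```

Because KeyValuePair<string, string> is a struct, GroupBy compares entries by their key and value rather than by reference, which is why grouping on the pair itself behaves as intended with plain string data.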
I'm expecting the review screen to mark only the cells in rows with duplicate invoice-and-part numbers as errors, but it's marking those cells in every row of the file. For example, if the input Excel file has 10 rows and 3 of them share the same invoice AND part number, then only those 3 rows should have their cells colored red and given the tooltip, NOT all 10 rows. Cells 2 and 5 of each row contain the invoice number and part number, respectively. Here is a screenshot of what it's doing now: https://gyazo.com/a2c8203627fe81f763c48008d0ba9e33
In this example, only the last 3 rows should have cells 2 and 5 highlighted red; the other highlighted cells in the earlier rows are expected (they come from other validators for empty fields).
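One possible explanation for every row being flagged, offered as an assumption and not a confirmed diagnosis: if the cell type does not override ToString(), calling ToString() on a cell returns the type name instead of the cell's text, so every pair built that way would be identical. The sketch below imitates that with a hypothetical FakeCell class (a stand-in, not the real WebForms cell type) and compares pairs built from ToString() with pairs built from a Text property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a grid cell: exposes Text, does not override ToString().
public class FakeCell
{
    public string Text { get; set; }
}

public static class ToStringPitfall
{
    // Same grouping as in the question: pair -> number of occurrences, duplicates only.
    public static Dictionary<KeyValuePair<string, string>, int> CountDuplicates(
        IEnumerable<KeyValuePair<string, string>> pairs)
    {
        return pairs.GroupBy(p => p)
                    .Where(g => g.Count() > 1)
                    .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var rows = new[]
        {
            new[] { new FakeCell { Text = "ABC" }, new FakeCell { Text = "100" } },
            new[] { new FakeCell { Text = "FFF" }, new FakeCell { Text = "250" } },
            new[] { new FakeCell { Text = "ABC" }, new FakeCell { Text = "100" } },
        };

        // ToString() yields the type name for every cell, so all rows look identical.
        var viaToString = rows.Select(r =>
            new KeyValuePair<string, string>(r[0].ToString(), r[1].ToString()));

        // Text yields the actual cell contents.
        var viaText = rows.Select(r =>
            new KeyValuePair<string, string>(r[0].Text, r[1].Text));

        Console.WriteLine(CountDuplicates(viaToString).Values.Single()); // 3: every row "duplicated"
        Console.WriteLine(CountDuplicates(viaText).Values.Single());     // 2: only the real duplicates
    }
}
```

If this is what is happening in the GridView code, reading each cell's Text property may be worth trying; note that for template columns the text can live in a child control instead, so this is only a starting point for investigation.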
EDIT: Included a reproducible example. This is my first post on here, so please critique my etiquette! Thank you!