I have a list that I currently fill and group like this:

public static List<CustomDTO> mostCommonKeywords { get; set; }

The list is grouped and sorted as follows:

  mostCommonKeywords = key.GroupBy(v2 => v2)
                 .Select(g => new CustomDTO { Key = g.Key, Count = g.Count() })
                 .OrderByDescending(e => e.Count).Distinct()
                 .ToList(); 

where key is a list of strings, declared like this:

var key = new List<string>();

Each string in the key list consists of three words. Strings that are equal need to be merged (or grouped, whichever term you prefer) into a single entry.

The grouping above gives me these results:

Samsung Galaxy S7   
Galaxy S7 edge  
Galaxy S7 Edge  
S7 edge SM  
Samsung Galaxy S7   
Samsung Galaxy S7   

As you can see, this list contains duplicates. I need the results to look like this:

Samsung Galaxy S7   
Galaxy S7 edge  
S7 edge SM  

So basically, wherever the same string occurs more than once, I need to merge the occurrences into one.

What am I doing wrong here?

Edit: Here is what the CustomDTO class looks like:

public class CustomDTO
{
    public string Key { get; set; }
    public int Count { get; set; }

    public List<int> Sales = new List<int>();
}

Edit: The catch is that I append a sale number to each three-word string, so that I know how many sales each keyword has.

This is how I've done it:

for (int i = 0; i < filtered.Count; i++)
{
    foreach (var triad in GetAllWords(filtered[i]))
    {
        var sequence = triad[0] + " " + triad[1] + " " + triad[2];
        key.Add(sequence + " " + lista[i].SaleNumber);
    }
}

This is the part that makes the string "not unique":

 + lista[i].SaleNumber

Edit:

mostCommonKeywords is a list of CustomDTO objects, each of which consists of:

public string Key { get; set; } 
public int Count { get; set; }
public List<int> Sales = new List<int>(); 

Suppose that, in the end, the list looks like this:

Key                  Sales
Samsung Galaxy S7    5
Galaxy S7 edge       4
Galaxy S7 Edge       4
S7 edge SM           3
Samsung Galaxy S7    6
Samsung Galaxy S7    7

How can I now find all these duplicates and sum their sales, so that the list looks like this:

Samsung Galaxy S7    18
Galaxy S7 edge       8
S7 edge SM           3
User987

2 Answers

When you group the strings, you can pass an IEqualityComparer<string> to ignore case:

var keywords = key.GroupBy(v2 => v2, StringComparer.InvariantCultureIgnoreCase)
                  .Select(g => new CustomDTO { Key = g.Key, Count = g.Count() })
                  .OrderByDescending(e => e.Count).Distinct()
                  .ToList();

EDIT:

If items are something like { string Key, int Sale }, you can Sum() the Sale property like this:

var keywords = items.GroupBy(v2 => v2.Key, StringComparer.InvariantCultureIgnoreCase)
                  .Select(g => new CustomDTO
                  {
                      Key = g.Key,
                      Count = g.Count(),
                      Sales = g.Sum(k => k.Sale)
                  })
                  .OrderByDescending(e => e.Count).Distinct()
                  .ToList(); 

Note: For this to compile, CustomDTO.Sales must be of type int, not List<int>.
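To make the query above concrete, here is a minimal, self-contained sketch that runs it on the sample data from the question. The types KeywordSale and KeywordTotal and the class KeywordStats are hypothetical names introduced only for this illustration. The sketch assumes the keyword and the sale number are kept as separate values rather than concatenated into one string, and it orders by the summed sales, which is a choice made here rather than something the question requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input shape: one keyword phrase plus the sale number it came from.
public class KeywordSale
{
    public string Key { get; set; }
    public int Sale { get; set; }
}

// Hypothetical result shape: Sales is a single int that holds the sum.
public class KeywordTotal
{
    public string Key { get; set; }
    public int Count { get; set; }
    public int Sales { get; set; }
}

public static class KeywordStats
{
    // Groups phrases case-insensitively and sums the sales of each group.
    public static List<KeywordTotal> SumSalesByKeyword(IEnumerable<KeywordSale> items)
    {
        return items
            .GroupBy(i => i.Key, StringComparer.InvariantCultureIgnoreCase)
            .Select(g => new KeywordTotal
            {
                Key = g.Key,                 // spelling of the first item in the group
                Count = g.Count(),
                Sales = g.Sum(i => i.Sale)
            })
            .OrderByDescending(t => t.Sales) // assumption: highest total first
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var items = new List<KeywordSale>
        {
            new KeywordSale { Key = "Samsung Galaxy S7", Sale = 5 },
            new KeywordSale { Key = "Galaxy S7 edge",    Sale = 4 },
            new KeywordSale { Key = "Galaxy S7 Edge",    Sale = 4 },
            new KeywordSale { Key = "S7 edge SM",        Sale = 3 },
            new KeywordSale { Key = "Samsung Galaxy S7", Sale = 6 },
            new KeywordSale { Key = "Samsung Galaxy S7", Sale = 7 },
        };

        foreach (var t in KeywordStats.SumSalesByKeyword(items))
            Console.WriteLine($"{t.Key} {t.Sales}");
        // Prints:
        // Samsung Galaxy S7 18
        // Galaxy S7 edge 8
        // S7 edge SM 3
    }
}
```

Keeping the sale number out of the grouping key is what lets equal phrases end up in the same group.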

Arturo Menchaca
  • Arturo, quick question, imagine if I left the duplicate strings like that.. Would there be a way for me to loop through them and find all duplicate strings and sum their sales all at once?? As you can see I have a list of sales for each keyword... I would have to loop through the entire list and find identical ones and sum them, and THEN add the duplicates (i.e. add only 1 duplicate string into new list to ensure I have no duplicates in it ) into the new 3rd list which would contain all the sum of sales... – User987 Nov 01 '16 at 15:44
  • @User987: I don't understand exactly what you ask, please add an example. – Arturo Menchaca Nov 01 '16 at 15:49
  • I've updated my original question ,can you look into it ? – User987 Nov 01 '16 at 15:54
  • Is there any way to make it with Sales being List? – User987 Nov 01 '16 at 16:04
  • If you want the sales values (not sum) you can do `Sales = g.Select(k => k.Sale).ToList()`, but in your output you are showing a single value that is the sum, so `Sales` must be an integer, not list. – Arturo Menchaca Nov 01 '16 at 16:07
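As a sketch of the variant mentioned in the last comment, the following keeps Sales as a List<int> and computes the sum only when it is displayed. The CustomDTO class is copied from the question. The class SalesGrouping and the tuple-shaped input are assumptions made for this illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// CustomDTO as declared in the question: Sales stays a list.
public class CustomDTO
{
    public string Key { get; set; }
    public int Count { get; set; }

    public List<int> Sales = new List<int>();
}

public static class SalesGrouping
{
    // Hypothetical helper: groups (Key, Sale) pairs case-insensitively
    // and collects the individual sale values of each group.
    public static List<CustomDTO> GroupKeepingSales(IEnumerable<(string Key, int Sale)> items)
    {
        return items
            .GroupBy(i => i.Key, StringComparer.InvariantCultureIgnoreCase)
            .Select(g => new CustomDTO
            {
                Key = g.Key,
                Count = g.Count(),
                Sales = g.Select(i => i.Sale).ToList()
            })
            .OrderByDescending(d => d.Count)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var items = new List<(string Key, int Sale)>
        {
            ("Samsung Galaxy S7", 5),
            ("Galaxy S7 edge", 4),
            ("Galaxy S7 Edge", 4),
            ("S7 edge SM", 3),
            ("Samsung Galaxy S7", 6),
            ("Samsung Galaxy S7", 7),
        };

        foreach (var dto in SalesGrouping.GroupKeepingSales(items))
            Console.WriteLine($"{dto.Key}: [{string.Join(", ", dto.Sales)}] sum={dto.Sales.Sum()}");
        // Prints:
        // Samsung Galaxy S7: [5, 6, 7] sum=18
        // Galaxy S7 edge: [4, 4] sum=8
        // S7 edge SM: [3] sum=3
    }
}
```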
GroupBy takes a second parameter where you can specify an IEqualityComparer<string>.

This should work. You do not need the Distinct call, because GroupBy already yields exactly one group per distinct key.

var mostCommonKeywords = key.GroupBy(v2 => v2, StringComparer.OrdinalIgnoreCase)
        .Select(g => new CustomDTO { Key = g.Key, Count = g.Count() })
        .OrderByDescending(e => e.Count)
        .ToList();
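For illustration, here is a small runnable sketch of that query applied to the strings from the question. CustomDTO is trimmed down to the two properties used here, and KeywordCounter is a hypothetical helper name. The output shows that each group keeps the spelling of its first element, so "Galaxy S7 Edge" is folded into "Galaxy S7 edge".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down version of the question's CustomDTO (Sales omitted).
public class CustomDTO
{
    public string Key { get; set; }
    public int Count { get; set; }
}

public static class KeywordCounter
{
    // Hypothetical helper: counts phrases, treating different casing as equal.
    public static List<CustomDTO> CountIgnoringCase(IEnumerable<string> keys)
    {
        return keys
            .GroupBy(k => k, StringComparer.OrdinalIgnoreCase)
            .Select(g => new CustomDTO { Key = g.Key, Count = g.Count() })
            .OrderByDescending(e => e.Count)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var key = new List<string>
        {
            "Samsung Galaxy S7",
            "Galaxy S7 edge",
            "Galaxy S7 Edge",
            "S7 edge SM",
            "Samsung Galaxy S7",
            "Samsung Galaxy S7",
        };

        foreach (var k in KeywordCounter.CountIgnoringCase(key))
            Console.WriteLine($"{k.Key}: {k.Count}");
        // Prints:
        // Samsung Galaxy S7: 3
        // Galaxy S7 edge: 2
        // S7 edge SM: 1
    }
}
```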
Shyju
  • quick question, imagine if I left the duplicate strings like that.. Would there be a way for me to loop through them and find all duplicate strings and sum their sales all at once?? As you can see I have a list of sales for each keyword... I would have to loop through the entire list and find identical ones and sum them, and THEN add the duplicates (i.e. add only 1 duplicate string into new list to ensure I have no duplicates in it ) into the new 3rd list which would contain all the sum of sales... – User987 Nov 01 '16 at 15:45