0

I currently have what I believe is a lambda function in C# (I'm fairly new to coding and haven't used a lambda function before, so go easy). It groups the duplicate strings in a list (FilteredList), counts the number of occurrences of each, and stores that value in Count. I only want the most used word from the list, which I've managed to get with the "groups.OrderBy(...)" line. However, I'm pretty sure I've made this very complicated for myself and very inefficient, not least by adding the dictionary and the key/value pairs.

   
var groups =
    from s in FilteredList
    group s by s into g
    // orderby g descending
    select new
    {
        Stuff = g.Key,
        Count = g.Count()
    };

groups = groups.OrderBy(g => g.Count).Reverse().Take(1);

var dictionary = groups.ToDictionary(g => g.Stuff, g => g.Count);

foreach (KeyValuePair<string, int> kvp in dictionary)
{
    Console.WriteLine("Key = {0}, Value = {1}", kvp.Key, kvp.Value);
}
            

Would someone please either help me through this and explain a little of it to me, or at least point me in the direction of some learning materials that may help me better understand it?

For extra info: The FilteredList comes from a large piece of external text, read into a List of strings (split by delimiters), minus a list of string stop words.

Also, if this is not a lambda function or I've got any of the info in here incorrect, please kindly correct me so I can fix the question to be more relevant & help me find an answer.

Thanks in advance.

OllieVanD

3 Answers

1

This should work:

var mostPopular = groups
    .GroupBy(item => new {item.Stuff, item.Count})
    .Select(g=> g.OrderByDescending(x=> x.Count).FirstOrDefault())
    .ToList();

OrderByDescending along with .First() combines your usage of OrderBy, Reverse() and Take.
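As a minimal sketch of that idea (separate from the snippet above), sorting descending and taking the first element yields a single item rather than a list. It assumes the input is a plain collection of strings like the question's FilteredList; the TopGroup class and MostUsed method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopGroup
{
    // Returns the most frequent word and its count as a tuple.
    // Note: First() throws InvalidOperationException if the input is empty.
    public static (string Word, int Count) MostUsed(IEnumerable<string> filteredList)
    {
        var groups = filteredList
            .GroupBy(s => s)
            .Select(g => new { Stuff = g.Key, Count = g.Count() });

        // OrderByDescending + First replaces OrderBy + Reverse + Take(1).
        var top = groups.OrderByDescending(g => g.Count).First();
        return (top.Stuff, top.Count);
    }
}
```

The tuple can be split at the call site, for example `var (word, count) = TopGroup.MostUsed(FilteredList);`, which gives the word and its count as two separate variables.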

Ravi Kiran
  • @mjwills Right! Thanks for pointing out. – Ravi Kiran Dec 28 '20 at 12:21
  • This has worked well to order the list properly; however, when I run "foreach ( var i in mostPopular) {Console.WriteLine(i);}", I receive the whole ordered list and not just the top value. Am I missing another step to get that value? Also, I don't suppose you'd know how to then get just the value separated out from the List, would you? – OllieVanD Dec 28 '20 at 13:10
1

Yes, I think you have overcomplicated it somewhat. Assuming your list of words is like:

var words = new[] { "what's", "the", "most", "most", "most", "mentioned", "word", "word" };

You can get the most mentioned word with:

words.GroupBy(w => w).OrderByDescending(g => g.Count()).First().Key;

Of course, you'd probably want to assign it to a variable, and presentationally you might want to break it into multiple lines:

var mostFrequentWord = words
  .GroupBy(w => w)                       //make a list of sublists of words, like a dictionary of word:list<word>
  .OrderByDescending(g => g.Count())     //order by sublist count descending
  .First()                               //take the first list:sublist
  .Key;                                  //take the word 

The GroupBy produces a collection of IGroupings, which is a bit like a Dictionary<string, List<string>>: it maps each word (the key) to a sequence of all the occurrences of that word. In my example data, the IGrouping with the Key of "most" contains {"most","most","most"}, which has the highest count of elements at 3. If we OrderByDescending the groupings based on the Count() of each one and then take the First, we get the IGrouping with a Key of "most", so all we need to do to retrieve the actual word is pull the Key out.
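To see what the groupings look like before any sorting, here is a small sketch that walks over them and records each key with its count. GroupingDemo and Describe are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Describes each group as "word x count", in the order GroupBy yields them.
    public static List<string> Describe(IEnumerable<string> words)
    {
        var lines = new List<string>();
        foreach (IGrouping<string, string> g in words.GroupBy(w => w))
        {
            // g.Key is the word; enumerating g yields every occurrence of it.
            lines.Add($"{g.Key} x {g.Count()}");
        }
        return lines;
    }
}
```

With the example array above, the entry for "most" comes out as "most x 3", which is the group that OrderByDescending moves to the front.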

If the word is just one of the properties of a larger object, then you can .GroupBy(o => o.Word). If you want some other property from the IGrouping, such as that of its first or last element, then you can take that instead of the Key, but bear in mind that the property you end up taking might be different each time unless you enforce an ordering of the items inside the grouping.
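A rough sketch of grouping by a property follows. The Token type, its Position property, and the TokenDemo class are all hypothetical and stand in for whatever larger object holds the word. The inner OrderBy is what makes "first" mean the same thing on every run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration; not from the original question.
public class Token
{
    public string Word { get; set; } = "";
    public int Position { get; set; }
}

public static class TokenDemo
{
    // Returns the most frequent word and the lowest Position at which it occurs.
    public static (string Word, int FirstPosition) MostFrequent(IEnumerable<Token> tokens)
    {
        var top = tokens
            .GroupBy(t => t.Word)
            .OrderByDescending(g => g.Count())
            .First();

        // Sort inside the group so the chosen item does not depend on input order.
        int firstPosition = top.OrderBy(t => t.Position).First().Position;
        return (top.Key, firstPosition);
    }
}
```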


If you want to make this more efficient then you can install MoreLinq and use MaxBy; getting the Max word By the count of the lists means you can avoid a sort operation. You could also avoid LINQ and use a dictionary:

string[] words = new[] { "what", "is", "the", "most", "most", "most", "mentioned", "word", "word" };

var maxK = "";
var maxV = -1;
var d = new Dictionary<string, int>();
foreach(var w in words){
  if(!d.ContainsKey(w))
    d[w] = 0;
  d[w]++;
  if(d[w] > maxV){
    maxK = w; 
    maxV = d[w];
  }
}
Console.WriteLine(maxK);

This keeps a dictionary that counts words as it goes. It will be more efficient than the LINQ route because it needs only a single pass over the word list plus the associated dictionary lookups, in contrast to "convert wordlist to list of sublists, sort list of sublists by sublist count, take first list item".
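For the MaxBy route mentioned earlier, here is a hedged sketch. It swaps in the MaxBy that ships with LINQ from .NET 6 onwards instead of the MoreLinq package, so it assumes a .NET 6 or later target; MoreLinq's own MaxBy may have a different return shape depending on its version. MaxByExample and MostFrequent are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaxByExample
{
    // Picks the group with the largest count without sorting all the groups.
    // Returns an empty string when there are no words at all.
    public static string MostFrequent(IEnumerable<string> words)
    {
        // Assumes Enumerable.MaxBy (.NET 6+); it returns null for an empty sequence of groups.
        var best = words.GroupBy(w => w).MaxBy(g => g.Count());
        return best == null ? "" : best.Key;
    }
}
```

This still builds the groups first, so it sits between the sorted LINQ version and the dictionary loop in terms of work done.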

Caius Jard
  • Hi, I appreciate the explanation, it is starting to make a lot more sense! However, I'm struggling to get your first suggestion to work. My List is currently named FilteredList and only has strings in it. I've put your code in as shown here: "FilteredList.GroupBy(w => w).OrderByDescending(g => g.Count()).First().Key;" I don't suppose I've made a mistake here, have I? I received a suggestion from VS to set everything equal to a string variable, which then displays the entire ordered list (not descending); however, I would then just need to get the bottom (top if descending works) entry. Thanks. – OllieVanD Dec 28 '20 at 13:28
  • 1
    Your FilteredList only has strings in it, like my string array only has strings in it? There isn't a mistake in what you've written, if FilteredList truly is a collection of strings.. You simply need to set the result of the operation equal to a variable.. e.g. `var mostFrequentWord = FilteredList.GroupBy(w => w).OrderByDescending(g => g.Count()).First().Key;` – Caius Jard Dec 28 '20 at 13:51
  • That's solved it, thank you! – OllieVanD Dec 28 '20 at 13:57
-1

The first part is a LINQ query that builds the groups from FilteredList.

var groups =
    from s in FilteredList
    group s by s into g
    // orderby g descending
    select new
    {
        Stuff = g.Key,
        Count = g.Count()

    };

The lambda usage starts where the => operator appears. A lambda is a small inline function: the parameter goes on the left of => and the expression to evaluate goes on the right. Example from your code:

groups = groups.OrderBy(g => g.Count).Reverse().Take(1);

Reading this, 'g' stands for each element of 'groups' in turn, and the lambda returns that element's 'Count' property, which OrderBy uses as the sort key. 'Reverse' then flips the ascending order, and 'Take(1)' keeps only the first element.
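To make the distinction concrete, here is a small sketch comparing the two styles. The first part of the question's code is a query expression, which contains no => as written; the compiler translates it into method calls that take lambdas, roughly like the second form below. QueryVsMethod and SameCounts are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryVsMethod
{
    // Builds the same word counts in both styles and checks that they match.
    public static bool SameCounts(IEnumerable<string> filteredList)
    {
        // Query syntax, as in the question: no => appears here.
        var viaQuery =
            from s in filteredList
            group s by s into g
            select new { Stuff = g.Key, Count = g.Count() };

        // Method syntax: the same operations, with each step taking a lambda.
        var viaMethods = filteredList
            .GroupBy(s => s)
            .Select(g => new { Stuff = g.Key, Count = g.Count() });

        // Anonymous types compare by property values, so SequenceEqual works here.
        return viaQuery.SequenceEqual(viaMethods);
    }
}
```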

As for documentation, best to search inside Stack Overflow, please check these links:

Second step: if the data comes from an external source and there are no performance issues, you can leave the code as it is and refactor later. A more detailed analysis of the data is needed to be sure another algorithm would work.

Nimantha
Pimenta