I currently have a List (searchResults) of strings, each a sentence from a large piece of text (SentenceList) that contains the most popular word (mostPopular). This all works well: I can count the number of occurrences of the word in each sentence in the second foreach loop, shown by the d++. However, I'm having trouble ordering the sentences in searchResults by that count, d.

List<string> searchResults = SentenceList.FindAll(s => s.Contains(mostPopular));

foreach (string i in searchResults)
{
    int d = 0;
    string[] T = i.Split(" ");
    foreach (string l in T)
    {
        if (l.Contains(mostPopular)) { d++; }
    }
    Console.WriteLine(i + d);
}

Any help would be greatly appreciated, as would any recommendations on improving the question to help me find an answer!

My overall goal is to find the sentence which has the most occurrences of the most popular word. I need the sentences in an ordered list because I then want to select a number of them, depending on the value typed in by the user.

Many thanks

OllieVanD
  • Take a look at https://stackoverflow.com/a/15577523/224370 for one way to count occurrences of a word in a string. There are other alternative answers on Stackoverflow too. – Ian Mercer Dec 30 '20 at 00:48
  • Cheers Ian, I've read that and managed to count the occurrences of the word in each List entry; I now just need to order the list by that count. – OllieVanD Dec 30 '20 at 00:52

4 Answers

1

This is very inefficient, as the inner loop is generating the split every time. In any case, don't write a sorting algorithm yourself; use the library functions.

List<(string s, int c)> searchResults = SentenceList
    .Where(s => s.Contains(mostPopular))
    // FindAll materializes a list immediately; Where is evaluated lazily
    .Select(s => (s, s.Split(" ").Count(word => word.Contains(mostPopular)))
    // Select tuple of string and count of matches
    .ToList(); // materialize only at the end

searchResults.Sort((a, b) => a.c.CompareTo(b.c));
    //  This lambda takes two tuples a and b and compares their counts (ascending order).
    //  To invert the order (highest count first), add a - (minus) after the =>

//If you just need to get the top one: (for this you could use IEnumerable and remove ToList above)
(string s, int c) highest = default;
foreach (var tuple in searchResults)
{
    if(tuple.c > highest.c)
        highest = tuple;
}
Charlieface
  • Thanks for the answer, I was looking for something like this, but I can't seem to get lambda functions down yet. I'm having trouble compiling the line that compares the counts ("List.Sort..."); the compiler says that using the generic type List requires 1 type argument. I don't suppose you'd be able to help me out again? – OllieVanD Dec 30 '20 at 01:33
  • Sorry, wrote from memory and mixed it up with Array.Sort. Array.Sort is static, List.Sort is an instance method. Have edited. – Charlieface Dec 30 '20 at 01:37
1

You can do it using LINQ as follows:

    string result = SentenceList
        .Select(s => (count: s.Split(' ').Count(w => w == mostPopular), sentence: s))
        .OrderByDescending(e => e.count)
        .First()
        .sentence;

This forms a tuple of the count and the sentence for each entry, sorts the tuples by count, takes the top entry (or however many entries you want), and then decomposes the tuple to get the sentence back.
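Since the question asks for a user-chosen number of sentences rather than only the top one, the same pipeline can be adapted by swapping First() for Take(n). The following is a minimal sketch, not part of the original answer: the class name SentenceRanker, the method TopSentences, and the sample sentences are hypothetical, and it assumes words are separated by single spaces and matched exactly (case-sensitive).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SentenceRanker
{
    // Hypothetical helper: returns up to n sentences, ordered by how many
    // space-separated words are exactly equal to `word`, highest count first.
    // Sentences with no occurrence of the word are dropped.
    public static List<string> TopSentences(IEnumerable<string> sentences, string word, int n)
    {
        return sentences
            .Select(s => (count: s.Split(' ').Count(w => w == word), sentence: s))
            .Where(e => e.count > 0)
            .OrderByDescending(e => e.count) // stable sort: ties keep input order
            .Take(n)                         // fewer than n results if not enough match
            .Select(e => e.sentence)
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample data for illustration only.
        var sentences = new List<string>
        {
            "the cat sat",
            "the dog and the cat and the bird",
            "no match here",
            "the the"
        };

        foreach (string s in TopSentences(sentences, "the", 2))
            Console.WriteLine(s);
    }
}
```

In a real program, n would come from the value the user types in (for example via int.TryParse on Console.ReadLine()).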

Ian Mercer
0

Not sure if this is what you are looking for, but the following counts the 'mostPopular' word in each string of the list and sorts the strings from highest count to lowest.

It creates, for each string, an entry in a new object that holds the string and the number of occurrences of the popular word in it. Once you have these objects, you can use LINQ to do the ordering and then the printing.

List<string> SentenceList = new List<string>()
{
    "This contains 1 mostPopular",
    "This contains 2 mostPopular mostPopular",
    "This contains 4 mostPopular mostPopular mostPopular mostPopular",
    "This contains 3 mostPopular mostPopular mostPopular"
};
            
var listOfPopular = SentenceList.Select(x => new { str = x, count = x.Split(' ').Where(z => z.Equals("mostPopular")).Count() });
Console.WriteLine(string.Join(Environment.NewLine, listOfPopular.OrderByDescending(x => x.count).Select(x => x.str)));

// Prints
This contains 4 mostPopular mostPopular mostPopular mostPopular
This contains 3 mostPopular mostPopular mostPopular
This contains 2 mostPopular mostPopular
This contains 1 mostPopular
Jawad
0

There are many ways to do this; however, there are also many ways to get it wrong, depending on your needs.

Maybe this is a job for regex and word boundaries \b.

The word boundary \b matches positions where one side is a word character (usually a letter, digit or underscore) and the other side is not a word character (for instance, it may be the beginning of the string or a space character).

var mostPopular = "bob";

var sentenceList = new List<string>()
{
   "Bob is awesome, we like bob bobbing and stuff",
   "This is just some line",
   "bob is great",
   "I like bobbing"
};

// create a nice compiled regex
var regex = new Regex(@$"\b{mostPopular}\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);

// get a list of sentences that contain your favorite word 
var searchResults = sentenceList.Where(s => regex.Match(s).Success);

// project the count and sentence to a tuple
var results = searchResults
   .Select(x => (Sentence: x, regex.Matches(x).Count))
   .OrderByDescending(x => x.Count); // order by the count

// print the results
foreach (var (sentence, count) in results)
   Console.WriteLine($"{count} : {sentence}");

Output

2 : Bob is awesome, we like bob bobbing and stuff
1 : bob is great

Note that "bob" is found only twice in the first sentence and not at all in the last, because "bobbing" is not a whole-word match. This may or may not matter to you.
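To make the difference concrete, here is a small sketch comparing the three counting strategies that appear in this thread on the first sample sentence. The class and method names are hypothetical, and the use of Regex.Escape is an addition not present in the answer's code; it guards against a search word that contains regex metacharacters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CountComparison
{
    // Counts space-separated tokens that contain the word as a substring
    // (case-sensitive), as in the question's inner loop.
    public static int CountBySubstring(string sentence, string word) =>
        sentence.Split(' ').Count(w => w.Contains(word));

    // Counts space-separated tokens exactly equal to the word (case-sensitive).
    public static int CountByEquality(string sentence, string word) =>
        sentence.Split(' ').Count(w => w == word);

    // Counts whole-word matches using \b word boundaries, ignoring case.
    public static int CountByRegex(string sentence, string word) =>
        Regex.Matches(sentence, $@"\b{Regex.Escape(word)}\b", RegexOptions.IgnoreCase).Count;

    public static void Main()
    {
        const string sentence = "Bob is awesome, we like bob bobbing and stuff";

        // "bob" and "bobbing" match; "Bob" differs in case.
        Console.WriteLine(CountBySubstring(sentence, "bob")); // 2
        // Only the token "bob" is an exact match.
        Console.WriteLine(CountByEquality(sentence, "bob"));  // 1
        // "Bob" and "bob" match as whole words; "bobbing" does not.
        Console.WriteLine(CountByRegex(sentence, "bob"));     // 2
    }
}
```

The substring and regex approaches happen to give the same total here, but they count different tokens, so which one is right depends on whether partial words and case differences should count.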

TheGeneral