2

So I've got this question.

Write a program that extracts from a text all sentences that contain a particular word. We accept that the sentences are separated from each other by the character "." and the words are separated from one another by a character which is not a letter.

Sample text:

We are living in a yellow submarine. We don't have anything else. Inside the submarine is very tight. So we are drinking all the day. We will move out of it in 5 days.

Sample result:

We are living in a yellow submarine.

We will move out of it in 5 days. 

This is my code so far.

public static string Extract(string str, string keyword)
    {

        string[] arr = str.Split('.');
        string answer = string.Empty;

        foreach(string sentence in arr)
        {
            var iter = sentence.GetEnumerator();
            while(iter.MoveNext())
            {
                if(iter.Current.ToString() == keyword)
                    answer += sentence;
            }
        }

        return answer;
    }

Well it does not work. I call it with this code:

string example = "We are living in a yellow submarine. We don't have anything else. Inside the submarine is very tight. So we are drinking all the day. We will move out of it in 5 days.";

string keyword = "in";
string answer = Extract(example, keyword);
Console.WriteLine(answer);

which does not output anything. It's probably the iterator part since I'm not familiar with iterators.

Anyhow, the hint for the question says we should use split and IndexOf methods.

sertsedat
Mustafa

7 Answers

3

sentence.GetEnumerator() is returning a CharEnumerator, so you're examining each character in each sentence. A single character will never be equal to the string "in", which is why it isn't working. You'll need to look at each word in each sentence and compare with the term you're looking for.
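
To make the difference concrete, here is a minimal sketch, not the asker's code. The class name `EnumeratorDemo` and the helper `SplitIntoWords` are hypothetical names introduced for illustration, and the helper assumes the exercise's rule that any non-letter character separates words.

```csharp
using System;
using System.Linq;

public static class EnumeratorDemo
{
    // Hypothetical helper: splits a sentence into words, treating every
    // non-letter character found in it as a separator.
    public static string[] SplitIntoWords(string sentence)
    {
        char[] separators = sentence.Where(c => !char.IsLetter(c)).Distinct().ToArray();
        return sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string sentence = " We will move out of it in 5 days";

        // Enumerating a string yields one char at a time, never a whole word,
        // so a one-character string can never equal "in".
        foreach (char c in "in")
        {
            Console.WriteLine(c.ToString() == "in"); // False, for 'i' and for 'n'
        }

        // Comparing whole words does find the keyword.
        Console.WriteLine(SplitIntoWords(sentence).Contains("in")); // True
    }
}
```

Note that under this rule "don't" splits into "don" and "t", and the digit "5" is dropped because it is not a letter.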

Andrew Whitaker
2

Try:

public static string Extract(string str, string keyword)
{
    string[] arr = str.Split('.');
    string answer = string.Empty;

    foreach(string sentence in arr)
    {
        //Add any other required punctuation characters for splitting words in the sentence
        string[] words = sentence.Split(new char[] { ' ', ',' });
        if(words.Contains(keyword))
        {
            answer += sentence;
        }
    }

    return answer;
}
mclaassen
  • This answer might be more helpful with an explanation of what the cause of the problem was and how you fixed it, instead of just a code dump. – Michelle Jul 09 '14 at 18:20
  • Andrew Whitaker already posted answer with the why, my answer is more of a "here's how to actually do it". – mclaassen Jul 09 '14 at 18:25
  • This won't work if words are not split by spaces. Also, if one sentence has the keyword more than once, it will add it n times to the answer. – David Zhou Jul 09 '14 at 18:33
  • @DavidZhou Yes on the splitting by spaces. You're wrong though about it adding same sentence multiple times. – mclaassen Jul 09 '14 at 18:35
  • My mistake, for some reason I read a `foreach` in the `Contains` – David Zhou Jul 09 '14 at 18:36
  • @mclaassen I did not know that the Contains method works with arrays. This is pretty simple and elegant in my opinion :). Thanks – Mustafa Jul 09 '14 at 18:37
1

Your code goes through each sentence character by character using the iterator. Unless the keyword is a single-character word (e.g. "I" or "a") there will be no match.

One way of solving this is to use LINQ to check if a sentence has the keyword, like this:

foreach(string sentence in arr)
{
    if(sentence.Split(' ').Any(w => w == keyword))
        answer += sentence + ". ";
}

Demo on ideone.

Another approach would be using regular expressions to check for matches only on word boundaries. Note that you cannot use a plain Contains method, because doing so results in "false positives" (i.e. finding sentences where the keyword is embedded inside a longer word).

Another thing to note is the use of += for concatenation. This approach is inefficient, because many temporary throw-away objects get created. A better way of achieving the same result is using StringBuilder.
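
As a rough illustration of both points, the sketch below avoids the false positives of a plain Contains and builds the result with StringBuilder. It does not use a regular expression; instead it follows the question's hint and checks the characters around each IndexOf hit with char.IsLetter. The names `SentenceExtractor` and `ContainsWord` are hypothetical, and the match is assumed to be case-sensitive.

```csharp
using System;
using System.Text;

public static class SentenceExtractor
{
    // Returns true if keyword occurs in sentence with no letter directly
    // before or after it (i.e. as a whole word, not inside a longer word).
    public static bool ContainsWord(string sentence, string keyword)
    {
        if (string.IsNullOrEmpty(keyword)) return false;

        int index = sentence.IndexOf(keyword, StringComparison.Ordinal);
        while (index >= 0)
        {
            int end = index + keyword.Length;
            bool letterBefore = index > 0 && char.IsLetter(sentence[index - 1]);
            bool letterAfter = end < sentence.Length && char.IsLetter(sentence[end]);
            if (!letterBefore && !letterAfter) return true;

            // Keep searching after this embedded occurrence.
            index = sentence.IndexOf(keyword, index + 1, StringComparison.Ordinal);
        }
        return false;
    }

    // Collects matching sentences in a StringBuilder instead of using +=.
    public static string Extract(string text, string keyword)
    {
        var answer = new StringBuilder();
        foreach (string sentence in text.Split('.'))
        {
            if (ContainsWord(sentence, keyword))
            {
                if (answer.Length > 0) answer.Append(' ');
                answer.Append(sentence.Trim()).Append('.');
            }
        }
        return answer.ToString();
    }

    public static void Main()
    {
        string example = "We are living in a yellow submarine. We don't have anything else. Inside the submarine is very tight. So we are drinking all the day. We will move out of it in 5 days.";
        // Prints: We are living in a yellow submarine. We will move out of it in 5 days.
        Console.WriteLine(Extract(example, "in"));
    }
}
```

Because the comparison is ordinal, "Inside" does not count as containing "in"; a case-insensitive variant would need a different StringComparison value.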

Sergey Kalinichenko
1
string input = "We are living in a yellow submarine. We don't have anything else. Inside the submarine is very tight. So we are drinking all the day. We will move out of it in 5 days.";
var lookup = input.Split('.')
                .Select(s => s.Split().Select(w => new { w, s }))
                .SelectMany(x => x)
                .ToLookup(x => x.w, x => x.s);

foreach(var sentence  in lookup["in"])
{
    Console.WriteLine(sentence);
}
Sergey Berezovskiy
EZI
  • Well, this is a bit of overkill. I still did not get to lambdas or Linq (assuming that's what's going on here :) ) – Mustafa Jul 09 '14 at 18:28
1

I would split the input at the periods and then search each sentence for the given word.

string metin = "We are living in a yellow submarine. We don't have anything else. Inside the submarine is very tight. So we are drinking all the day. We will move out of it in 5 days.";
string[] metinDizisi = metin.Split('.');
string answer = string.Empty;

for (int i = 0; i < metinDizisi.Length; i++)
{
    if (metinDizisi[i].Contains(" in "))
    {
        answer += metinDizisi[i];
    }
}

Console.WriteLine(answer);
Valerij Dobler
Mr Yuksel
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community May 24 '22 at 00:31
0

You can use sentence.Contains(keyword) to check if the string has the word you are looking for.

public static string Extract(string str, string keyword)
    {
        string[] arr = str.Split('.');
        string answer = string.Empty;

        foreach(string sentence in arr)
            if(sentence.Contains(keyword))
                answer+=sentence;

        return answer;
    }
David Zhou
  • This won't work if a word contains the keyword as a substring. For example, if the sentence contained the word "explanation", it would be matched as containing the word "an" or "on". – mclaassen Jul 09 '14 at 18:22
  • Exactly like mclaassen said. This matches all the strings in arr because there's no way to distinguish between the word "in" and when it's a subset of another word like "anything" – Mustafa Jul 09 '14 at 18:26
  • right, maybe [this link](http://stackoverflow.com/questions/4131443/c-sharp-find-exact-match-in-string) could help – David Zhou Jul 09 '14 at 18:28
0

You could split on the period to get a collection of sentences, then filter those with a regex containing the keyword.

var results = example.Split('.')
    .Where(s => Regex.IsMatch(s, String.Format(@"\b{0}\b", keyword)));
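
A self-contained version of the same idea might look like the sketch below. The class name `RegexExtractor` is hypothetical, and the Regex.Escape call is an addition that is not in the snippet above: it is there on the assumption that the keyword could contain characters with a special meaning in a pattern, such as "+" or ".".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexExtractor
{
    public static string[] Extract(string text, string keyword)
    {
        // Regex.Escape keeps characters in the keyword from being read as regex syntax.
        string pattern = @"\b" + Regex.Escape(keyword) + @"\b";

        return text.Split('.')
            .Where(s => Regex.IsMatch(s, pattern))
            .Select(s => s.Trim() + ".") // restore the period removed by Split
            .ToArray();
    }

    public static void Main()
    {
        string example = "We are living in a yellow submarine. We don't have anything else. Inside the submarine is very tight. So we are drinking all the day. We will move out of it in 5 days.";
        foreach (string sentence in Extract(example, "in"))
        {
            Console.WriteLine(sentence);
        }
    }
}
```

One caveat: `\b` treats digits and underscores as word characters, so it is slightly stricter than the exercise's "anything that is not a letter separates words" rule.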
Mike Hixson