0

I want to remove stop words from my text file, and I wrote the following code for this purpose:

 TextWriter tw = new StreamWriter("D:\\output.txt");
 private void button1_Click(object sender, EventArgs e)
        {
            StreamReader reader = new StreamReader("D:\\input1.txt");
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] parts = line.Split(' ');
                string[] stopWord = new string[] { "is", "are", "am","could","will" };
                foreach (string word in stopWord)
                {
                    line = line.Replace(word, "");
                    tw.Write("+"+line);
                }
                tw.Write("\r\n");
            } 

But nothing is written to the output file; it remains empty.

Zia Ur Rehman

4 Answers

6

A regular expression might be perfect for the job:

        // Verbatim string: in a regular string literal "\b" is a backspace, not a word boundary.
        Regex replacer = new Regex(@"\b(?:is|are|am|could|will)\b");
        using (TextWriter writer = new StreamWriter("C:\\output.txt"))
        {
            using (StreamReader reader = new StreamReader("C:\\input.txt"))
            {
                while (!reader.EndOfStream)
                {
                    string line = reader.ReadLine();
                    // Replace returns a new string, so assign the result back.
                    line = replacer.Replace(line, "");
                    writer.WriteLine(line);
                }
            }
            writer.Flush();
        }

This method replaces only whole words with an empty string and leaves the stop words untouched when they are part of another word.
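To illustrate the whole-word behaviour, here is a minimal sketch of the same pattern applied to a single string. The class name StopWordRegexDemo, the Clean helper, and the extra whitespace-collapsing step are assumptions added for illustration; they are not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StopWordRegexDemo
{
    // \b is a word boundary, so "is" inside "This" and "will" inside "willing" do not match.
    private static readonly Regex StopWords =
        new Regex(@"\b(?:is|are|am|could|will)\b");

    // Assumption: collapse the runs of whitespace left behind by removed words.
    private static readonly Regex ExtraSpaces = new Regex(@"\s{2,}");

    // Hypothetical helper: removes whole-word stop words from one line.
    public static string Clean(string line)
    {
        string withoutStopWords = StopWords.Replace(line, "");
        return ExtraSpaces.Replace(withoutStopWords, " ").Trim();
    }

    public static void Main()
    {
        // prints "This a willing team"
        Console.WriteLine(Clean("This is a willing team"));
    }
}
```

Note that the match is case-sensitive as written; passing RegexOptions.IgnoreCase to the Regex constructor would also catch "Is" or "Are".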

Good luck with your quest.

Casperah
2

The following works as expected for me. However, it's not a good approach because it will remove the stop words even when they are part of a larger word. Also, it doesn't clean up extra spaces between removed words.

string[] stopWord = new string[] { "is", "are", "am","could","will" };

TextWriter writer = new StreamWriter("C:\\output.txt");
StreamReader reader = new StreamReader("C:\\input.txt");

string line;
while ((line = reader.ReadLine()) != null)
{
    foreach (string word in stopWord)
    {
        line = line.Replace(word, "");
    }
    writer.WriteLine(line);
}
reader.Close();
writer.Close();

Also, I recommend creating your streams inside using statements to ensure the files are closed in a timely manner.
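One possible way to address both drawbacks mentioned above (partial-word matches and leftover spaces) is to split each line into tokens and drop the tokens that are stop words. The following is only a sketch: the class name StopWordFilter, the Filter method, and the case-insensitive comparison are assumptions, and it uses StringReader/StringWriter so it runs without touching the disk. With real files you would pass a StreamReader and StreamWriter, each created in a using statement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StopWordFilter
{
    // Assumption: compare case-insensitively so "Is" and "is" are both removed.
    private static readonly HashSet<string> StopWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "is", "are", "am", "could", "will" };

    // Hypothetical helper: copies reader to writer, skipping stop-word tokens.
    public static void Filter(TextReader reader, TextWriter writer)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            var kept = new List<string>();
            // Splitting on spaces means only whole tokens are compared.
            foreach (string token in line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                if (!StopWords.Contains(token))
                {
                    kept.Add(token);
                }
            }
            // Joining with a single space avoids doubled spaces.
            writer.WriteLine(string.Join(" ", kept));
        }
    }

    public static void Main()
    {
        using (var reader = new StringReader("I am sure this is a willing team"))
        using (var writer = new StringWriter())
        {
            Filter(reader, writer);
            // prints "I sure this a willing team"
            Console.Write(writer.ToString());
        }
    }
}
```

A token with attached punctuation, such as "is," or "will.", is not recognised by this sketch; handling that would need extra trimming or a regular expression.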

Jonathan Wood
  • @jonathan: Sir, this code does not work properly; I want to remove stop words from a text file. – Zia Ur Rehman Mar 14 '13 at 20:02
  • 2
    This is stackoverflow, which is for asking technical questions. The code I posted fixed the errors in your code. To provide additional help to you, I also explained some problems with the approach you are taking. If you have other problems, you might want to post another question. But I strongly recommend you learn to be far more specific than "not working properly", which tells me absolutely nothing about the problem you are having. – Jonathan Wood Mar 14 '13 at 20:15
1

You should wrap your IO objects in using statements so that they are disposed properly.

using (TextWriter tw = new StreamWriter("D:\\output.txt"))
{
    using (StreamReader reader = new StreamReader("D:\\input1.txt"))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] parts = line.Split(' ');
            string[] stopWord = new string[] { "is", "are", "am","could","will" };
            foreach (string word in stopWord)
            {
                line = line.Replace(word, "");
                tw.Write("+"+line);
            }
        }
    }
}
FlyingStreudel
0

Try wrapping StreamWriter and StreamReader in using() {} clauses.

using (TextWriter tw = new StreamWriter(@"D:\output.txt"))
{
  ...
}

You may also want to call tw.Flush() at the very end.

Servy
Todd Sprang