
I'm trying to write a simple program that takes a text file, converts all characters to lowercase, and removes all punctuation. My problem is that wherever there is a line break (a carriage return followed by a line feed, I believe), it is removed and no space is put in its place.

For example:

This is a
test sentence

becomes

This is atestsentence

The last word of the first line and the first word of the next line are joined.

This is my code:

    public static void ParseDocument(String FilePath, String title)
    {
        StreamReader reader = new StreamReader(FilePath);
        StreamWriter writer = new StreamWriter("C:/Users/Matt/Documents/"+title+".txt");

        int i;
        char previous=' ';
        while ((i = reader.Read())>-1)
        {
            char c = Convert.ToChar(i);
            if (Char.IsLetter(c) | ((c==' ') & reader.Peek()!=' ') | ((c==' ') & (previous!=' ')))
            {
                c = Char.ToLower(c);
                writer.Write(c);
            }
            previous = c;
        }

        reader.Close();
        writer.Close();
    }

It's a simple problem, but I can't think of a way of checking for a new line to insert a space. Any help is greatly appreciated.

Mr Lister
Matt
    You want line breaks to remain intact, is that it? In that case, don't just check for letters; check for carriage returns and line feeds as well. – Mr Lister Apr 16 '12 at 16:35
    obligatory remark about using `using()` around Reader and Writer. – H H Apr 16 '12 at 16:40

2 Answers


It depends a little on how you want to treat empty lines, but this might work:

 char c = Convert.ToChar(i);

 if (c == '\n')  
    c = ' ';     // pretend \n == ' ' and keep ignoring \r

 if (Char.IsLetter(c) | ((c==' ') & reader.Peek()!=' ') | ((c==' ') & (previous!=' ')))
 {
    ...
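
For reference, here is a minimal sketch of the whole loop with that change applied. It assumes the rest of the question's logic stays as it was; the class name `DocumentParser` and the `TextReader`/`TextWriter` parameters are hypothetical, chosen only so the loop can be tried on an in-memory string instead of a file.

```csharp
using System;
using System.IO;

public static class DocumentParser
{
    // Hypothetical helper: same loop as in the question, plus the '\n' check.
    public static void Parse(TextReader reader, TextWriter writer)
    {
        int i;
        char previous = ' ';
        while ((i = reader.Read()) > -1)
        {
            char c = Convert.ToChar(i);

            // Treat a line feed as a space; '\r' is neither a letter nor a
            // space, so the condition below still drops it.
            if (c == '\n')
                c = ' ';

            if (Char.IsLetter(c)
                || (c == ' ' && reader.Peek() != ' ')
                || (c == ' ' && previous != ' '))
            {
                writer.Write(Char.ToLower(c));
            }
            previous = c;
        }
    }
}
```

Passing a `StringReader` over the question's two-line sample and a `StringWriter` should yield the words separated by single spaces. A space directly before a line break can still produce two spaces in a row, which is the kind of edge case mentioned above.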

I do hope this is an exercise. In normal practice you would read a text file with System.IO.File.ReadAllLines() or System.IO.File.ReadLines().
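
A line-based version might look roughly like the sketch below. It is not a drop-in replacement for the question's method: the names `TextNormalizer`, `Normalize`, `inputPath` and `outputPath` are made up for illustration, and unlike the original loop it collapses every run of whitespace to one space and drops leading and trailing whitespace. The `using` block disposes the writer even if an exception is thrown.

```csharp
using System;
using System.IO;
using System.Text;

public static class TextNormalizer
{
    // Lowercases letters, drops everything that is neither a letter nor
    // whitespace, and collapses each run of whitespace to a single space.
    public static string Normalize(string text)
    {
        var sb = new StringBuilder(text.Length);
        bool pendingSpace = false;
        foreach (char c in text)
        {
            if (char.IsLetter(c))
            {
                // Emit the delayed space only between two words.
                if (pendingSpace && sb.Length > 0)
                    sb.Append(' ');
                pendingSpace = false;
                sb.Append(char.ToLowerInvariant(c));
            }
            else if (char.IsWhiteSpace(c))
            {
                pendingSpace = true;
            }
            // Punctuation, digits and other symbols are dropped.
        }
        return sb.ToString();
    }

    // Hypothetical wrapper: reads the input line by line and joins the
    // cleaned, non-empty lines with a single space.
    public static void ParseDocument(string inputPath, string outputPath)
    {
        using (var writer = new StreamWriter(outputPath))
        {
            bool first = true;
            foreach (string line in File.ReadLines(inputPath))
            {
                string cleaned = Normalize(line);
                if (cleaned.Length == 0)
                    continue;
                if (!first)
                    writer.Write(' ');
                writer.Write(cleaned);
                first = false;
            }
        }
    }
}
```

Because `File.ReadLines` already splits on line breaks, the code never sees `\r` or `\n` and only has to decide what to put between two lines.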

H H

Try

myString.Replace(Environment.NewLine, "replacement text")

Replace Line Breaks in a String C#
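
One caveat: `Environment.NewLine` is `"\r\n"` on Windows and `"\n"` on Unix-like systems, so a file written on another platform may not match it. A hedged sketch that handles the common line-break styles regardless of platform follows; the class and method names are hypothetical.

```csharp
using System;

public static class LineBreaks
{
    // Normalizes "\r\n" and lone "\r" to "\n" first, then replaces each
    // "\n" with the given text, so a Windows line break counts only once.
    public static string ReplaceLineBreaks(string text, string replacement)
    {
        return text.Replace("\r\n", "\n")
                   .Replace("\r", "\n")
                   .Replace("\n", replacement);
    }
}
```

For the question's case you would pass `" "` as the replacement, for example `LineBreaks.ReplaceLineBreaks(myString, " ")`, before or after stripping punctuation.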

Community
elrado