1

What is the easiest way to read a file character by character in C#?

Currently, I am reading line by line by calling System.io.file.ReadLine(). I see that there is a Read() function, but it doesn't return a character...

I would also like to know how to detect the end of a line using such an approach. The input file in question is a CSV file.

user559142

2 Answers

5

Open a TextReader (e.g. by File.OpenText - note that File is a static class, so you can't create an instance of it) and repeatedly call Read. That returns int rather than char so it can also indicate end of file:

int readResult = reader.Read();
if (readResult != -1)
{
    char nextChar = (char) readResult;
    // ...
}

Or to loop:

int readResult;
while ((readResult = reader.Read()) != -1)
{
    char nextChar = (char) readResult;
    // ...
}

Or for more funky goodness:

public static IEnumerable<char> ReadCharacters(string filename)
{
    using (var reader = File.OpenText(filename))
    {
        int readResult;
        while ((readResult = reader.Read()) != -1)
        {
            yield return (char) readResult;
        }
    }
}

...

foreach (char c in ReadCharacters("foo.txt"))
{
    // ...
}

Note that by default, File.OpenText uses UTF-8. Specify an encoding explicitly if that isn't what you want.
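
As a rough sketch of what specifying the encoding could look like, here is the same iterator built on the StreamReader(string, Encoding) constructor instead of File.OpenText. The class name EncodingExample and the extra parameter are mine, purely for illustration:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class EncodingExample
{
    // Hypothetical helper: like ReadCharacters, but the caller chooses the encoding.
    public static IEnumerable<char> ReadCharacters(string filename, Encoding encoding)
    {
        // This constructor still honours a byte order mark at the start of
        // the file if one is present; otherwise it uses the given encoding.
        using (var reader = new StreamReader(filename, encoding))
        {
            int readResult;
            while ((readResult = reader.Read()) != -1)
            {
                yield return (char) readResult;
            }
        }
    }
}
```

You would call it as EncodingExample.ReadCharacters("foo.txt", Encoding.Unicode) for a UTF-16 (little-endian) file, for instance.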

EDIT: To find the end of a line, you'd check whether the character is \n... you'd potentially want to handle \r specially too, if this is a Windows text file.
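
A minimal sketch of that check, assuming you want \n, \r\n and a lone \r all to count as one line terminator. The names LineEndExample and LineLengths are made up for this example; it only reports the length of each line so that nothing but a counter is kept in memory:

```csharp
using System.Collections.Generic;
using System.IO;

public static class LineEndExample
{
    // Hypothetical helper: reads character by character and yields the
    // number of characters in each line once its terminator is seen.
    public static IEnumerable<int> LineLengths(TextReader reader)
    {
        int count = 0;
        int readResult;
        while ((readResult = reader.Read()) != -1)
        {
            char c = (char) readResult;
            if (c == '\r')
            {
                // Windows files use \r\n: swallow the \n so the pair
                // counts as a single line terminator.
                if (reader.Peek() == '\n')
                {
                    reader.Read();
                }
                yield return count;
                count = 0;
            }
            else if (c == '\n')
            {
                yield return count;
                count = 0;
            }
            else
            {
                count++;
            }
        }
        // Last line when the file doesn't end with a terminator.
        if (count > 0)
        {
            yield return count;
        }
    }
}
```

In real code you'd replace the counter with whatever per-character work you need (e.g. building up the current CSV token) and treat the two "yield return" points as "end of line reached".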

But if you want each line, why not just call ReadLine? You can always iterate over the characters in the line afterwards...
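
For instance, something along these lines, using File.ReadLines (available from .NET 4), which reads lazily so only one line is held at a time. CountCommas is just a stand-in for whatever per-character work you actually need:

```csharp
using System.IO;

public static class LineThenCharExample
{
    // Hypothetical helper: streams the file one line at a time, then
    // walks the characters of each line.
    public static int CountCommas(string filename)
    {
        int commas = 0;
        foreach (string line in File.ReadLines(filename))
        {
            // The line terminator has already been stripped here, so
            // reaching the end of this inner loop is the end of the line.
            foreach (char c in line)
            {
                if (c == ',')
                {
                    commas++;
                }
            }
        }
        return commas;
    }
}
```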

Jon Skeet
  • I assume the OP wants to iterate _all_ characters in the file ("_read a file character by character_"). – Tim Schmelter Mar 08 '12 at 10:52
  • @TimSchmelter: So they'd need to loop until it returned -1... will edit with more code, but I think this really gives enough information. – Jon Skeet Mar 08 '12 at 10:57
  • How would I detect the end of a line, though? – user559142 Mar 08 '12 at 11:56
  • @user559142: You'd have to do that yourself, based on line terminators. Your question doesn't mention anything about lines, other than that you *don't* want `ReadLine`. If you want more specific information, ask a more specific question. – Jon Skeet Mar 08 '12 at 11:57
  • I have added it to my question! – user559142 Mar 08 '12 at 12:03
  • @user559142: Answer duly edited - but if you're really line-oriented, why don't you just read each line and then examine it? – Jon Skeet Mar 08 '12 at 12:09
  • I want to reduce the effect of limitations that client workstations could potentially exhibit, in this case memory. The CSV files could potentially be huge, and it seemed more efficient to store token by token as opposed to reading entire lines and then reprocessing them... – user559142 Mar 09 '12 at 16:06
  • @user559142: Do you really think an *individual line* is going to be of a significant size? Bearing in mind modern memory, are you *really* going to have lines which you think will take more than (say) 1MB of memory? This sounds like a classic example of premature optimization. (Reading 1 line != reading whole file.) – Jon Skeet Mar 09 '12 at 16:07
0

Here is a snippet from MSDN:

using (StreamReader sr = new StreamReader(path))
{
    char[] c = null;

    while (sr.Peek() >= 0)
    {
        c = new char[1];
        sr.Read(c, 0, c.Length);

        // do something with c[0]
    }
}
Martha
  • I know it's not your code, but it's a horrible approach IMO. Why do this when `Read()` does exactly what's required? – Jon Skeet Mar 08 '12 at 11:00