I have an app that reads a huge text file (more than 100 MB) line by line.

Because reading the file takes so long, I'd like to add a progress bar to the StatusStrip at the bottom of the window, indicating how much time remains until loading finishes.

I tried to compare the file length to the length of the strings being read, but I don't get the same result. I also tried to convert the string to bytes, but it still differs, for example:

while (!sr.EndOfStream)
{
   s = sr.ReadLine();   // ReadLine is a method; the line terminator is not included in s
   TotalStringSize += s.Length;
   UTF8ToASCII += UTF8Encoding.ASCII.GetByteCount(s);
   UTF8ToBigEndianUnicode += UTF8Encoding.BigEndianUnicode.GetByteCount(s);
   UTF8ToDefault += UTF8Encoding.Default.GetByteCount(s);
   UTF8ToUnicode += UTF8Encoding.Unicode.GetByteCount(s);
   UTF8ToUTF32 += UTF8Encoding.UTF32.GetByteCount(s);
   UTF8ToUTF7 += UTF8Encoding.UTF7.GetByteCount(s);
   UTF7ToASCII += UTF7Encoding.ASCII.GetByteCount(s);   // accumulate like the other counters
   //
   // ...
   //
}

The results I get are either higher or lower than the result given by System.IO.FileStream.Length. Any idea?

EDIT: The framework used is .NET 2.0

GianT971
  • Unicode characters are stored in 2 bytes whereas normal characters are stored in only one byte. It all depends on the type of encoding your file is in – GETah Jun 27 '12 at 08:55
  • Isn't it a possibility to use sr.Position compared to the sr.Length to display the progress? – Me.Name Jun 27 '12 at 08:57
  • Is it necessary to show the (already loaded) contents of the file while loading? – stombeur Jun 27 '12 at 09:00
  • @Me.Name is there such method (sr.Length)? I didn't write it but sr is a streamReader – GianT971 Jun 27 '12 at 09:02
  • @StephaneT No, just the progress of the loading – GianT971 Jun 27 '12 at 09:02
  • Is it an option to use StreamReader.Read(buffer, index, count) in fixed increments that are multiples of 2 and converting the bytes to string as you go? – stombeur Jun 27 '12 at 09:09
  • @GianT971 So it is, I assumed sr was a FileStream based on the text underneath the code example, but seeing sr uses readline, I could have known it was a streamreader. In that case: sr.BaseStream.Position / sr.BaseStream.Length (available depending on the type of stream used) – Me.Name Jun 27 '12 at 09:11
  • @Me.Name yeah, I just saw that baseStream stuff, but as I use sr.ReadLine, sr.BaseStream.Position does not increase, it sticks to 1024. But that is probably a good clue to achieve my goal – GianT971 Jun 27 '12 at 09:20
  • @StephaneT Not really, as the app has to deal with the information brought by each line, line by line – GianT971 Jun 27 '12 at 09:23
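
The BaseStream idea from the comments above could be sketched roughly as follows. This is only an illustration, not code from the thread: the class `StreamProgress` and the method `PercentRead` are hypothetical names, and it assumes the underlying stream is seekable (as a FileStream normally is). It avoids newer C# syntax so that it should also compile for .NET 2.0.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original question.
public static class StreamProgress
{
    // Returns how much of the underlying stream has been consumed, as 0..100.
    // StreamReader reads ahead in buffer-sized chunks, so the value advances
    // in steps rather than after every ReadLine() call.
    public static int PercentRead(StreamReader reader)
    {
        Stream baseStream = reader.BaseStream;
        if (!baseStream.CanSeek || baseStream.Length == 0)
        {
            return 0; // position and length are unavailable or meaningless
        }
        return (int)(baseStream.Position * 100 / baseStream.Length);
    }
}
```

Inside the reading loop, the result could be assigned to the progress bar's Value. Because the reader buffers its input, the position moves in chunks, which explains why it appears to stay at the same value for a while; for a file of 100 MB or more that granularity should be fine enough for a progress bar.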

1 Answer

sr.ReadLine strips the carriage return/line feed pair from the string it returns.
You need to account for these characters, which are missing from the returned string but are included in the overall file length.
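
A minimal sketch of that bookkeeping, assuming the file uses one known encoding and one known line terminator throughout. The class `LineProgress`, the method `EstimateBytesRead` and its parameters are hypothetical names introduced for illustration only. It avoids newer C# syntax so that it should also compile for .NET 2.0.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class LineProgress
{
    // Re-encodes every line and adds back the terminator that ReadLine()
    // removed, giving a running byte count comparable to the stream length.
    public static long EstimateBytesRead(Stream input, Encoding encoding, string newLine)
    {
        long bytesRead = 0;
        int newLineBytes = encoding.GetByteCount(newLine); // 2 for "\r\n" in ASCII or UTF-8
        using (StreamReader sr = new StreamReader(input, encoding))
        {
            string s;
            while ((s = sr.ReadLine()) != null)
            {
                bytesRead += encoding.GetByteCount(s) + newLineBytes;
                // A progress value could be derived here from bytesRead
                // and the total length of the file.
            }
        }
        return bytesRead;
    }
}
```

The count is an estimate: a byte order mark at the start of the file is not included, and a final line without a terminator is over-counted by one terminator, so a percentage derived from it is best clamped to 100.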

Steve
  • Nice, I just have to add those 2 characters on each loop and it does the job. Thanks! – GianT971 Jun 27 '12 at 09:29
  • good point. Is it possible that OP also has to compare with FileInfo.Length instead of stream length as stated here: http://stackoverflow.com/questions/5959983/how-to-check-logical-and-physical-file-size-on-disk-using-c-sharp-file-api ? – stombeur Jun 27 '12 at 09:30