
I'm a newbie in C#, but I've done some research on reading/writing large txt files. The biggest might be 8 GB, but if that is too much I will consider splitting it, maybe into 1 GB files. It needs to be fast, up to 30 MB/s. I've found three approaches: FileStream or StreamReader/StreamWriter for sequential access, and MemoryMappedFiles for random access. First I'd like to read the file. Here is an example of code that works:

FileStream fileStream = new FileStream(@"C:\Users\Guest4\Desktop\data.txt", FileMode.Open, FileAccess.Read);
try
{
    int length = (int)fileStream.Length;   // get file length
    byte[] buffer = new byte[length];      // create buffer
    int count;                             // actual number of bytes read
    int sum = 0;                           // total number of bytes read

    // read until Read method returns 0 (end of the stream has been reached)
    while ((count = fileStream.Read(buffer, sum, length - sum)) > 0)
        sum += count;  // sum is the buffer offset for the next read
}
finally
{
    fileStream.Close();
}

Do you think this is a good way to read big files fast?

After reading, I need to resend that file. It must go out in 16384-byte chunks, and every chunk will be sent until all the data has been transmitted. Those chunks have to be of string type. Could you suggest how to do it, i.e. how to split the data and convert it to strings? I suppose the best way is to send each string chunk not after reading the whole file, but as soon as those 16384 bytes have been read.
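Something like the following is the shape I have in mind: a minimal sketch, not a finished implementation. `sendChunk` is a hypothetical placeholder for whatever actually transmits the data, and Base64 is used for the byte-to-string conversion because a text encoding such as UTF-8 can split a multi-byte character across chunk boundaries (if the file is plain ASCII text, `Encoding.ASCII.GetString` would keep each chunk at exactly 16384 characters):

using System;
using System.IO;

class ChunkSender
{
    const int ChunkSize = 16384;

    // sendChunk is a hypothetical stand-in for the actual transport
    static void SendFileInChunks(string path, Action<string> sendChunk)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            var buffer = new byte[ChunkSize];
            int read;
            // Read returns 0 only at end of stream; it can return fewer bytes
            // than requested, so always use the returned count
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
            {
                string chunk = Convert.ToBase64String(buffer, 0, read);
                sendChunk(chunk);  // transmit right away instead of buffering the whole file
            }
        }
    }
}

This way at most one 16384-byte buffer is in memory at a time, which also sidesteps the 8 GB file size entirely.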

  • First try `foreach (var line in File.ReadLines("data.txt")) { .... }` If you're not satisfied, you can try other alternatives – L.B Oct 23 '12 at 19:53
  • 1 GB chunk size is too big. This might help you: http://stackoverflow.com/questions/8879301/performance-of-copying-a-file-with-fread-fwrite-to-usb/8879583#8879583 – Shiplu Mokaddim Oct 23 '12 at 19:57
  • 1. That `foreach` loop seems to be OK if it is fast enough, but I get an error: 'System.IO.File' does not contain a definition for 'ReadLine'. I use VS2010 and .NET 4.5. But how do I take exactly 16384 bytes? Should every line have that size? I can use StreamReader like `while ((line = sr.ReadLine()) != null)` instead of `File.ReadLines`, but the biggest problem for me is the splitting. 2. I may have been misunderstood: not 1 GB chunks. I mean that if I have 4 GB of data, I could keep it in four 1 GB files and work with the first; when the first is finished, work with the second... – bLAZ Oct 24 '12 at 07:45
  • `System.IO.File` does not contain a definition for `ReadLine`; it is `StreamReader` that has `ReadLine()`. You need to create a reader and attach your file to it (see the sketch after these comments). – ROBERT RICHARDSON Feb 10 '14 at 15:57
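To make the distinction from these comments concrete, here is a minimal sketch of both line-reading approaches (`File.ReadLines`, plural, streams lines lazily and has existed since .NET 4.0; there is no `File.ReadLine`):

using System.IO;

class LineReading
{
    static void Main()
    {
        // File.ReadLines (note the plural) streams lines lazily:
        foreach (var line in File.ReadLines(@"C:\Users\Guest4\Desktop\data.txt"))
        {
            // process one line at a time
        }

        // StreamReader.ReadLine is the equivalent manual loop:
        using (var sr = new StreamReader(@"C:\Users\Guest4\Desktop\data.txt"))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                // process one line at a time
            }
        }
    }
}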

1 Answer


I've found something like this:

FileStream FS = new FileStream(@"C:\Users\Guest4\Desktop\data.txt", FileMode.Open, FileAccess.Read);
int FSBytes = (int)FS.Length;
int ChunkSize = 1 << 14;   // 16384
byte[] B = new byte[ChunkSize];
int Pos;

for (Pos = 0; Pos < FSBytes - ChunkSize; Pos += ChunkSize)
{
    // Read may return fewer bytes than requested, so loop until the chunk is full
    int got = 0;
    while (got < ChunkSize)
        got += FS.Read(B, got, ChunkSize - got);
    // do some operation on one chunk
}

B = new byte[FSBytes - Pos];   // the last, possibly smaller chunk
int tail = 0;
while (tail < B.Length)
    tail += FS.Read(B, tail, B.Length - tail);
// here the last operation on the last chunk
FS.Dispose();   // Dispose() also closes the stream, so Close() plus Dispose() is redundant

It seems to work. I only hope that the `FS.Read` calls will be really fast. If anyone has anything to suggest, please do not hesitate.
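If raw read speed is the worry, one way to check it against the ~30 MB/s target from the question is to time the same kind of loop with a `Stopwatch`; a minimal sketch, reusing the path from the question:

using System;
using System.Diagnostics;
using System.IO;

class ReadBenchmark
{
    static void Main()
    {
        const int ChunkSize = 1 << 14;  // 16384
        var buffer = new byte[ChunkSize];
        long total = 0;
        var sw = Stopwatch.StartNew();

        using (var fs = new FileStream(@"C:\Users\Guest4\Desktop\data.txt",
                                       FileMode.Open, FileAccess.Read))
        {
            int read;
            while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
                total += read;   // just count bytes; no per-chunk work
        }

        sw.Stop();
        double mbPerSec = total / (1024.0 * 1024.0) / sw.Elapsed.TotalSeconds;
        Console.WriteLine("Read {0} bytes at {1:F1} MB/s", total, mbPerSec);
    }
}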
