
I'm trying to implement a file copy method that can match the performance of a copy done with Windows Explorer.

For example, a copy (with Windows Explorer) from our NAS to my computer runs at above 100 MB/sec.

My current implementation does the same copy at about 55 MB/sec, which is already better than System.IO.File.Copy(), which runs at 29 MB/sec.

    using System;
    using System.IO;
    using System.Threading;

    static void Main(string[] args)
    {
        String src = @""; // source path (omitted)
        String dst = @""; // destination path (omitted)

        Int32 buffersize = 1024 * 1024; // 1 MB double buffers

        // 8-byte internal FileStream buffers: all real buffering is done manually below.
        FileStream input = new FileStream(src, FileMode.Open, FileAccess.Read, FileShare.None, 8, FileOptions.Asynchronous | FileOptions.SequentialScan);
        FileStream output = new FileStream(dst, FileMode.CreateNew, FileAccess.Write, FileShare.None, 8, FileOptions.Asynchronous | FileOptions.SequentialScan);

        Int32 readsize;
        Byte[] readbuffer = new Byte[buffersize];
        IAsyncResult asyncread;
        Byte[] writebuffer = new Byte[buffersize];
        IAsyncResult asyncwrite;

        DateTime Start = DateTime.Now;

        // Pre-allocate the destination file.
        output.SetLength(input.Length);

        // Prime the pipeline: read the first block, then swap it into the write slot.
        readsize = input.Read(readbuffer, 0, readbuffer.Length);
        readbuffer = Interlocked.Exchange(ref writebuffer, readbuffer);

        while (readsize > 0)
        {
            // Overlap the write of the previous block with the read of the next one.
            asyncwrite = output.BeginWrite(writebuffer, 0, readsize, null, null);
            asyncread = input.BeginRead(readbuffer, 0, readbuffer.Length, null, null);

            output.EndWrite(asyncwrite);
            readsize = input.EndRead(asyncread);
            readbuffer = Interlocked.Exchange(ref writebuffer, readbuffer); // swap the two buffers
        }

        DateTime Stop = DateTime.Now;

        TimeSpan Duration = Stop - Start;
        Double speed = input.Length / Duration.TotalSeconds; // bytes/s

        System.Console.WriteLine("MY Speed : " + (speed / 1024 / 1024).ToString() + " MB/sec");

        input.Close();
        output.Close();
        System.IO.File.Delete(dst);
    }

Any idea how to improve the performance?

EDIT :

The file is read from a Linux-based NAS with a 10 Gigabit Ethernet interface and a 60-drive SAN behind it (don't worry about its performance, it works very well), and written to a local RAID 0 array which can write data at about 140 MB/sec.

The bottleneck is the destination's gigabit network interface, which I'm unable to saturate with my current code.

Also, removing the write does not make the read any faster, so I can't get past this 55 MB/sec read limit.

EDIT 2 :

The speed issue is related to the fact that the source file is stored on a network share. Reading from my local drive with the same code gives me 112 MB/sec.

EDIT 3 :

Samba doesn't seem to be the issue. I replaced the CIFS share (Samba) with an NFS share on my Linux NAS and got worse results than with Samba on my Windows 7 client.

With NFS, my copy method and Windows Explorer had the same performance, around 42 MB/sec.

I'm out of ideas...

EDIT 4 :

Just to be sure Windows was the issue, I installed Debian Lenny, mounted my NAS through NFS and got 79 MB/sec with the same code under Mono.

Altar
  • Windows Explorer may also be exploiting the file system cache, showing a 100 MB/sec transfer while the kernel has not yet written the entire file to disk, doing so in the background after the operation completes. How fast can you read the entire file into a MemoryStream? – maxwellb Jul 06 '10 at 11:58
  • Is it really 100 **mbps** or 100 **MBps** (big difference there)? If it is 100 MBps, it is a pretty fast NAS. Also use `System.Diagnostics.Stopwatch` to measure elapsed time accurately. – Jaroslav Jandek Jul 06 '10 at 12:06
  • 1) No way. The file size is 19 GB and the unit has only 8 GB of memory. Also, the destination RAID 0 can take much more than 100 MB/sec; the bottleneck here is the gigabit network interface. 2) It's a very fast NAS :) According to benchmarks we have done, it can deliver between 1200 and 1400 MB per second in sequential read. – Altar Jul 06 '10 at 12:38

6 Answers


Try changing the buffer size to equal the sector size on the hard disk - likely 4 KB. Also use the System.Diagnostics.Stopwatch class for timing.

I also wouldn't bother using the async methods in a tight loop - they incur some overhead going away and allocating a thread from the pool to do the work.

Also, make use of the using statement for managing the disposal of your streams. Note, however, that this will skew your timing, as you are currently disposing of the objects after stopping the timer.
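
A minimal sketch of those suggestions combined (synchronous loop, `using` blocks, `Stopwatch`); the 4 KB buffer and the method name are illustrative values, not a tested drop-in:

    using System;
    using System.Diagnostics;
    using System.IO;

    static void SyncCopy(string src, string dst)
    {
        Stopwatch timer = Stopwatch.StartNew();

        // using blocks dispose the streams even if an exception is thrown.
        using (var input = new FileStream(src, FileMode.Open, FileAccess.Read))
        using (var output = new FileStream(dst, FileMode.CreateNew, FileAccess.Write))
        {
            var buffer = new byte[4096]; // one sector-sized block
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
        }

        timer.Stop(); // the streams are disposed by now, so the final flush is included
        Console.WriteLine("Elapsed: " + timer.Elapsed.TotalSeconds + " s");
    }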

Adam Houldsworth
  • Small multiples of sector sizes also work well... Testing anything from 1x to 16x might show a significant gain in one of them. – codekaizen Jul 06 '10 at 11:05
  • Doing only a read synchronously didn't provide any better performance. – Altar Jul 06 '10 at 13:12
  • I'm not sure reading from the stream and writing to another stream asynchronously at the same time is any faster either - you'd be lumbering the I/O with two tasks at the same time. For smaller files, have you tried reading the lot into RAM and then writing? Also, you can write in larger chunks or all at once, so buffering isn't strictly required. – Adam Houldsworth Jul 06 '10 at 13:32

Did you try smaller buffer sizes? A buffer size of 1 MB is awfully huge; normally, buffer sizes of 4-64 KB give you the best performance.

Also, this may be related to your question: How to write super-fast file-streaming code in C#?

And maybe you can improve performance using memory mapped files: http://weblogs.asp.net/gunnarpeipman/archive/2009/06/21/net-framework-4-0-using-memory-mapped-files.aspx
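
For reference, a minimal sketch of the memory-mapped approach with the .NET 4.0 API (`src` and `dst` are placeholder paths; whether this beats plain streams over a network share would need measuring):

    using System.IO;
    using System.IO.MemoryMappedFiles;

    static void MmapCopy(string src, string dst)
    {
        using (var mmf = MemoryMappedFile.CreateFromFile(src, FileMode.Open))
        using (var view = mmf.CreateViewStream()) // maps the entire file
        using (var output = new FileStream(dst, FileMode.Create, FileAccess.Write))
        {
            view.CopyTo(output); // Stream.CopyTo is also new in .NET 4.0
        }
    }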

Janick Bernet
  • Using a 64 KB buffer size makes the copy go slower, about 46 MB/sec. – Altar Jul 06 '10 at 12:53
  • So upping it to 2 MB, 4 MB, or more increases performance further? Does performance increase from 64 to 128? Maybe there is a sweet spot between 64 KB and 1 MB where performance is optimal? – Janick Bernet Jul 06 '10 at 13:19
  • After 256 KB there doesn't seem to be any increase in performance; it's always around 60 MB/sec. – Altar Jul 06 '10 at 15:31

There are the usual suspects for increasing speed over a network:

  • Have multiple download threads (see the sketch below)
  • Pre-allocate the block on the disk where the file will reside

Other than that, you're at the mercy of your hardware limitations.
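
A rough sketch of both ideas combined, assuming the source share tolerates concurrent range reads; `chunkCount` and the 256 KB buffer are illustrative values, not measured ones:

    using System;
    using System.IO;
    using System.Threading.Tasks;

    static class ParallelCopy
    {
        public static void Copy(string src, string dst, int chunkCount)
        {
            long length = new FileInfo(src).Length;

            // Pre-allocate the destination so workers can write at any offset.
            using (var fs = new FileStream(dst, FileMode.Create, FileAccess.Write))
                fs.SetLength(length);

            long chunkSize = (length + chunkCount - 1) / chunkCount;

            Parallel.For(0, chunkCount, i =>
            {
                long offset = i * chunkSize;
                long remaining = Math.Min(chunkSize, length - offset);
                if (remaining <= 0) return;

                var buffer = new byte[256 * 1024];
                // Each worker gets its own stream pair, positioned at its slice.
                using (var input = new FileStream(src, FileMode.Open, FileAccess.Read, FileShare.Read))
                using (var output = new FileStream(dst, FileMode.Open, FileAccess.Write, FileShare.Write))
                {
                    input.Seek(offset, SeekOrigin.Begin);
                    output.Seek(offset, SeekOrigin.Begin);
                    while (remaining > 0)
                    {
                        int read = input.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining));
                        if (read == 0) break;
                        output.Write(buffer, 0, read);
                        remaining -= read;
                    }
                }
            });
        }
    }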

codekaizen
  • Multiple download threads would only work if the bottleneck was not the I/O, which it likely is. – Adam Houldsworth Jul 06 '10 at 11:06
  • @Adam - it could be the network, in which case it would be able to increase perf slightly. As with anything perf-related, though, it's about testing and measuring the various techniques. – codekaizen Jul 06 '10 at 11:11

File.Copy() simply calls the CopyFile() API. You could try P/Invoking SHFileOperation(), which is what the shell uses - it often seems faster.
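
A minimal P/Invoke sketch of that idea; the from/to paths must be double-null-terminated, and the struct layout below assumes a 64-bit process (32-bit Windows packs SHFILEOPSTRUCT differently):

    using System;
    using System.Runtime.InteropServices;

    static class ShellCopy
    {
        [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
        struct SHFILEOPSTRUCT
        {
            public IntPtr hwnd;
            public uint wFunc;
            public string pFrom;
            public string pTo;
            public ushort fFlags;
            public bool fAnyOperationsAborted;
            public IntPtr hNameMappings;
            public string lpszProgressTitle;
        }

        const uint FO_COPY = 0x0002;
        const ushort FOF_NOCONFIRMATION = 0x0010; // don't prompt on overwrite

        [DllImport("shell32.dll", CharSet = CharSet.Unicode)]
        static extern int SHFileOperation(ref SHFILEOPSTRUCT lpFileOp);

        public static void Copy(string src, string dst)
        {
            var op = new SHFILEOPSTRUCT
            {
                wFunc = FO_COPY,
                pFrom = src + "\0\0", // the from/to lists are double-null-terminated
                pTo = dst + "\0\0",
                fFlags = FOF_NOCONFIRMATION
            };
            SHFileOperation(ref op); // returns zero on success
        }
    }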

Alex K.
  • This could be an option if I was only copying a file, but I must also be able to write the output to multiple destinations. – Altar Jul 06 '10 at 12:29

For a deeper understanding of the design options and the tradeoffs involved in file copying, with and without network shares, I'd suggest you take a look at Mark Russinovich's blog post from a couple of years ago. There are plenty more wrinkles involved than just hard disk sector sizes, e.g.:

  • The packet size of the SMB protocol
  • How much memory you are willing to use for caching (which may slow down other processes)
  • Whether you can sacrifice reliability for speed
  • Whether you want to increase the perceived speed or the actual time-until-finished
  • Where and how much you want to cache at several possible levels
  • Whether you're interested in providing reliable feedback and time estimates
Pontus Gagge

Probably the only faster option would be to use unbuffered I/O: the ReadFile, CreateFile, and WriteFile functions with the FILE_FLAG_NO_BUFFERING flag and a 2-6 MB buffer.

Also, this way you have to align the buffer size with the file system's sector size, etc.

It would be significantly faster - especially on Windows XP.

BTW, I have achieved ~400 MB/sec of bandwidth on a striped RAID 0 array this way (using a 4 MB buffer).
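
For completeness, a managed sketch of the same idea, relying on the unsupported trick of passing the raw FILE_FLAG_NO_BUFFERING value through FileOptions rather than P/Invoking CreateFile directly:

    using System;
    using System.IO;

    static class UnbufferedCopy
    {
        // FILE_FLAG_NO_BUFFERING has no FileOptions member; FileStream passes
        // the raw value through to CreateFile (unsupported but widely used).
        const FileOptions NoBuffering = (FileOptions)0x20000000;

        public static void Copy(string src, string dst)
        {
            using (var input = new FileStream(src, FileMode.Open, FileAccess.Read,
                FileShare.Read, 8, NoBuffering | FileOptions.SequentialScan))
            using (var output = new FileStream(dst, FileMode.Create, FileAccess.Write,
                FileShare.None, 8, FileOptions.WriteThrough))
            {
                // 4 MB is a multiple of any common sector size. Note that Windows
                // also expects the buffer address itself to be sector-aligned,
                // which a managed byte[] does not guarantee.
                var buffer = new byte[4 * 1024 * 1024];
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                    output.Write(buffer, 0, read);
            }
        }
    }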

Jaroslav Jandek
  • How would unbuffered I/O help with NAS storage? Surely network and hard drive access costs would dominate. Eliminating buffering could very well slow down performance instead, when multiple bottlenecks are present. – Pontus Gagge Jul 06 '10 at 11:23
  • I hadn't noticed the NAS part. You only eliminate automatic buffering by the system; you still buffer the data, but you choose the buffer size. For slow disks it does not matter much, but for fast RAID arrays and Tb networks, unbuffered I/O performance is very noticeable - not to mention it is easy on the CPU. I do not use it if I do not have to, though; too much WinAPI handling. Anyway, you can use unbuffered I/O with a NAS. – Jaroslav Jandek Jul 06 '10 at 12:03
  • After having finally gone down the rabbit hole (overlapped + unbuffered I/O), I can confirm this is the only way to achieve true performance. Got 2.25 GiB/sec over a dual 10G link (thanks, SMB3) when reading from a scale-out NAS. – Altar Nov 18 '16 at 10:54