10

I'm writing a simple folder sync backup tool for myself and hit a roadblock with File.Copy. In tests copying a folder of ~44,000 small files (Windows mail folders) to another drive in my system, File.Copy was over 3x slower than running xcopy from the command line on the same files and folders: my C# version takes over 16 minutes, whereas xcopy takes only 5 minutes. I've searched for help on this topic, but all I find is people complaining about slow copying of large files over a network. This is neither a large-file problem nor a network copying problem.

I found an interesting article about a better File.Copy replacement, but the code as posted has some errors that cause problems with the stack, and I am nowhere near knowledgeable enough to fix them.

Are there any common or easy ways to replace File.Copy with something more speedy?

Guavaman

5 Answers

9

One thing to consider is whether your copy has a user interface that updates during the copy. If so, make sure the copy runs on a separate thread; otherwise the UI will freeze during the copy, and the copy itself will be slowed down by blocking calls to update the UI.

I have written a similar program and in my experience, my code ran faster than a windows explorer copy (not sure about xcopy from the command prompt).

Also, if you have a UI, don't update it on every file. Instead, update every X megabytes or every Y files (whichever comes first), which keeps the amount of updating down to something the UI can actually handle. I used every 0.5 MB or 10 files; those values may not be optimal, but they noticeably increased my copy speed and UI responsiveness.
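A minimal sketch of that throttling rule, assuming a hypothetical helper class named ProgressThrottle (the name and shape are made up for illustration, not taken from the answerer's program). It only decides when a refresh is due; marshalling the actual update to the UI thread is left out.

```csharp
using System;

// Hypothetical helper: decides when the UI should be refreshed.
sealed class ProgressThrottle
{
    private readonly long _byteThreshold;
    private readonly int _fileThreshold;
    private long _bytesSinceUpdate;
    private int _filesSinceUpdate;

    public ProgressThrottle(long byteThreshold, int fileThreshold)
    {
        _byteThreshold = byteThreshold;
        _fileThreshold = fileThreshold;
    }

    // Call once per copied file. Returns true when either threshold
    // has been reached since the last refresh, then resets both counters.
    public bool FileCopied(long fileBytes)
    {
        _bytesSinceUpdate += fileBytes;
        _filesSinceUpdate++;
        if (_bytesSinceUpdate < _byteThreshold && _filesSinceUpdate < _fileThreshold)
            return false;
        _bytesSinceUpdate = 0;
        _filesSinceUpdate = 0;
        return true;
    }
}
```

With the values mentioned above this would be constructed as `new ProgressThrottle(512 * 1024, 10)`, and the copy loop would refresh the display only when `FileCopied` returns true.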

Another way to speed things up is to use the Enumerate functions instead of the Get functions (e.g. EnumerateFiles instead of GetFiles). These functions start returning results as soon as possible instead of waiting until the whole list has been built. They return an IEnumerable<string>, so you can just call foreach on the result: foreach (string file in System.IO.Directory.EnumerateFiles(path)). For my program this also made a noticeable difference in speed, and it would be even more helpful in cases like yours, where you are dealing with directories containing many files.
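As a rough illustration of the Enumerate approach, the sketch below copies the files directly inside one folder as their names are produced. The class and method names (LazyCopy, CopyTopLevelFiles) are hypothetical, and subfolders are deliberately ignored to keep it short.

```csharp
using System;
using System.IO;

static class LazyCopy
{
    // Copies every file directly inside sourceDir to destDir, starting as
    // soon as the first name is available. Returns the number of files copied.
    public static int CopyTopLevelFiles(string sourceDir, string destDir)
    {
        Directory.CreateDirectory(destDir); // no-op if it already exists
        int copied = 0;
        foreach (string file in Directory.EnumerateFiles(sourceDir))
        {
            string target = Path.Combine(destDir, Path.GetFileName(file));
            File.Copy(file, target, true); // true = overwrite an existing file
            copied++;
        }
        return copied;
    }
}
```

This only shortens the wait before the first copy starts; each individual File.Copy call takes as long as before.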

Echilon
mikeagun
  • I do have my interface on a background thread. Thanks for the tips on updating the display less often. Unfortunately, this isn't really a problem for my copy times. With the UI update completely disabled, the copy time is the same as before. I will see if using EnumerateFiles helps. – Guavaman Jul 07 '12 at 03:30
  • Just got the results of using the IEnumerables. It didn't help the file copying times at all, probably because I'm going folder by folder copying files so it doesn't take terribly long to do the GetFiles most of the time. However it did help a little with making the initial file counting process smoother. – Guavaman Jul 07 '12 at 04:04
  • Oh, as for Windows Explorer copy times... Yeah, I would be curious to know how your code does compared to XCopy. I started trying to copy the folder using Windows Explorer and... um, yeah. It was telling me somewhere between 1 and 5 hours. I'm sure it wouldn't have run that long, but I didn't want to waste the time to find out. So running faster than an explorer copy isn't that tough to achieve. ;) – Guavaman Jul 07 '12 at 04:07
  • Enumerate functions won't speed up the actual copy, so if that is the problem you may need to use another method to do the copies. The suggestions I gave were based on your circumstance of many small files, where minimizing the time between copies can make a difference. If that doesn't help you may have to pursue CopyFileEx. It looks like it was used successfully in this post: http://stackoverflow.com/a/187842/1507945 – mikeagun Jul 07 '12 at 04:09
  • It looks like you sent a couple of messages while I was typing mine. It's too bad that it didn't help. My next suggestion would be to follow the directions from that other post. Let me know how it goes. If it makes a huge difference I may have to look into adding that to my backup software. – mikeagun Jul 07 '12 at 04:14
  • Interestingly, for my test case of one directory containing 16384 small text files, my program runs faster than xcopy, though it sounds like that may not quite represent your situation so may not be relevant – mikeagun Jul 07 '12 at 04:26
  • Nice result! That's a lot better than I've been able to do. I tried out the CopyFileEx link you posted using the XCopy method, but sadly my result was practically identical. The bottleneck can't be anywhere but the copy itself, as I can run through 40k files with the copy command commented out in < 10 seconds (still with logging and UI update). I don't know why it's running this slow, but it looks like that's the way it is. Thanks for all your suggestions! – Guavaman Jul 07 '12 at 05:56
5

One of the things that slows down IO operations the most on rotational disks is moving the disk head.

It is reasonable to assume that your many small, related files sit closer to each other on the disk than to the copy destination (assuming you are copying from one part of a disk to another part of the same disk). If you alternate between reading a little and writing a little, the head has to travel back and forth, and each pause also opens a window for other processes to move the head on the source or target disk.

One thing that XCopy does much better than Copy (both meaning the command-line tools) is that XCopy reads in a batch of files before it starts writing them out to the destination.

If you are copying files on the same disk, try allocating a large buffer to read in many files at once, then write out those files once the buffer is full.

If you are reading from one disk and writing to another disk, try starting up one thread to read from the source disk and a separate thread to write to the other disk.
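One way the two-thread idea could be sketched is with a bounded BlockingCollection between a reader task and the calling thread acting as the writer. Everything here (the PipelinedCopy name, the maxQueuedFiles parameter, reading whole files into memory, handling only the top level of one folder) is an assumption made for illustration, and error handling is minimal.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

static class PipelinedCopy
{
    // Reads files on a background task while the calling thread writes them.
    // Returns the number of files written.
    public static int Copy(string sourceDir, string destDir, int maxQueuedFiles = 256)
    {
        Directory.CreateDirectory(destDir);
        // Bounded queue, so the reader cannot run arbitrarily far ahead of the writer.
        using (var queue = new BlockingCollection<KeyValuePair<string, byte[]>>(maxQueuedFiles))
        {
            Task reader = Task.Run(() =>
            {
                try
                {
                    foreach (string file in Directory.EnumerateFiles(sourceDir))
                    {
                        queue.Add(new KeyValuePair<string, byte[]>(
                            Path.GetFileName(file), File.ReadAllBytes(file)));
                    }
                }
                finally
                {
                    queue.CompleteAdding(); // lets the writer loop below finish
                }
            });

            int written = 0;
            foreach (var item in queue.GetConsumingEnumerable())
            {
                File.WriteAllBytes(Path.Combine(destDir, item.Key), item.Value);
                written++;
            }
            reader.Wait(); // rethrows any exception from the reader task
            return written;
        }
    }
}
```

Note that File.WriteAllBytes creates fresh files, so attributes and timestamps are not carried over the way they are with File.Copy; those would have to be set separately.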

Eric J.
  • Thanks for the good info! I was particularly interested in trying buffering the reads/writes like you described XCopy doing. I did some tests with a 50mb buffer and found it got my copy time down to 14 min 40 sec. So not an amazing improvement, but better. Still lightyears behind the XCopy times. I'll see if threading the reads/writes helps next... – Guavaman Jul 07 '12 at 03:28
  • Actually, I'm back to 16 minutes after I realized my FileStream based buffered copy system wasn't copying the file properties (attributes, creation time, etc.) After adding those back in, the copy time is back to where it was with File.Copy, only I've lost 50MB of memory to buffering. :( – Guavaman Jul 07 '12 at 04:41
3

There are two algorithms for faster file copy:

If the source and destination are on different disks, then:

  • One thread reading files continuously and storing in a buffer.
  • Another thread writing files continuously from that buffer.

If the source and destination are on the same disk, then:

  • Read a fixed chunk of bytes, say 8K at a time, no matter how many files that is.
  • Write that fixed chunk to destination, either in one file or in multiple files.

This way you should see a significant performance improvement.
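The same-disk variant could look roughly like the sketch below. It is a simplification of the description above: it batches whole files rather than splitting files at exact chunk boundaries, so a batch can exceed the byte budget by up to one file. The BatchedCopy name and the batchBytes parameter are hypothetical, and a real tool would likely use a much larger budget than 8K.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class BatchedCopy
{
    // Reads files until roughly batchBytes are buffered, writes that batch,
    // and repeats. Returns the number of files written.
    public static int Copy(string sourceDir, string destDir, long batchBytes = 8 * 1024)
    {
        Directory.CreateDirectory(destDir);
        var batch = new List<KeyValuePair<string, byte[]>>();
        long buffered = 0;
        int written = 0;
        foreach (string file in Directory.EnumerateFiles(sourceDir))
        {
            byte[] data = File.ReadAllBytes(file);
            batch.Add(new KeyValuePair<string, byte[]>(Path.GetFileName(file), data));
            buffered += data.Length;
            if (buffered >= batchBytes)
            {
                written += Flush(batch, destDir);
                buffered = 0;
            }
        }
        written += Flush(batch, destDir); // whatever is left after the last full batch
        return written;
    }

    // Writes every buffered file, empties the batch, and returns how many were written.
    private static int Flush(List<KeyValuePair<string, byte[]>> batch, string destDir)
    {
        foreach (var item in batch)
            File.WriteAllBytes(Path.Combine(destDir, item.Key), item.Value);
        int count = batch.Count;
        batch.Clear();
        return count;
    }
}
```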

Alternatively, just invoke xcopy from your .NET code. Why bother doing it with File.Copy? You can capture xcopy's output using Process.StandardOutput and display it to show the user what's going on.
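A hedged sketch of invoking xcopy that way follows. The XcopyRunner class and its method names are made up for illustration; the switches /E (include subfolders, even empty ones), /I (treat the destination as a folder) and /Y (overwrite without prompting) are one plausible choice, not something the question specifies. Paths are assumed not to end in a backslash, since a trailing backslash would escape the closing quote.

```csharp
using System;
using System.Diagnostics;

static class XcopyRunner
{
    // Builds the start info for: xcopy "source" "dest" /E /I /Y
    public static ProcessStartInfo BuildStartInfo(string sourceDir, string destDir)
    {
        return new ProcessStartInfo
        {
            FileName = "xcopy",
            Arguments = "\"" + sourceDir + "\" \"" + destDir + "\" /E /I /Y",
            UseShellExecute = false,       // required for stream redirection
            RedirectStandardOutput = true, // lets us read xcopy's progress lines
            // Precaution (assumption): xcopy has been reported to exit early
            // when only stdout is redirected, so stdin is redirected as well.
            RedirectStandardInput = true,
            CreateNoWindow = true
        };
    }

    // Runs xcopy, passes each output line to onOutputLine, and returns the
    // process exit code (0 means xcopy reported no error).
    public static int Run(string sourceDir, string destDir, Action<string> onOutputLine)
    {
        using (Process process = Process.Start(BuildStartInfo(sourceDir, destDir)))
        {
            string line;
            while ((line = process.StandardOutput.ReadLine()) != null)
                onOutputLine(line);
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A caller might use it as `XcopyRunner.Run(source, dest, Console.WriteLine)`, or pass a callback that forwards each line to the UI thread.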

oazabir
1

I think you could at least parallelize it so that two files are handled at the same time: while one thread is writing, another can already be reading the next file. If you have a list of the files, you can do it like this. Using many threads will not help, because that makes the drive head move around a lot more instead of letting it write sequentially.

 // Requires: using System.Collections.Generic; using System.Threading.Tasks;
 var files = new List<string>();
 // TODO: fill the files list, e.g. via directory enumeration
 var po = new ParallelOptions { MaxDegreeOfParallelism = 2 };
 Parallel.ForEach(files, po, CopyAFile);

 // Routine to copy a single file (body intentionally left to the reader)
 private void CopyAFile(string file) { }
IvoTops
0

I don't have much experience at this level. Why don't you try running a batch file containing your xcopy command? Check this post: Executing Batch File in C#

Community
MMALSELEK