I have read many threads on Stack Overflow and Super User about the fastest way to delete a large number of files, for example "What's the fastest way to delete a large folder in Windows?" and https://superuser.com/questions/19762/mass-deleting-files-in-windows/289399#289399.

From my own testing in C# code, the fastest way to delete files over the network seems to be invoking cmd.exe and running del /f/s/q:

var before = DateTime.Now;

var cmd_line = "/c del /f/s/q \"" + Path.Combine(dir, num_to_delete.ToString()) + "\"";

var startInfo = new ProcessStartInfo("cmd", cmd_line)
{
    RedirectStandardError = true,
    RedirectStandardOutput = true,
    UseShellExecute = false

};

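var process = new Process() { StartInfo = startInfo };
process.Start();

// Drain both redirected streams before waiting: calling WaitForExit() first
// can deadlock once del fills a pipe buffer with its output.
var stdErrTask = process.StandardError.ReadToEndAsync();
var stdOut = process.StandardOutput.ReadToEnd();
var stdErr = stdErrTask.Result;
process.WaitForExit();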

var complete = DateTime.Now;

My question is, is there a way to determine what del is doing under the hood? Is there an API in Win32 that I'm missing that I can call directly instead of spinning up a cmd.exe?

My code is in C#, but I wouldn't mind calling into native code if this functionality isn't available from C# directly.

My testing methodology is as follows:

  • create a folder with a number for the name (5, 10, 20, up to 10240)
  • under 5, create a folder 1 and put one 100 KB file in it
  • under 5, create a folder 2 and put two 100 KB files in it, and so on until there are a total of 5 files under the leaf
  • then repeat the whole process for 10 and so on

This lets me create a large number of files to search through, of which I then delete a few.

The first deletion method I tried was basically the following. It uses a structure like \5\1_0, 2_0, 2_1, i.e. all the files are in the same folder:

string strFilePath = Path.Combine(dir, num_to_delete + "-*");
string strDirectory = Path.GetDirectoryName(strFilePath);
string strFileName = Path.GetFileName(strFilePath);

DateTime before, after, complete;

if (strDirectory != null && strFileName != null && Directory.Exists(strDirectory))
{
    int number_present = Directory.GetFiles(strDirectory).Length;

    before = DateTime.Now;
    string[] strFiles = Directory.GetFiles(strDirectory, strFileName);
    after = DateTime.Now;
    foreach (string strFile in strFiles)
    {
        File.Delete(strFile);
    }
    complete = DateTime.Now;
}

The second deletion method uses the structure 5\1\, 5\2\, etc.:

string strDirectory = Path.Combine(dir, num_to_delete.ToString());

DateTime before, after, complete;

int number_present = Directory.GetFiles(dir).Length + Directory.GetDirectories(dir).Length;

before = DateTime.Now;
bool bExists = Directory.Exists(strDirectory);
after = DateTime.Now;
if(bExists)
{
    Directory.Delete(strDirectory, true);
}
complete = DateTime.Now;

The third deletion method uses the same structure, but leaves the folder behind:

string strDirectory = Path.Combine(dir, num_to_delete.ToString());
string strFilePath = Path.Combine(strDirectory, "*");
string strFileName = Path.GetFileName(strFilePath);

DateTime before, after, complete;
if (Directory.Exists(strDirectory))
{
    int number_present = Directory.GetFiles(strDirectory).Length;

    before = DateTime.Now;
    string[] strFiles = Directory.GetFiles(strDirectory, strFileName);
    after = DateTime.Now;
    foreach (string strFile in strFiles)
    {
        File.Delete(strFile);
    }
    complete = DateTime.Now;
}

The fourth method is the one shown at the top.

My findings are that, for sizes above 160 files, the methods rank as follows when deleting 5 files at a time (fastest first):

  • 4th (del /f/s/q)
  • 1st (delete the files from one big folder)
  • 2nd (delete the subfolder together with its files)
  • 3rd (delete the files from a subfolder and leave the subfolder)

I ran all the tests 3 times, averaged the results in Excel, and then charted them. I think I did my numerical analysis correctly.

David Moore
  • Why do you think this is fast? How did you measure that? – Thomas Weller May 24 '16 at 20:38
  • Have you compared this to the much more readable built in methods of the File and Directory classes? – Chris Dunaway May 24 '16 at 20:45
  • Under the hood, `cmd` uses [DeleteFile](https://msdn.microsoft.com/en-us/library/windows/desktop/aa363915%28v=vs.85%29.aspx). The .NET framework uses [the same method](http://referencesource.microsoft.com/#mscorlib/system/io/file.cs,1167ef90427e1824). Therefore, they both should perform identically. My gut tells me you're not measuring the speed properly. – Icemanind May 24 '16 at 21:07
  • Wouldn't using a [Stopwatch](https://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch%28v=vs.110%29.aspx) be more accurate here? – Filburt May 24 '16 at 21:14
  • Measuring performance also depends on Release build vs. Debug build (http://stackoverflow.com/questions/4043821/performance-differences-between-debug-and-release-builds), running it standalone or running it under the debugger, compiling as 32 bit vs. 64 bit (compared to CMD), ... – Thomas Weller May 24 '16 at 21:15
  • You guys are probably right. I can rerun my tests with Stopwatch and see if I get the same results. – David Moore May 24 '16 at 21:15
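
To illustrate the Stopwatch suggestion from the comments, here is a minimal sketch of how one deletion run could be timed. The class and method names (`DeleteTimer`, `Measure`, `DeleteFilesIn`) are hypothetical and not taken from the question, and the sketch only covers a flat, non-recursive delete.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class DeleteTimer
{
    // Runs the given action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    // Deletes every file directly inside 'directory' (non-recursive)
    // and returns how many files were removed.
    public static int DeleteFilesIn(string directory)
    {
        int deleted = 0;
        foreach (string file in Directory.EnumerateFiles(directory))
        {
            File.Delete(file);
            deleted++;
        }
        return deleted;
    }
}
```

A call such as `TimeSpan t = DeleteTimer.Measure(() => DeleteTimer.DeleteFilesIn(path));` times a single run. Repeating the measurement in a Release build outside the debugger, as the comments recommend, would make the numbers easier to compare.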

1 Answer

You can use Rohitab API Monitor to find out what CMD does internally. On 64-bit Windows, CMD is a 64-bit process, so you should use the 64-bit version of the tool.

For the deletion of a local file, CMD uses GetFileType(), GetCurrentDirectory(), GetVolumeInformation(), GetFileAttributes(), GetFullPathName(), FindFirstFileEx(), DeleteFile() and FindClose(). I captured this by checking the API sections Data Access and Storage and Devices.

When deleting items on a network share, you might try Networking/Network Share Management in addition.

I would assume that .NET uses the same functions internally. Why should it do anything in addition?
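
If you want to call the Win32 function directly rather than go through cmd.exe or File.Delete, a P/Invoke declaration along the following lines could be used. This is only a sketch: the wrapper class `NativeDelete` is a hypothetical name, the platform check assumes a runtime that provides `RuntimeInformation` (.NET Framework 4.7.1 or later, or .NET Core), and nothing here suggests it will be faster than the built-in methods.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;

public static class NativeDelete
{
    // DeleteFileW from kernel32.dll; SetLastError = true makes the Win32
    // error code available through Marshal.GetLastWin32Error().
    [DllImport("kernel32.dll", EntryPoint = "DeleteFileW",
               CharSet = CharSet.Unicode, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool DeleteFile(string lpFileName);

    // Deletes a single file through the Win32 API.
    // Throws Win32Exception if the call reports failure.
    public static void Delete(string path)
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
            throw new PlatformNotSupportedException("DeleteFileW is only available on Windows.");

        if (!DeleteFile(Path.GetFullPath(path)))
            throw new Win32Exception(Marshal.GetLastWin32Error());
    }
}
```

Unlike File.Delete, which silently does nothing when the file does not exist, this wrapper throws in that case, because DeleteFile reports a missing file as an error.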

Thomas Weller
  • Looking at the source for Directory.Delete (for example), it doesn't use FindFirstFileEx in a loop; it recurses. It may be possible to optimize that function a little bit. I'll try out that tool, thanks. – David Moore May 24 '16 at 21:03