60

I have a task to clean up a large number of directories. I want to start at a directory and delete any sub-directories (no matter how deep) that contain no files (files will never be deleted, only directories). The starting directory will then be deleted if it contains no files or subdirectories. I was hoping someone could point me to some existing code for this rather than having to reinvent the wheel. I will be doing this using C#.

rudolph1024
Jay

9 Answers

117

Using C# Code.

static void Main(string[] args)
{
    processDirectory(@"c:\temp");
}

private static void processDirectory(string startLocation)
{
    foreach (var directory in Directory.GetDirectories(startLocation))
    {
        processDirectory(directory);
        if (Directory.GetFiles(directory).Length == 0 && 
            Directory.GetDirectories(directory).Length == 0)
        {
            Directory.Delete(directory, false);
        }
    }
}
puretppc
Ragoczy
  • A quicker way of writing your `if` statement could be `if (Directory.GetFileSystemEntries(directory).Length == 0)` – Jesse C. Slicer May 11 '10 at 15:36
  • `(!Directory.EnumerateFileSystemEntries(directory).Any())` is better – prime23 Sep 29 '12 at 05:17
  • This method worked for me, but upon further reflection I had to do a double take at the fact that you're calling the method name inside of the method with that third line: `processDirectory(directory)`! Huh?? – user1985189 Apr 29 '13 at 18:09
  • @user1985189: This is called a recursive function. See: http://en.wikipedia.org/wiki/Recursive_function – Jean-François Côté May 22 '13 at 14:40
  • @Jean-FrançoisCôté cool, thanks for the answer! Can't believe I was never exposed to this in three courses' worth of Java – user1985189 May 22 '13 at 22:41
  • @JesseC.Slicer That works too. But I'm not sure which one is the best for speed. – puretppc Jan 26 '14 at 21:52
  • The `Get*` calls enumerate all entries before returning, while the `Enumerate*` calls yield items lazily, so checking with `Any` exits early and does less work than collecting all entries. This could make a huge difference on directories with lots of items – NiKiZe Feb 25 '19 at 14:54
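Putting the comments above together, here is a minimal sketch of the same recursive approach with the lazy emptiness check. It assumes .NET 4.0 or later (for `Directory.EnumerateFileSystemEntries`), and the names `EmptyDirectoryCleaner`, `IsEmpty`, and `DeleteEmptyTree` are made up for illustration. Unlike the code in the answer, it also removes the starting directory when it ends up empty, as the question asks.

```csharp
using System;
using System.IO;
using System.Linq;

public static class EmptyDirectoryCleaner
{
    // True when the directory holds no files and no subdirectories.
    // Any() stops at the first entry instead of building a full array.
    public static bool IsEmpty(string directory)
    {
        return !Directory.EnumerateFileSystemEntries(directory).Any();
    }

    // Cleans all subdirectories first (depth-first), then deletes
    // startLocation itself if nothing is left inside it.
    public static void DeleteEmptyTree(string startLocation)
    {
        foreach (var directory in Directory.GetDirectories(startLocation))
        {
            DeleteEmptyTree(directory);
        }

        if (IsEmpty(startLocation))
        {
            // false = non-recursive delete, so files can never be removed
            Directory.Delete(startLocation, false);
        }
    }
}
```

Calling `EmptyDirectoryCleaner.DeleteEmptyTree(@"c:\temp")` would be the equivalent of the `processDirectory` call above. Error handling (for example for access-denied folders) is left out to keep the sketch short.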
55

If you can target .NET 4.0, you can use the new enumeration methods on the Directory class. They avoid the performance penalty of listing every entry in a directory when you only want to know whether there is at least one.

The methods are:

  • Directory.EnumerateDirectories
  • Directory.EnumerateFiles
  • Directory.EnumerateFileSystemEntries

A possible implementation using recursion:

static void Main(string[] args)
{
    DeleteEmptyDirs("Start");
}

static void DeleteEmptyDirs(string dir)
{
    if (String.IsNullOrEmpty(dir))
        throw new ArgumentException(
            "Starting directory is a null reference or an empty string", 
            "dir");

    try
    {
        foreach (var d in Directory.EnumerateDirectories(dir))
        {
            DeleteEmptyDirs(d);
        }

        var entries = Directory.EnumerateFileSystemEntries(dir);

        if (!entries.Any())
        {
            try
            {
                Directory.Delete(dir);
            }
            catch (UnauthorizedAccessException) { }
            catch (DirectoryNotFoundException) { }
        }
    }
    catch (UnauthorizedAccessException) { }
}

You also mention that the directory tree could be very deep, so you might get exceptions if the paths you are probing are too long.
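For very deep trees, one option is to drop the recursion and skip over any directory that raises a path or access error. The following is only a sketch under that assumption, not a fix for the path-length limit itself: it collects directories with an explicit stack and then deletes from the deepest level upward. The class name `IterativeCleaner` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class IterativeCleaner
{
    public static void DeleteEmptyDirs(string root)
    {
        // Pass 1: collect every directory. A directory is added when it is
        // popped, and its children are pushed afterwards, so each parent
        // always appears in the list before its children.
        var found = new List<string>();
        var pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            found.Add(current);
            try
            {
                foreach (string sub in Directory.EnumerateDirectories(current))
                    pending.Push(sub);
            }
            catch (UnauthorizedAccessException) { }
            catch (IOException) { } // covers PathTooLongException and DirectoryNotFoundException
        }

        // Pass 2: walk the list backwards so children are handled before parents.
        for (int i = found.Count - 1; i >= 0; i--)
        {
            try
            {
                if (!Directory.EnumerateFileSystemEntries(found[i]).Any())
                    Directory.Delete(found[i]);
            }
            catch (UnauthorizedAccessException) { }
            catch (IOException) { }
        }
    }
}
```

The trade-off is that the whole directory list is held in memory, which should be acceptable for tens of thousands of folders but is worth keeping in mind for much larger trees.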

João Angelo
  • Thanks, but unfortunately we don't use .Net 4.0. I wish we could as I have about 20,000 folders to process. – Jay May 11 '10 at 14:58
  • Nice answer. Instead of `if (String.IsNullOrEmpty(entries.FirstOrDefault()))`, you could also use `if ( ! entries.Any() )`, which is a bit cleaner IMHO. – Danko Durbić May 11 '10 at 15:06
  • @Danko Durbić, completely agree with you, I didn't notice the overload without parameters and was already asking myself why `Enumerable` didn't have something to quickly check for an empty `IEnumerable`. Thanks, I updated the answer. – João Angelo May 11 '10 at 15:40
  • This is the answer for large directories because GetDirectories is way too slow for decently sized directories. I would say that if you don't have .NET 4.0 then you should use API calls from this thread http://stackoverflow.com/questions/1741306/c-sharp-get-file-names-and-last-write-times-for-large-directories – Michael Hohlios Apr 27 '12 at 17:05
12

Running the test on C:\Windows 1000 times on the 3 methods mentioned so far yielded this:

GetFiles+GetDirectories:630ms
GetFileSystemEntries:295ms
EnumerateFileSystemEntries.Any:71ms

Running it on an empty folder yielded this (1000 times again):

GetFiles+GetDirectories:131ms
GetFileSystemEntries:66ms
EnumerateFileSystemEntries.Any:64ms

So EnumerateFileSystemEntries is by far the best overall when you are checking for empty folders.
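The answer does not show the test harness, so the following is only a guess at how such a comparison could be set up with `Stopwatch`. The class and method names (`EmptyCheckBenchmark`, `Time`, `Run`) are invented for this sketch, and the timings it prints will depend on the machine and the folder.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

public static class EmptyCheckBenchmark
{
    // The three emptiness checks discussed so far.
    public static bool ViaGetFilesAndGetDirectories(string dir)
    {
        return Directory.GetFiles(dir).Length == 0
            && Directory.GetDirectories(dir).Length == 0;
    }

    public static bool ViaGetFileSystemEntries(string dir)
    {
        return Directory.GetFileSystemEntries(dir).Length == 0;
    }

    public static bool ViaEnumerateAny(string dir)
    {
        return !Directory.EnumerateFileSystemEntries(dir).Any();
    }

    // Runs the check the given number of times and returns elapsed milliseconds.
    public static long Time(Func<bool> check, int iterations)
    {
        var stopWatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            check();
        }
        stopWatch.Stop();
        return stopWatch.ElapsedMilliseconds;
    }

    public static void Run(string dir, int iterations)
    {
        Console.WriteLine("GetFiles+GetDirectories:"
            + Time(() => ViaGetFilesAndGetDirectories(dir), iterations) + "ms");
        Console.WriteLine("GetFileSystemEntries:"
            + Time(() => ViaGetFileSystemEntries(dir), iterations) + "ms");
        Console.WriteLine("EnumerateFileSystemEntries.Any:"
            + Time(() => ViaEnumerateAny(dir), iterations) + "ms");
    }
}
```

For example, `EmptyCheckBenchmark.Run(@"C:\Windows", 1000)` would mirror the first measurement above. File-system caching can skew the first method measured, so running the set more than once is advisable.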

Wolf5
5

Here's a version that takes advantage of parallel execution to get it done faster in some cases:

public static void DeleteEmptySubdirectories(string parentDirectory){
  System.Threading.Tasks.Parallel.ForEach(System.IO.Directory.GetDirectories(parentDirectory), directory => {
    DeleteEmptySubdirectories(directory);
    if(!System.IO.Directory.EnumerateFileSystemEntries(directory).Any()) System.IO.Directory.Delete(directory, false);
  });   
}

Here's the same code in single threaded mode:

public static void DeleteEmptySubdirectoriesSingleThread(string parentDirectory){
  foreach(string directory in System.IO.Directory.GetDirectories(parentDirectory)){
    // Recurse into the single-threaded method itself, not the parallel one
    DeleteEmptySubdirectoriesSingleThread(directory);
    if(!System.IO.Directory.EnumerateFileSystemEntries(directory).Any()) System.IO.Directory.Delete(directory, false);
  }
}

... and here's some sample code you could use to test results in your scenario:

var stopWatch = new System.Diagnostics.Stopwatch();
for(int i = 0; i < 100; i++) {
  stopWatch.Restart();
  DeleteEmptySubdirectories(rootPath);
  stopWatch.Stop();
  StatusOutputStream.WriteLine("Parallel: "+stopWatch.ElapsedMilliseconds);
  stopWatch.Restart();
  DeleteEmptySubdirectoriesSingleThread(rootPath);
  stopWatch.Stop();
  StatusOutputStream.WriteLine("Single: "+stopWatch.ElapsedMilliseconds);
}

... and here are some results from my machine for a directory that is on a file share across a wide area network. This share currently has only 16 subfolders and 2277 files.

Parallel: 1479
Single: 4724
Parallel: 1691
Single: 5603
Parallel: 1540
Single: 4959
Parallel: 1592
Single: 4792
Parallel: 1671
Single: 4849
Parallel: 1485
Single: 4389
scradam
3

From here, a PowerShell script to remove empty directories:

$items = Get-ChildItem -Recurse

foreach($item in $items)
{
      if( $item.PSIsContainer )
      {
            $subitems = Get-ChildItem -Recurse -Path $item.FullName
            if($subitems -eq $null)
            {
                  "Remove item: " + $item.FullName
                  Remove-Item $item.FullName
            }
            $subitems = $null
      }
}

Note: use at own risk!

Mitch Wheat
3

If you rely on DirectoryInfo.Delete only deleting empty directories, you can write a succinct extension method:

public static void DeleteEmptyDirs(this DirectoryInfo dir)
{
    foreach (DirectoryInfo d in dir.GetDirectories())
        d.DeleteEmptyDirs();

    try { dir.Delete(); }
    catch (IOException) {}
    catch (UnauthorizedAccessException) {}
}

Usage:

static void Main()
{
    new DirectoryInfo(@"C:\temp").DeleteEmptyDirs();
}
Neil
1
private static void deleteEmptySubFolders(string ffd, bool deleteIfFileSizeZero=false)
{
    DirectoryInfo di = new DirectoryInfo(ffd);
    foreach (DirectoryInfo diSon in di.GetDirectories("*", SearchOption.TopDirectoryOnly))
    {
        FileInfo[] fis = diSon.GetFiles("*.*", SearchOption.AllDirectories);
        if (fis == null || fis.Length < 1)
        {
            diSon.Delete(true);
        }
        else
        {
            if (deleteIfFileSizeZero)
            {
                long total = 0;
                foreach (FileInfo fi in fis)
                {
                    total = total + fi.Length;
                    if (total > 0)
                    {
                        break;
                    }
                }

                if (total == 0)
                {
                    diSon.Delete(true);
                    continue;
                }
            }

            deleteEmptySubFolders(diSon.FullName, deleteIfFileSizeZero);
        }
    }
}
Mavei
0
//Recursive scan of empty dirs. See example output bottom

string startDir = @"d:\root";

void Scan(string dir, bool stepBack)
    {
        //directory not empty
        if (Directory.GetFileSystemEntries(dir).Length > 0)
        {
            if (!stepBack)
            {
                foreach (string subdir in Directory.GetDirectories(dir))
                    Scan(subdir, false);
            } 
        }
        //directory empty so delete it.
        else
        {
            Directory.Delete(dir);
            string prevDir = dir.Substring(0, dir.LastIndexOf("\\"));
            if (startDir.Length <= prevDir.Length)
                Scan(prevDir, true);
        }
    }
//call like this
Scan(startDir, false);

/*EXAMPLE output of d:\root with empty subfolders and one filled with files
   Scanning d:\root
   Scanning d:\root\folder1 (not empty)
   Scanning d:\root\folder1\folder1sub1 (not empty)
   Scanning d:\root\folder1\folder1sub1\folder2sub2 (deleted!)
   Scanning d:\root\folder1\folder1sub1 (deleted!)
   Scanning d:\root\folder1 (deleted)
   Scanning d:\root (not empty)
   Scanning d:\root\folder2 (not empty)
   Scanning d:\root\folder2\folder2sub1 (deleted)
   Scanning d:\root\folder2 (not empty)
   Scanning d:\root\folder2\notempty (not empty) */
Jamil
  • This algorithm will delete all empty directories in a given start directory. When a directory is deleted, the previous directory will be re-scanned because it is possible that the previous directory is now empty. – Jamil May 22 '13 at 10:05
  • That is not needed. You should think about the benefits of recursion and maybe slowly debug through the already provided answers. – Oliver May 22 '13 at 10:11
  • True, but what is not recursive about this? I was a little too fast and so I made a small correction. See the output example for what this code does. I think this is efficient. – Jamil May 22 '13 at 14:39
  • If you put the `foreach` above the `if` statement, you wouldn't need the `stepBack` parameter, which would make your code much easier to read (and your code would end up with the same result as the already given answers). Also, you should use `Directory.Enumerate...()` instead of `Directory.Get...()`. Thus your answer doesn't provide anything new compared with the already given ones, which makes it obsolete. – Oliver May 23 '13 at 06:34
  • The reason I believe this parameter makes it more efficient (and that's my contribution, although I could be wrong :) ) is that only the parent folder of a deleted folder will be re-scanned, "without rescanning all the sub directories of the parent folder again". Try it yourself: check the output with and without the stepBack parameter. You'll see that without it, folders are scanned multiple times. – Jamil May 23 '13 at 09:52
  • Take João Angelo's answer: there, too, each directory will be scanned only once (without using any parameters for the recursion). – Oliver May 23 '13 at 11:09
0
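    foreach (var folder in Directory.GetDirectories(myDir, "*", System.IO.SearchOption.AllDirectories))
    {
        try
        {
            // A folder with no files anywhere beneath it is removed together with its (empty) subfolders
            if (Directory.GetFiles(folder, "*", System.IO.SearchOption.AllDirectories).Length == 0)
                Directory.Delete(folder, true);
        }
        // Subfolders of an already deleted folder are still in the list and throw here; all errors are ignored
        catch { }
    }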
Sam