2

I think I formerly knew how to do this, but it seems that I have forgotten.

I have a recursive algorithm to remove all empty directories in a directory tree:

static bool DeleteDirectoriesRecursive(string path)
{
    var remove = true;
    foreach (var dir in System.IO.Directory.GetDirectories(path))
    {
        remove &= DeleteDirectoriesRecursive(dir);
    }
    if (remove &= (System.IO.Directory.GetFiles(path).Length == 0))
        System.IO.Directory.Delete(path);
    return remove;
}

I'm trying to eliminate the recursion from this algorithm, not so much to "fix" the algorithm (i.e., the similar question doesn't use the remove variable, but I would like to keep it).

I have started a new function, using the Stack<> class, but I can't think of a good way to return to the base path and take the actions that the sub-directories have determined. I guess unraveling non-tail recursion takes a bit more effort.

Community
  • 1
  • 1
palswim
  • 11,856
  • 6
  • 53
  • 77
  • Why would you want to replace one stack (the IL stack) with another (yours)? What do you stand to gain from this? – zmbq Mar 02 '12 at 19:38
  • Knowledge. Nobody said I'm going to do this in production code. – palswim Mar 02 '12 at 19:40
  • @zmbq - That's what I was thinking. This is actually a very good use of recursion as it uses the call stack to crawl a tree. Attempting to do the same thing with your own stack will just make the code much longer, more difficult to understand, and won't offer any performance benefit either. – Steve Wortham Mar 02 '12 at 19:44
  • 1
    @SteveWortham: Most uses of recursion are good, but sometimes you have to implement it with an explicit stack because overflowing the IL stack is fatal. – Gabe Mar 02 '12 at 20:02
  • @Gabe - Good point. The directory structure would have to be pretty massive for you to run into a StackOverflowException, but I guess it's possible. – Steve Wortham Mar 02 '12 at 20:09
  • @SteveWortham: You're right. Parsing XML or LISP would be a better use of this technique in production code, but this example is much simpler so it is easier to learn the technique with. – Gabe Mar 02 '12 at 20:15

2 Answers2

1

The sans-recursion version is much more difficult and not much more efficient, so I would just recommend keeping your recursion. Also, here's a version that's a little cleaner:

static bool DeleteDirectoriesRecursive(DirectoryInfo d)
{
    bool remove = true;

    foreach (DirectoryInfo c in d.GetDirectories())
    {
        remove &= DeleteDirectoriesRecursive(c);
    }

    if (remove && d.GetFiles().Length == 0) {
        d.Delete();
        return true;
    }

    return false;
}
Ry-
  • 218,210
  • 55
  • 464
  • 476
  • That *is* a bit cleaner. Though, you still have to find a place to set the remove to `false` if necessary (`remove &= (d.GetFiles().Length == 0)` somewhere). – palswim Mar 02 '12 at 19:53
1

Since Gabe mentioned the possibility of a StackOverflowException when using recursion I was inspired to make this work without it. I used the code here as a starting point. And here's what I came up with...

public static bool DeleteDirectories(string root)
{
    bool removed = false;

    var dirs = new Stack<string>();
    var emptyDirStack = new Stack<string>();
    var emptyDirs = new Dictionary<string, int>();

    if (!System.IO.Directory.Exists(root))
    {
        throw new ArgumentException();
    }
    dirs.Push(root);

    while (dirs.Count > 0)
    {
        string currentDir = dirs.Pop();

        string[] subDirs;
        try
        {
            subDirs = System.IO.Directory.GetDirectories(currentDir);
        }
        catch (UnauthorizedAccessException e)
        {
            Console.WriteLine(e.Message);
            continue;
        }
        catch (System.IO.DirectoryNotFoundException e)
        {
            Console.WriteLine(e.Message);
            continue;
        }

        if (Directory.GetFiles(currentDir).Length == 0)
        {
            emptyDirStack.Push(currentDir);
            emptyDirs.Add(currentDir, subDirs.Length); // add directory path and number of sub directories
        }

        // Push the subdirectories onto the stack for traversal.
        foreach (string str in subDirs)
            dirs.Push(str);
    }

    while (emptyDirStack.Count > 0)
    {   
        string currentDir = emptyDirStack.Pop();
        if (emptyDirs[currentDir] == 0)
        {
            string parentDir = Directory.GetParent(currentDir).FullName;
            Console.WriteLine(currentDir); // comment this line
            //Directory.Delete(currentDir); // uncomment this line to delete
            if (emptyDirs.ContainsKey(parentDir))
            {
                emptyDirs[parentDir]--; // decrement number of subdirectories of parent
            }
            removed = true;
        }
    }

    return removed;
}

A few notes:

  1. Crawling a tree without recursion isn't too incredibly difficult and it's accomplished in the MSDN article I linked to above. But this particular example is a little more complex than theirs.
  2. I'm storing every directory which contains no files in the emptyDirs Dictionary, along with the number of subdirectories it contains.
  3. Since the order in which the directories are deleted is critical, I had to run through the emptyDirs Dictionary in reverse order (I used Linq to accomplish this).

I imagine there are more efficient methods but this isn't too bad, and it seems to work in my testing.

UPDATE : I got rid of the reversed dictionary in favor of a 2nd stack. I still use the dictionary for fast lookups, however.

Steve Wortham
  • 21,740
  • 5
  • 68
  • 90
  • You absolutely should not rely on any kind of ordering in a `Dictionary`. There are cases when it will return them in the order you inserted them in, but you shouldn't rely on that, because it doesn't always work. – svick Mar 02 '12 at 22:57
  • @svick - Good to know. I just ditched the reversed dictionary. – Steve Wortham Mar 02 '12 at 23:03
  • Yeah, I've started to avoid using the `Dictionary` class as much as I have used it. I can accomplish a lot of the same things with a `List>` class. – palswim Mar 02 '12 at 23:50
  • @palswim - The Dictionary is faster when it comes to searching a large collection. But then again, if I had used a List like you mentioned I gain the ability to iterate through it in reverse order and wouldn't need the 2nd stack. It's probably one of those things where you should decide if you want to optimize for speed or for memory. – Steve Wortham Mar 03 '12 at 00:00