0

Suppose I wanted to enumerate through a given directory's files & directories iteratively, in such a way that the inner directories and files are given out first.

If you were to execute this enumerable in a foreach loop with the functions: DeleteFile and DeleteEmptyDirectory it should not fail, because the most inner items are yielded out first.

Now of course one could do this (pseudocode):

func(Directory dir)
    foreach (var f in dir.EnumerateFileSystemInfos())
        if (f is FileInfo)
            yield return f;
        else
            foreach (var item in func(f))
                yield return item;

    yield return dir;

But that has wastes a lot of allocations on enumerators.

One could also use two stacks and "acquire" all of the directories together and then just keep popping out items but that wastes too much memory and doesn't balance the MoveNext times that well.

Any ideas on how to do this efficiently without wasting too much space?

hl3mukkel
  • 5,369
  • 3
  • 21
  • 21
  • This code does no go through files in sub directories. Don't you want to go through files in sub directories? – Yacoub Massad Dec 23 '15 at 17:07
  • @Yacoub Massad I'd say no, he specifies "iteratively" – Leonardo Spina Dec 23 '15 at 17:08
  • Sorry, I do want to go through files in sub directories, I should remove that piece of code, didnt really try to go through it recursively because I am building a state machine, it was pseudocode more or less, but yes it was wrong – hl3mukkel Dec 23 '15 at 17:09
  • Ok then you mean "recursively", not iteratively. It's clearer now ;-) – Leonardo Spina Dec 23 '15 at 17:15
  • Did you actually test this and measure an inefficiency with using memory? – Yacoub Massad Dec 23 '15 at 17:15
  • I do mean iteratively preferably, that example that I gave in pseudocode is bad, that's why I don't want to use it :-D Recursively would be much easier but it's hard to include in my "state machine" if it is recursive :) – hl3mukkel Dec 23 '15 at 17:15
  • I am not sure if your code will get inner files first. But your question is not about that. Your question is about performance, right? – Yacoub Massad Dec 23 '15 at 17:17
  • @Yacoub Massad It has fairly similar characteristics to this: http://stackoverflow.com/questions/3969963/when-not-to-use-yield-return & Yes, performance and space efficiency, I do not like that solution. – hl3mukkel Dec 23 '15 at 17:17

1 Answers1

1

In the spirit of DRY principle, I would use the function from my answer to How to make an IEnumerable of values of tree nodes?

public static class TreeHelper
{
    public static IEnumerable<T> Traverse<T>(T node, Func<T, IEnumerable<T>> childrenSelector, bool preOrder = true)
    {
        var stack = new Stack<IEnumerator<T>>();
        var e = Enumerable.Repeat(node, 1).GetEnumerator();
        try
        {
            while (true)
            {
                while (e.MoveNext())
                {
                    var item = e.Current;
                    var children = childrenSelector(item);
                    if (children == null)
                        yield return item;
                    else
                    {
                        if (preOrder) yield return item;
                        stack.Push(e);
                        e = children.GetEnumerator();
                    }
                }
                if (stack.Count == 0) break;
                e.Dispose();
                e = stack.Pop();
                if (!preOrder) yield return e.Current;
            }
        }
        finally
        {
            e.Dispose();
            while (stack.Count != 0) stack.Pop().Dispose();
        }
    }
}

Utilizing it for your case would be something like this

var directory = new DirectoryInfo(path);
var result = TreeHelper.Traverse<FileSystemInfo>(directory, fsi => 
    fsi is DirectoryInfo ? ((DirectoryInfo)fsi).EnumerateFileSystemInfos() : null,
    preOrder: false)
Community
  • 1
  • 1
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343