
I need to be able to get all files from a directory and its subdirectories, but I would like to give the user the option to choose the depth of subdirectories. That is, not just the current directory or all directories: the user should be able to choose a depth of 1, 2, 3, 4 directories, etc.

I've seen many examples of walking through directory trees and none of them seemed to address this issue. And personally, I get confused with recursion... (which I currently use). I am not sure how I would track the depth during a recursive function.

Any help would be greatly appreciated.

Thanks, David

Here is my current code (which I found here):

    static void FullDirList(DirectoryInfo dir, string searchPattern, string excludeFolders, int maxSz, string depth)
    {

        try
        {
            foreach (FileInfo file in dir.GetFiles(searchPattern))
            {

                if (excludeFolders != "")
                    if (Regex.IsMatch(file.FullName, excludeFolders, RegexOptions.IgnoreCase)) continue;

                myStream.WriteLine(file.FullName);
                MasterFileCounter += 1;

                if (maxSz > 0 && myStream.BaseStream.Length >= maxSz)
                {
                    myStream.Close();
                    myStream = new StreamWriter(nextOutPutFile());
                }

            }
        }
        catch
        {
            // make this a separate StreamWriter to record directories that failed to be read.
            Console.WriteLine("Directory {0}  \n could not be accessed!!!!", dir.FullName);
            return;  // We already got an error trying to access dir, so don't try to access it again
        }

        MasterFolderCounter += 1;

        foreach (DirectoryInfo d in dir.GetDirectories())
        {
            //folders.Add(d);
            // if (MasterFolderCounter > maxFolders) 
            FullDirList(d, searchPattern, excludeFolders, maxSz, depth);
        }

    }
DaveyD
  • http://stackoverflow.com/questions/6141648/list-of-all-folders-and-files-for-x-depth might give you a hint – Thorarins Jul 06 '15 at 11:48
  • Notice recursion? Add parameter to recursive function, increment it on each subfolder call. This way you know how deep you are. – Sinatr Jul 06 '15 at 11:51
  • @Thorarins, Thanks! I cant believe I didnt find that - I searched tons!! - I am looking into it now and will post back – DaveyD Jul 06 '15 at 13:56

2 Answers


Use a maxDepth variable that is decremented on each recursive call; you can then simply return once the desired depth has been reached.

static void FullDirList(DirectoryInfo dir, string searchPattern, string excludeFolders, int maxSz, int maxDepth)
{

    if(maxDepth == 0)
    {
        return;
    }

    try
    {
        foreach (FileInfo file in dir.GetFiles(searchPattern))
        {

            if (excludeFolders != "")
                if (Regex.IsMatch(file.FullName, excludeFolders, RegexOptions.IgnoreCase)) continue;

            myStream.WriteLine(file.FullName);
            MasterFileCounter += 1;

            if (maxSz > 0 && myStream.BaseStream.Length >= maxSz)
            {
                myStream.Close();
                myStream = new StreamWriter(nextOutPutFile());
            }

        }
    }
    catch
    {
        // make this a separate StreamWriter to record directories that failed to be read.
        Console.WriteLine("Directory {0}  \n could not be accessed!!!!", dir.FullName);
        return;  // We already got an error trying to access dir, so don't try to access it again
    }

    MasterFolderCounter += 1;

    foreach (DirectoryInfo d in dir.GetDirectories())
    {
        //folders.Add(d);
        // if (MasterFolderCounter > maxFolders) 
        FullDirList(d, searchPattern, excludeFolders, maxSz, maxDepth - 1);
    }

}
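To see what a given maxDepth means under this convention, here is a minimal sketch. It assumes a stripped-down `Collect` method (a hypothetical name) that gathers paths into a list instead of writing to `myStream`, and a throwaway directory tree created under the temp folder; neither is part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DepthDemo
{
    // Stripped-down version of the walk above: collects paths instead of writing to a stream.
    public static void Collect(DirectoryInfo dir, int maxDepth, List<string> found)
    {
        if (maxDepth == 0)
        {
            return; // depth budget used up: this directory's files are not listed
        }

        foreach (FileInfo file in dir.GetFiles())
        {
            found.Add(file.FullName);
        }

        foreach (DirectoryInfo d in dir.GetDirectories())
        {
            Collect(d, maxDepth - 1, found);
        }
    }

    public static void Main()
    {
        // Hypothetical throwaway tree: root/a.txt, root/sub/b.txt, root/sub/deep/c.txt
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(Path.Combine(root, "sub", "deep"));
        File.WriteAllText(Path.Combine(root, "a.txt"), "");
        File.WriteAllText(Path.Combine(root, "sub", "b.txt"), "");
        File.WriteAllText(Path.Combine(root, "sub", "deep", "c.txt"), "");

        for (int depth = 0; depth <= 3; depth++)
        {
            var found = new List<string>();
            Collect(new DirectoryInfo(root), depth, found);
            Console.WriteLine("maxDepth {0}: {1} file(s)", depth, found.Count);
        }
        // prints 0, 1, 2 and 3 file(s) for maxDepth 0, 1, 2 and 3

        Directory.Delete(root, true);
    }
}
```

With this convention maxDepth counts directory levels including the starting one, so 1 means "only the starting directory" and 0 lists nothing at all.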
DasDave
  • Rather than passing `depth` and `maxDepth` arguments, simply pass `maxDepth`, decrement by 1 on each recursion and check it is > 0 – Tim Rogers Jul 06 '15 at 11:54
  • Changed to use @TimRogers suggestion – DasDave Jul 06 '15 at 11:56
  • Hi guys, thanks for your response. I've tried this out but it doesnt seem to be working. I am looking into it - will post back soon. - THANKS! – DaveyD Jul 06 '15 at 13:55
  • Ok, sorry guys - it works perfectly. I always get confused when it comes to recursion... but now its pretty simple! Thanks a lot! – DaveyD Jul 06 '15 at 17:46

Let's start by refactoring the code a little to make it easier to understand.

So, the key exercise here is to recursively return all of the files that match the patterns required, but only to a certain depth. Let's get those files first.

public static IEnumerable<FileInfo> GetFullDirList(
    DirectoryInfo dir, string searchPattern, int depth)
{
    foreach (FileInfo file in dir.GetFiles(searchPattern))
    {
        yield return file;
    }

    if (depth > 0)
    {
        foreach (DirectoryInfo d in dir.GetDirectories())
        {
            foreach (FileInfo f in GetFullDirList(d, searchPattern, depth - 1))
            {
                yield return f;
            }
        }
    }
}

This simply isolates the job of recursing through the directories to find your files.
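As a rough illustration of how the `depth` argument behaves here, the sketch below runs the iterator against a small throwaway tree. The `IteratorDemo` class, the tree layout and the `*.txt` pattern are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class IteratorDemo
{
    // The iterator from above, repeated so that this sketch compiles on its own.
    public static IEnumerable<FileInfo> GetFullDirList(
        DirectoryInfo dir, string searchPattern, int depth)
    {
        foreach (FileInfo file in dir.GetFiles(searchPattern))
        {
            yield return file;
        }

        if (depth > 0)
        {
            foreach (DirectoryInfo d in dir.GetDirectories())
            {
                foreach (FileInfo f in GetFullDirList(d, searchPattern, depth - 1))
                {
                    yield return f;
                }
            }
        }
    }

    public static void Main()
    {
        // Hypothetical throwaway tree: root/a.txt, root/sub/b.txt, root/sub/deep/c.txt
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(Path.Combine(root, "sub", "deep"));
        File.WriteAllText(Path.Combine(root, "a.txt"), "");
        File.WriteAllText(Path.Combine(root, "sub", "b.txt"), "");
        File.WriteAllText(Path.Combine(root, "sub", "deep", "c.txt"), "");

        for (int depth = 0; depth <= 2; depth++)
        {
            int count = GetFullDirList(new DirectoryInfo(root), "*.txt", depth).Count();
            Console.WriteLine("depth {0}: {1} file(s)", depth, count);
        }
        // prints 1, 2 and 3 file(s) for depth 0, 1 and 2

        Directory.Delete(root, true);
    }
}
```

Here `depth` counts how many levels of subdirectories to descend into, so 0 still returns the files of the starting directory. Note also that the iterator has no try/catch, so an inaccessible directory throws while the sequence is being enumerated.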

But you'll notice that it didn't exclude files based on the excludeFolders parameter. Let's tackle that now. Let's start building FullDirList.

The first line would be

    var results =
        from fi in GetFullDirList(dir, searchPattern, depth)
        where String.IsNullOrEmpty(excludeFolders)
            || !Regex.IsMatch(fi.FullName, excludeFolders, RegexOptions.IgnoreCase)
        group fi.FullName by fi.Directory.FullName;

This goes and gets all of the files, restricts them against excludeFolders and then groups all the files by the folders they belong to. We do this so that we can do this next:

    var directoriesFound = results.Count();
    var filesFound = results.SelectMany(fi => fi).Count();

Now I noticed that you were counting MasterFileCounter & MasterFolderCounter.

You could easily write:

    MasterFolderCounter += results.Count();
    MasterFileCounter += results.SelectMany(fi => fi).Count();
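The filter, grouping and counting steps can be tried without touching the disk. The following is a sketch only: the `GroupByFolder` helper and the sample paths are hypothetical, plain strings stand in for `FileInfo.FullName`, and `Path.GetDirectoryName` stands in for `fi.Directory.FullName`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class GroupingDemo
{
    // Filters the paths with the exclude pattern, then groups the survivors by folder.
    // ToList() runs the query once, so counting afterwards does not re-evaluate it.
    public static List<IGrouping<string, string>> GroupByFolder(
        IEnumerable<string> paths, string excludeFolders)
    {
        return (from p in paths
                where String.IsNullOrEmpty(excludeFolders)
                    || !Regex.IsMatch(p, excludeFolders, RegexOptions.IgnoreCase)
                group p by Path.GetDirectoryName(p)).ToList();
    }

    public static void Main()
    {
        // Hypothetical sample paths.
        string[] paths =
        {
            "root/a.txt",
            "root/sub/b.txt",
            "root/sub/c.txt",
            "root/obj/d.txt",
        };

        var results = GroupByFolder(paths, "obj");

        Console.WriteLine("folders: {0}", results.Count);                        // folders: 2
        Console.WriteLine("files: {0}", results.SelectMany(g => g).Count());     // files: 3
    }
}
```

Two things are worth keeping in mind. A folder only shows up as a group when at least one of its files survives the filter, so the folder count can be lower than the number of directories visited. And when `results` is a deferred LINQ query over the iterator, each `Count()` walks the file system again; calling `ToList()` once, as in the sketch, avoids that.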

Now, to write out these file names, it appears you are trying to split them across several output files while capping each output file at a maximum length (maxSz).

Here's how to do that:

    var aggregateByLength =
        results
            .SelectMany(fi => fi)
            .Aggregate(new [] { new StringBuilder() }.ToList(),
                (sbs, s) =>
                {
                    var nl = s + Environment.NewLine;
                    if (maxSz > 0 && sbs.Last().Length + nl.Length > maxSz)
                    {
                        sbs.Add(new StringBuilder(nl));
                    }
                    else
                    {
                        sbs.Last().Append(nl);
                    }
                    return sbs;
                });
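To make the chunking step easier to follow, here is a small sketch that runs the same `Aggregate` pattern over in-memory strings. The `Chunk` helper and the sample names are hypothetical, and treating `maxSz <= 0` as "no limit" is an assumption carried over from the `maxSz > 0` check in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class ChunkDemo
{
    // Packs lines into StringBuilders so that no builder grows past maxSz characters.
    // maxSz <= 0 is treated as "no limit", mirroring the question's maxSz > 0 check.
    public static List<StringBuilder> Chunk(IEnumerable<string> lines, int maxSz)
    {
        return lines.Aggregate(
            new List<StringBuilder> { new StringBuilder() },
            (sbs, s) =>
            {
                var nl = s + Environment.NewLine;
                if (maxSz > 0 && sbs.Last().Length + nl.Length > maxSz)
                {
                    sbs.Add(new StringBuilder(nl)); // current chunk is full: start a new one
                }
                else
                {
                    sbs.Last().Append(nl);
                }
                return sbs;
            });
    }

    public static void Main()
    {
        string[] names = { "aaaa", "bbbb", "cccc", "dddd" };

        // Room for exactly two lines per chunk, whatever the platform's newline length is.
        int maxSz = 2 * (4 + Environment.NewLine.Length);

        List<StringBuilder> chunks = Chunk(names, maxSz);
        Console.WriteLine("chunks: {0}", chunks.Count); // chunks: 2
    }
}
```

One caveat with this pattern: the seed is an empty builder, so if the very first line is already longer than `maxSz`, the first chunk stays empty and would be written out as an empty file.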

Writing the files now becomes extremely simple:

    foreach (var sb in aggregateByLength)
    {
        File.WriteAllText(nextOutPutFile(), sb.ToString());
    }

So, the full thing becomes:

static void FullDirList(
    DirectoryInfo dir, string searchPattern, string excludeFolders, int maxSz, int depth)
{
    var results =
        from fi in GetFullDirList(dir, searchPattern, depth)
        where String.IsNullOrEmpty(excludeFolders)
            || !Regex.IsMatch(fi.FullName, excludeFolders, RegexOptions.IgnoreCase)
        group fi.FullName by fi.Directory.FullName;

    MasterFolderCounter += results.Count();
    MasterFileCounter += results.SelectMany(fi => fi).Count();

    var aggregateByLength =
        results
            .SelectMany(fi => fi)
            .Aggregate(new [] { new StringBuilder() }.ToList(),
                (sbs, s) =>
                {
                    var nl = s + Environment.NewLine;
                    if (maxSz > 0 && sbs.Last().Length + nl.Length > maxSz)
                    {
                        sbs.Add(new StringBuilder(nl));
                    }
                    else
                    {
                        sbs.Last().Append(nl);
                    }
                    return sbs;
                });

    foreach (var sb in aggregateByLength)
    {
        File.WriteAllText(nextOutPutFile(), sb.ToString());
    }
}
Enigmativity
  • Wow! what language is this!! ☺ - Thanks for the work and the detailed explanation. I am going to try to learn this and understand what you wrote. - Is this faster/slower than the previous method? – DaveyD Jul 06 '15 at 17:49
  • @DaveyD - Unless you have hundreds of thousands of files you probably wouldn't spot a speed difference. My code should be more maintainable as it splits out the recursive code and linearizes the iterative code. Oh, yes, it is C#. – Enigmativity Jul 07 '15 at 00:39
  • Ok, thanks - talking more like 2-3000. I do like the setup of your code, its very nice. I am still trying to figure out what you wrote! This looks somewhat like sql.... – DaveyD Jul 07 '15 at 18:02
  • @DaveyD - The `results` part is using LINQ, which is like SQL. Have you used LINQ before? – Enigmativity Jul 07 '15 at 22:23
  • I understand. No, I've never used linq before. Seems very interesting powerful and useful!!. I dont know much sql either. I do know a little to get around, but far from comfortable. – DaveyD Jul 08 '15 at 04:56