20

I need to calculate the size of hundreds of folders; some will be 10MB, some maybe 10GB. I need a super fast way of getting the size of each folder using C#.

My end result will hopefully be:

Folder1 10.5GB

Folder2 230MB

Folder3 1.2GB

...

John Saunders
LeeW
  • There's no way to do this in C#. C# has no features for accessing the file system. You're going to have to use the .NET Framework or the Win32 API. – John Saunders May 19 '10 at 21:32
  • Hi John, How would I do this using .Net or the Win32 API? Any idea which is quickest or are both slow? – LeeW May 19 '10 at 21:36
  • @john: I think "C# has no features for accessing the file system" could be construed incorrectly, mainly because of System.IO. I understand what you mean, but to the passerby it might imply something otherwise. – Ta01 May 19 '10 at 21:50
  • @RandomNoob: I hope the passersby take it as the fact that C# is not the same as the .NET Framework. – John Saunders May 19 '10 at 22:01
  • @John Saunders: that is a particularly pedantic point. The OP already tagged the question 'filesystem' and '.net'. If I were a VB.net programmer, I'd likely phrase the question in terms of VB.net in order to solicit answers written in VB.net rather than C#. – JBRWilkinson May 19 '10 at 22:11
  • @JBR: I'm aware it's pedantic - If VB.NET were in the title, I'd remove it as well, and make sure it was in the tags. A particular annoyance was that in the title, it said "Using C# what is the fastest way", which I just now realized is not an instance of the problem I was trying to solve. This is not one of those posts that say, "C# Regex" or "VB.NET Threading", and I should have fixed the title without the pedantic comment (which I now wish I could delete). – John Saunders May 19 '10 at 22:24
  • it's a shame that the Windows filesystem doesn't cheat like MacOS does, which internally stores the size of the directory as a filesystem object against each directory and then updates that value for you as files/directories are added/removed. Then there is no calculation required, just read the value... but I digress :-( – Paul Farry May 19 '10 at 22:45
  • @Paul: how does the Mac handle clusters, or remote folders, whose size may change out from under it? – John Saunders May 20 '10 at 02:46
  • @John: it's been quite some time since I wrote apps on MacOS. SAN, Cluster etc, didn't need to be considered, it was either Local or Mount off Server. Without knowing for sure if this feature even still exists, you'd expect the server manages that as filesystems move etc. – Paul Farry May 20 '10 at 02:53
  • @Paul: nice ring. However, based on my experience with emulating Mac file systems on foreign platforms, I'd guess the answer is that this value is faked, and should not be depended upon. I had that issue when the foreign file system couldn't know some quantity until the file was read - yet you had to lie to the Mac to say that you knew. – John Saunders May 20 '10 at 03:46
  • @LeeW, please check my answer given below. I know it's really late, but I found some time yesterday/today to solve your problem. I have edited it into my original post. – BoltBait Sep 27 '12 at 18:12

9 Answers

37

Add a reference to the Microsoft Scripting Runtime and use:

Scripting.FileSystemObject fso = new Scripting.FileSystemObject();
Scripting.Folder folder = fso.GetFolder([folder path]);
Int64 dirSize = (Int64)folder.Size;

If you just need the size, this is much faster than recursing.
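If you'd rather not add the compile-time COM reference, a late-bound sketch along these lines should behave the same (this is an assumption-laden illustration: it requires Windows with the Scripting runtime registered, takes the folder path as `args[0]`, and uses `dynamic`, which needs the Microsoft.CSharp assembly):

```csharp
using System;
using System.Runtime.InteropServices;

class FsoSize
{
    static void Main(string[] args)
    {
        // Late-bound COM activation; no compile-time reference to the
        // Microsoft Scripting Runtime is needed.
        Type fsoType = Type.GetTypeFromProgID("Scripting.FileSystemObject");
        dynamic fso = Activator.CreateInstance(fsoType);
        dynamic folder = fso.GetFolder(args[0]);

        // Folder.Size comes back as a COM variant; convert explicitly.
        long dirSize = Convert.ToInt64(folder.Size);
        Console.WriteLine(dirSize);

        // Release the COM objects, per the comment below.
        Marshal.ReleaseComObject(folder);
        Marshal.ReleaseComObject(fso);
    }
}
```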

Dave James
  • Example of how MUCH faster for a directory with 900K files and a size of 9.5 GB: this method: 250ms, recursive method: ~15 seconds. – JRadness Apr 03 '13 at 19:12
  • D*nm! Did this, it is about 90% faster! Absolutely amazing! If I could vote up more than once I would, as this is an amazing thing. Any comments on whether there's a catch with this COM reference? – Akku Jan 02 '14 at 15:33
  • I am not aware about "Microsoft Scripting Runtime". I have searched google but didn't find any satisfactory answer. Could you please assist me how can I add reference to "Microsoft Scripting Runtime"? I am using Visual Studio 2013 and Framework 4.5. – Banketeshvar Narayan Jan 16 '15 at 06:46
  • You can also add the reference by going to Add Reference > COM > Microsoft Scripting Runtime – Ricketts Mar 21 '15 at 01:29
  • Size is `dynamic {double}` in the Locals window `var size = new Scripting.FileSystemObject().GetFolder(path).Size;` – Slai Jul 13 '16 at 17:58
  • please add System.Runtime.InteropServices.Marshal.ReleaseComObject(folder); System.Runtime.InteropServices.Marshal.ReleaseComObject(fso); – Bernhard Apr 25 '18 at 13:00
  • although the size is dynamic, it internally uses "double" – Bernhard Apr 25 '18 at 14:02
  • I'm having trouble with this when passing a drive letter; e.g. C: or D:. Anyone else? – windowsgm Sep 02 '19 at 16:06
  • and now uses internally object{int} ? – Bernhard Sep 05 '19 at 12:34
  • Does anyone know what API FileSystemObject.Size is using to get/calculate the size? – Jack Ukleja Jun 02 '21 at 16:56
13

OK, this is terrible, but...

Use a recursive dos batch file called dirsize.bat:

@ECHO OFF
IF %1x==x GOTO start
IF %1x==DODIRx GOTO dodir
SET CURDIR=%1
FOR /F "usebackq delims=" %%A IN (`%0 DODIR`) DO SET ANSWER=%%A %CURDIR%
ECHO %ANSWER%
GOTO end
:start
FOR /D %%D IN (*.*) DO CALL %0 "%%D"
GOTO end
:dodir
DIR /S/-C %CURDIR% | FIND "File(s)"
GOTO end
:end

Note: There should be a tab character after the final "%%A" on line 5, not spaces.

This is the data you're looking for. It will do thousands of files fairly quickly. In fact, it does my entire harddrive in less than 2 seconds.

Execute the file like this dirsize | sort /R /+25 in order to see the largest directory listed first.
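If you want to drive this from the C# side instead of a console, a rough sketch of shelling out to the batch file and capturing its output (Windows only; it assumes `dirsize.bat` sits in the working directory):

```csharp
using System;
using System.Diagnostics;

class RunDirSize
{
    static void Main()
    {
        // Run the batch file through cmd.exe and capture stdout.
        var psi = new ProcessStartInfo("cmd.exe", "/c dirsize.bat")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process p = Process.Start(psi))
        {
            string output = p.StandardOutput.ReadToEnd();
            p.WaitForExit();
            Console.Write(output);  // one "<bytes> <folder>" line per directory
        }
    }
}
```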

Good luck.

BoltBait
  • Doesn't seem to work for me... when run against my harddrive (`cd C:/`, then `dirsize.bat`) it takes around a minute and echoes "ECHO is disabled" 9 times. (about the time it takes, it's not an SSD but a hybrid, and I have 505k files and 233k folders) – Camilo Martin Jun 14 '14 at 22:47
1

The fastest approach I could find on the 4.0-4.5 framework to calculate file sizes and counts on disk was:

using System.IO;
using System.Threading;
using System.Threading.Tasks;

class FileCounter
{
  private readonly int _clusterSize;
  private long _filesCount;
  private long _size;
  private long _diskSize;

  public void Count(string rootPath)
  {
    // Enumerate files (without real execution of course)
    var filesEnumerated = new DirectoryInfo(rootPath)
                              .EnumerateFiles("*", SearchOption.AllDirectories);
    // Do in parallel
    Parallel.ForEach(filesEnumerated, GetFileSize);
  }

  /// <summary>
  /// Get real file size and add to total
  /// </summary>
  /// <param name="fileInfo">File information</param>
  private void GetFileSize(FileInfo fileInfo)
  {
    Interlocked.Increment(ref _filesCount);
    Interlocked.Add(ref _size, fileInfo.Length);
  }
}

var fcount = new FileCounter();
fcount.Count("F:\\temp");

This approach was the best I could find on the .NET platform. By the way, if you also need the cluster size and the real size on disk, you can do the following:

using System.Runtime.InteropServices;

private long WrapToClusterSize(long originalSize)
{
    return ((originalSize + _clusterSize - 1) / _clusterSize) * _clusterSize;
}

private static int GetClusterSize(string rootPath)
{
    int sectorsPerCluster = 0, bytesPerSector = 0, numFreeClusters = 0, totalNumClusters = 0;
    if (!GetDiskFreeSpace(rootPath, ref sectorsPerCluster, ref bytesPerSector,
                          ref numFreeClusters, ref totalNumClusters))
    {
        // Satisfies rule CallGetLastErrorImmediatelyAfterPInvoke.
        // see http://msdn.microsoft.com/en-us/library/ms182199(v=vs.80).aspx
        var lastError = Marshal.GetLastWin32Error();
        throw new Exception(string.Format("Error code {0}", lastError));
    }
    return sectorsPerCluster * bytesPerSector;
}

[DllImport("kernel32.dll", SetLastError = true)]
private static extern bool GetDiskFreeSpace(
    string rootPath,
    ref int sectorsPerCluster,
    ref int bytesPerSector,
    ref int numFreeClusters,
    ref int totalNumClusters);

And of course you need to rewrite GetFileSize() in the first code section:

private long _diskSize;

private void GetFileSize(FileInfo fileInfo)
{
    Interlocked.Increment(ref _filesCount);
    Interlocked.Add(ref _size, fileInfo.Length);
    Interlocked.Add(ref _diskSize, WrapToClusterSize(fileInfo.Length));
}
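As a sanity check on the rounding arithmetic, here is a tiny standalone version of `WrapToClusterSize` with the cluster size passed in explicitly (4096 here is just an assumed value; the real one should come from `GetClusterSize`):

```csharp
using System;

public class ClusterDemo
{
    // Round a byte count up to the next multiple of the cluster size:
    // adding (clusterSize - 1) before the integer division forces any
    // partial cluster to count as a whole one.
    public static long WrapToClusterSize(long originalSize, long clusterSize)
    {
        return ((originalSize + clusterSize - 1) / clusterSize) * clusterSize;
    }

    static void Main()
    {
        Console.WriteLine(WrapToClusterSize(1, 4096));     // 4096
        Console.WriteLine(WrapToClusterSize(4096, 4096));  // 4096
        Console.WriteLine(WrapToClusterSize(4097, 4096));  // 8192
    }
}
```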
framerelay
1

There is no simple way to do this in .Net; you will have to loop through every file and subdir. See the examples here to see how it's done.
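A minimal sketch of that loop, using `EnumerateFiles` so the file list is streamed rather than materialized up front (`DirSizeDemo`/`GetDirectorySize` are illustrative names, not library API):

```csharp
using System;
using System.IO;
using System.Linq;

public class DirSizeDemo
{
    // Sum the lengths of every file in the tree rooted at path.
    // EnumerateFiles yields entries lazily, so very large trees
    // don't need a huge array built first.
    public static long GetDirectorySize(string path)
    {
        return new DirectoryInfo(path)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Sum(f => f.Length);
    }

    static void Main(string[] args)
    {
        string path = args.Length > 0 ? args[0] : ".";
        Console.WriteLine(GetDirectorySize(path));
    }
}
```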

Jouke van der Maas
  • It seems slow in Windows too so maybe there isn't a fast method, calculating the size for 100s or 1000s of large folders may not be feasible then :-( – LeeW May 19 '10 at 21:34
  • After reading all the comments I've decided against doing this; it was to be a nice-to-have feature but the overhead is too great. Thanks all. – LeeW May 19 '10 at 22:22
1

You can do something like this, but there's no fast=true setting when it comes to getting folder sizes; you have to add up the file sizes.

    private static IDictionary<string, long> folderSizes;

    public static long GetDirectorySize(string dirName)
    {
        // use memoization to keep from doing unnecessary work
        if (folderSizes.ContainsKey(dirName))
        {
            return folderSizes[dirName];
        }

        string[] a = Directory.GetFiles(dirName, "*.*");

        long b = 0;
        foreach (string name in a)
        {
            FileInfo info = new FileInfo(name);
            b += info.Length;
        }

        // recurse on all the directories in current directory
        foreach (string d in Directory.GetDirectories(dirName))
        {
            b += GetDirectorySize(d);
        }

        folderSizes[dirName] = b;
        return b;
    }

    static void Main(string[] args)
    {
        folderSizes = new Dictionary<string, long>();
        GetDirectorySize(@"c:\StartingFolder");
        foreach (string key in folderSizes.Keys)
        {
            Console.WriteLine("dirName = " + key + " dirSize = " + folderSizes[key]);
        }

        // now folderSizes will contain a key for each directory (starting
        // at c:\StartingFolder and including all subdirectories), and
        // the dictionary value will be the folder size
    }
dcp
  • Where is the initial call to GetDirectorySize()? Without this, the code does nothing as folderSizes is empty. – JBRWilkinson May 19 '10 at 22:06
  • Also, folderSizes will also contain all subdirectories whereas it seems the OP just wants the sizes of the top level. – JBRWilkinson May 19 '10 at 22:07
  • @JBRWilkinson - Yep, in one of my edits I accidentally took out that initial call. Thanks for pointing it out. The dictionary will contain all the results, but the OP can use the ones he/she needs. – dcp May 19 '10 at 22:33
1

If you right-click a large directory and choose Properties, you can see that it takes a significant amount of time to calculate the size... I don't think we can beat MS at this. One thing you could do is index the sizes of directories/subdirectories if you are going to calculate them over and over again; that would significantly increase the speed.

You could use something like this to calculate directory size in C# recursively

static long DirSize(DirectoryInfo directory)
{
    long size = 0;

    FileInfo[] files = directory.GetFiles();
    foreach (FileInfo file in files)
    {
        size += file.Length;
    }

    DirectoryInfo[] dirs = directory.GetDirectories();

    foreach (DirectoryInfo dir in dirs)
    {
        size += DirSize(dir);
    }

    return size;
}
John Saunders
m0s
1

Dot Net Pearls has a method similar to the ones described here. It's surprising that the System.IO.DirectoryInfo class doesn't have a method to do this, since it seems like a common need, and it would probably be faster without a native/managed transition on each file system object. If speed is the key thing, I think it would be worth writing an unmanaged component to do this calculation and calling it once per directory from managed code.
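For a rough idea of what such a lower-level approach might look like from C# itself, here is a hedged sketch that walks directories via P/Invoke with `FindFirstFile`/`FindNextFile`, so each entry's size arrives in the find data without constructing a `FileInfo` per file (Windows only; `FastDirSize` is an illustrative name, and symlinks/junctions are not handled):

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

static class FastDirSize
{
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    private struct WIN32_FIND_DATA
    {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA data);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
    private static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA data);

    [DllImport("kernel32.dll")]
    private static extern bool FindClose(IntPtr hFindFile);

    private static readonly IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);
    private const uint FILE_ATTRIBUTE_DIRECTORY = 0x10;

    public static long GetSize(string dir)
    {
        long size = 0;
        WIN32_FIND_DATA data;
        IntPtr handle = FindFirstFile(Path.Combine(dir, "*"), out data);
        if (handle == INVALID_HANDLE_VALUE)
            return 0;  // inaccessible or empty; ignore for this sketch
        try
        {
            do
            {
                if (data.cFileName == "." || data.cFileName == "..")
                    continue;
                if ((data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0)
                    size += GetSize(Path.Combine(dir, data.cFileName));
                else
                    // 64-bit size is split across two 32-bit fields.
                    size += ((long)data.nFileSizeHigh << 32) | data.nFileSizeLow;
            } while (FindNextFile(handle, out data));
        }
        finally
        {
            FindClose(handle);
        }
        return size;
    }
}
```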

Mike Kelly
0

There are some leads in this link (though it's in Python) from a person running into similar performance issues. You can try calling down into the Win32 API to see if performance improves, but in the end you're going to run into the same issue: a task can only be done so quickly, and if you have to do it many times, it will take a lot of time. Can you give more detail on what you're doing this for? It might help folks come up with a heuristic or some cheats to help you. If you're doing this calculation a lot, are you caching the results?

Tom
-1

I'm quite sure that this will be slow as hell, but I'd write it like this:

using System.IO;
using System.Linq;

long GetDirSize(string dir) {
   return new DirectoryInfo(dir)
      .GetFiles("*", SearchOption.AllDirectories)
      .Sum(p => p.Length);
}
santa