7

I am not sure why we would use Directory.GetFiles when Directory.EnumerateFiles can do the same thing and also lets you start enumerating the results before the whole list of files has been returned.

What is the difference between Directory.EnumerateFiles and Directory.GetFiles?

Now that EnumerateFiles is available, why would there still be a need to use GetFiles?

janw
SamIAm
  • 2
    `GetFiles` is there for backward-compatibility. It was in the original 1.0 .NET framework. `EnumerateFiles` was introduced much later - 4.0 or 4.5 if I recall correctly. – Enigmativity Jul 04 '14 at 04:25
  • 2
    @Enigmativity `GetFiles` was introduced in 2.0 and `EnumerateFiles` in 4.0. – Despertar Jul 04 '14 at 05:16

3 Answers

10

According to http://msdn.microsoft.com/en-us/library/07wt70x2%28v=vs.110%29.aspx:

The EnumerateFiles and GetFiles methods differ as follows: When you use EnumerateFiles, you can start enumerating the collection of names before the whole collection is returned; when you use GetFiles, you must wait for the whole array of names to be returned before you can access the array. Therefore, when you are working with many files and directories, EnumerateFiles can be more efficient.

I guess GetFiles could be considered a convenience function: it hands you a complete array when you do not need to stream the results.
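To make the quoted difference concrete, here is a minimal sketch. The class name `ListingDemo`, its helper methods, and the scratch directory are assumptions made up for this illustration; only `Directory.GetFiles` and `Directory.EnumerateFiles` come from the discussion above.

```csharp
using System;
using System.IO;

static class ListingDemo
{
    // Creates a scratch directory (hypothetical, for this demo only) holding `count` small files.
    public static string CreateScratchDirectory(int count)
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        for (int i = 0; i < count; i++)
            File.WriteAllText(Path.Combine(dir, $"file{i}.txt"), "demo");
        return dir;
    }

    // Eager: the full string[] is built before the call returns.
    public static int CountWithGetFiles(string dir)
    {
        string[] all = Directory.GetFiles(dir, "*.txt");
        return all.Length;
    }

    // Lazy: names are produced while we iterate, so we can stop after the first one.
    public static string FirstWithEnumerateFiles(string dir)
    {
        foreach (string path in Directory.EnumerateFiles(dir, "*.txt"))
            return path;
        return null; // no matching file
    }

    static void Main()
    {
        string dir = CreateScratchDirectory(5);
        try
        {
            Console.WriteLine($"GetFiles returned {CountWithGetFiles(dir)} names at once");
            Console.WriteLine($"First name streamed: {Path.GetFileName(FirstWithEnumerateFiles(dir))}");
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

With only five files the difference is not measurable; the early exit matters when the directory holds many thousands of entries.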

AlexD
  • 1
    @GrantWinney According to MSDN, it does not sound like GetFiles is deprecated. It looks more like a convenience function, although I see no explicit statement about it. – AlexD Jul 04 '14 at 02:59
  • So, essentially, `GetFiles()` is equivalent to `EnumerateFiles().ToList()`. – dwerner Jul 04 '14 at 03:00
  • 3
    @dwerner: Actually, looking at the [reference code](http://referencesource.microsoft.com/mscorlib/R/0e5cd32f1daea6e5.html), `GetFiles()` eventually calls the equivalent of `EnumerateFiles` (if I'm reading correctly), then calls: `new List<string>(fileIterator).ToArray()`. If I understand this correctly, this produces _two_ O(n) operations, whereas `EnumerateFiles().ToList()` would only produce _one_. – Chris Sinclair Jul 04 '14 at 03:31
  • @Chris Sinclair - Ouch, even worse. – dwerner Jul 04 '14 at 04:30
  • One point of attention, though. If you are doing file operations in the folder (e.g. renaming files) while enumerating it, the same file might be processed multiple times, as it gets "caught" again while the enumerator is traversing the directory. – luizs81 Dec 04 '19 at 02:41
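One way to sidestep the renaming caveat from the last comment is to take a snapshot of the names before touching anything. The sketch below is only an illustration: `RenameDemo`, `AppendBakExtension`, and the `.bak` suffix are made-up names, and it assumes the directory is not being modified by anyone else at the same time.

```csharp
using System;
using System.IO;

static class RenameDemo
{
    // Appends ".bak" to every file in `dir` exactly once and returns how many were renamed.
    public static int AppendBakExtension(string dir)
    {
        // Snapshot the names first, so a renamed file cannot be picked up again mid-loop.
        string[] snapshot = Directory.GetFiles(dir);
        foreach (string path in snapshot)
            File.Move(path, path + ".bak");
        return snapshot.Length;
    }

    static void Main()
    {
        // Hypothetical scratch directory created only for this demo.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        for (int i = 0; i < 3; i++)
            File.WriteAllText(Path.Combine(dir, $"file{i}.txt"), "demo");
        try
        {
            Console.WriteLine($"Renamed {AppendBakExtension(dir)} files");
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

The trade-off is the one discussed throughout this page: the snapshot costs memory proportional to the number of files, in exchange for a stable list to iterate over.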
10

You use the right tool for the right job. If you have a small number of files, then you probably want to just use Directory.GetFiles() and load them into memory.

Directory.EnumerateFiles(), on the other hand, is for when you need to conserve memory by enumerating the files one at a time.

This is similar to the argument of whether to make everything lazy, or just use it when appropriate. I think this helps put things into perspective because making everything lazy for no good reason is overkill.

You have a trade-off between simplicity and efficiency. Conceptually, an enumerator that doesn't actually contain the files, but yields them one at a time, is more complex than a string array. Being able to store that enumerator in an IEnumerable<string> is a wonderful abstraction, but the point remains.

It may also be more difficult to predict performance. With GetFiles() you know the cost of listing the directory is incurred when the method is called. EnumerateFiles() defers that listing work until the result is enumerated, and repeats it every time the result is enumerated again.
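A small sketch of that timing difference follows. `DeferredListingDemo` and `Run` are names invented for this example, and `dir` is assumed to be an existing directory the demo may write to. Exactly when the runtime opens the directory handle is an implementation detail; what the sketch relies on is only that each full enumeration lists the directory afresh.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class DeferredListingDemo
{
    // `dir` is assumed to be an existing directory that this demo is allowed to write to.
    public static (int Snapshot, int FirstPass, int SecondPass) Run(string dir)
    {
        // Eager: the listing cost is paid here, once, and the array never changes.
        string[] snapshot = Directory.GetFiles(dir);

        // Lazy: the listing work happens each time the sequence is enumerated.
        IEnumerable<string> lazy = Directory.EnumerateFiles(dir);
        int firstPass = lazy.Count();

        // Add a file, then enumerate the same IEnumerable a second time.
        File.WriteAllText(Path.Combine(dir, "added-later.txt"), "demo");
        int secondPass = lazy.Count();

        return (snapshot.Length, firstPass, secondPass);
    }

    static void Main()
    {
        // Hypothetical scratch directory created only for this demo.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        File.WriteAllText(Path.Combine(dir, "a.txt"), "demo");
        File.WriteAllText(Path.Combine(dir, "b.txt"), "demo");
        try
        {
            var result = Run(dir);
            Console.WriteLine($"snapshot={result.Snapshot}, first pass={result.FirstPass}, second pass={result.SecondPass}");
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

The array keeps its original length, while the second pass over the lazy sequence is expected to include the file added in between. If the result is needed more than once, materializing it (for example with `ToList()`) avoids paying for the directory scan repeatedly.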

Intention is important. If you are using EnumerateFiles() on a directory that you know won't have that many files, then you're just using it blindly without understanding. The computer the code runs on matters, too. If I am running something on a server with 24 GB of RAM then I may be more likely to use memory-hungry methods like GetFiles() than if I was on a server with 1 GB of RAM.

If you have no idea how many files there will be, or are unsure what kind of environment the code will run in, then it would be wise to err on the side of caution and use the memory-efficient version.
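As an illustration of the cautious choice, the sketch below totals file sizes under a directory tree of unknown size without ever holding the full list of names. `StreamingTotals` and `TotalBytes` are hypothetical names, and the code assumes every subdirectory is readable; an inaccessible one would throw during enumeration.

```csharp
using System;
using System.IO;
using System.Linq;

static class StreamingTotals
{
    // Sums file sizes under `root`, visiting one path at a time.
    // Assumes every subdirectory is readable by the current user.
    public static long TotalBytes(string root)
    {
        return Directory
            .EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .Sum(path => new FileInfo(path).Length);
    }

    static void Main(string[] args)
    {
        // Uses the first argument as the root, or the current directory if none is given.
        string root = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        Console.WriteLine($"{TotalBytes(root)} bytes under {root}");
    }
}
```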

There are many variables to consider and a good programmer takes these things into consideration and does things purposefully.

Community
Despertar
4

Both methods ultimately make a call to System.IO.FileSystemEnumerableFactory.CreateFileNameIterator(). The only difference seems to be that the GetFiles() result gets wrapped in a List<string> and then turned into an array. So, as mentioned in some other answers, the difference is that GetFiles() has already traversed the enumerator by the time it returns, whereas EnumerateFiles() gives you the enumerator before traversing it.

For reference I was able to find this using ILSpy (http://ilspy.net/)
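Based on that description, the relationship can be approximated in user code as below. This is a sketch of the shape only, not the framework's actual source: `GetFilesSketch` and `GetFilesViaEnumerate` are made-up names, and the public `Directory.EnumerateFiles` stands in for the internal iterator factory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class GetFilesSketch
{
    // Approximates GetFiles(): drain the lazy sequence into a list, then copy it to an array.
    public static string[] GetFilesViaEnumerate(string path)
    {
        return new List<string>(Directory.EnumerateFiles(path)).ToArray();
    }

    static void Main(string[] args)
    {
        // Uses the first argument as the directory, or the current directory if none is given.
        string dir = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        string[] viaEnumerate = GetFilesViaEnumerate(dir);
        string[] viaGetFiles = Directory.GetFiles(dir);
        Console.WriteLine($"sketch: {viaEnumerate.Length} names, GetFiles: {viaGetFiles.Length} names");
    }
}
```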

bingles
  • 2
    Actually, for many classes in the BCL, you can see the source code via the online [reference source](http://referencesource.microsoft.com/#mscorlib/system/io/directory.cs); no need to ILSpy here. – Chris Sinclair Jul 04 '14 at 03:36