4

I can count all the files in a folder and its sub-folders; the folders themselves are not counted.

(gci -Path *Fill_in_path_here* -Recurse -File | where Name -like "*STB*").Count

However, PowerShell is too slow for the number of files (up to 700k). I read that cmd is faster at executing this kind of task.

Unfortunately I have no knowledge of cmd code at all. In the example above I am counting all the files with STB in the file name.

That is what I would like to do in cmd as well.

Any help is appreciated.

mklement0
InPanic

6 Answers

9

Theo's helpful answer based on direct use of .NET ([System.IO.Directory]::EnumerateFiles()) is the fastest option (in my tests; YMMV - see the benchmark code below[1]).

Its limitations in the .NET Framework (FullCLR) - on which Windows PowerShell is built - are:

  • An exception is thrown when an inaccessible directory is encountered (due to lack of permissions). You can catch the exception, but you cannot continue the enumeration; that is, you cannot robustly enumerate all items that you can access while ignoring those that you cannot.

  • Hidden items are invariably included.

  • With recursive enumeration, symlinks / junctions to directories are invariably followed.

By contrast, the cross-platform .NET Core framework, since v2.1 - on which PowerShell Core is built - offers ways around these limitations, via the EnumerationOptions options - see this answer for an example.
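As a rough sketch of what such an EnumerationOptions-based call might look like (PowerShell Core only, since [System.IO.EnumerationOptions] is not available in the .NET Framework): the $somePath value below is a placeholder, and skipping ReparsePoint is one assumed way to avoid symlinks / junctions; note that it also skips files that are themselves reparse points.

```powershell
# Placeholder (assumption): replace with the directory you want to search.
$somePath = (Get-Location).Path

$opts = [System.IO.EnumerationOptions]::new()
$opts.RecurseSubdirectories = $true   # recurse, like 'AllDirectories'
$opts.IgnoreInaccessible    = $true   # skip inaccessible directories instead of throwing
# Skip hidden and system items, plus reparse points (symlinks / junctions).
$opts.AttributesToSkip      = [System.IO.FileAttributes] 'Hidden, System, ReparsePoint'

# @(...) forces full enumeration of the lazy result so that .Count is available.
$count = @([System.IO.Directory]::EnumerateFiles($somePath, '*STB*', $opts)).Count
$count
```

Setting AttributesToSkip to 0 instead would include hidden and system items and follow reparse points, which is closer to the .NET Framework behavior described above.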

Note that you can also perform enumeration via the related [System.IO.DirectoryInfo] type, which - similar to Get-ChildItem - returns rich objects rather than mere path strings, allowing for much more versatile processing; e.g., to get an array of all file sizes (property .Length, implicitly applied to each file object):

([System.IO.DirectoryInfo] $somePath).EnumerateFiles('*STB*', 'AllDirectories').Length

A native PowerShell solution that addresses these limitations and is still reasonably fast is to use Get-ChildItem with the -Filter parameter.

(Get-ChildItem -LiteralPath $somePath -Filter *STB* -Recurse -File).Count
  • Hidden items are excluded by default; add -Force to include them.

  • To ignore permission problems, add -ErrorAction SilentlyContinue or -ErrorAction Ignore; the advantage of SilentlyContinue is that you can later inspect the $Error collection to determine the specific errors that occurred, so as to ensure that the errors truly only stem from permission problems.

  • In Windows PowerShell, Get-ChildItem -Recurse invariably follows symlinks / junctions to directories, unfortunately; more sensibly, PowerShell Core by default does not, and offers opt-in via -FollowSymlink.

  • Like the [System.IO.DirectoryInfo]-based solution, Get-ChildItem outputs rich objects ([System.IO.FileInfo] / [System.IO.DirectoryInfo]) describing each enumerated file-system item, allowing for versatile processing.
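As an alternative to inspecting the session-wide $Error collection, the common -ErrorVariable parameter can collect just the errors from one call. The following is a minimal sketch under assumptions: $somePath is a placeholder, enumErrors is an arbitrary variable name, and the check relies on permission problems being reported with error category PermissionDenied.

```powershell
# Placeholder (assumption): replace with the directory you want to search.
$somePath = (Get-Location).Path

# Collect the matching files; non-terminating errors are silenced but
# recorded in $enumErrors (note: no $ in the -ErrorVariable argument).
$files = @(Get-ChildItem -LiteralPath $somePath -Filter *STB* -Recurse -File `
             -ErrorAction SilentlyContinue -ErrorVariable enumErrors)

# Anything that is NOT a permission problem deserves a closer look.
$unexpected = @($enumErrors | Where-Object { $_.CategoryInfo.Category -ne 'PermissionDenied' })

"Files found: $($files.Count)"
"Errors: $($enumErrors.Count) total, $($unexpected.Count) not permission-related"
```

If $unexpected is non-empty, the count may be incomplete for reasons other than permissions, so it is worth examining those records before trusting the result.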

Note that while you can also pass wildcard arguments to -Path (the implied first positional parameter) and -Include (as in TobyU's answer), it is only -Filter that provides significant speed improvements, due to filtering at the source (the filesystem driver), so that PowerShell only receives the already-filtered results; by contrast, -Path / -Include must first enumerate everything and match against the wildcard pattern afterwards.[2]

Caveats re -Filter use:

  • Its wildcard language is not the same as PowerShell's; notably, it doesn't support character sets/ranges (e.g. *[0-9]) and it has legacy quirks - see this answer.
  • It only supports a single wildcard pattern, whereas -Include supports multiple (as an array).

That said, -Filter processes wildcards the same way as cmd.exe's dir.
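One way to work around both caveats, sketched here with hypothetical patterns and a placeholder $somePath: let -Filter do the coarse matching at the source and refine the results with PowerShell's own wildcard language, or run one -Filter pass per pattern. Each extra pass traverses the tree again, so the multi-pattern variant may well cost more than it saves.

```powershell
# Placeholder (assumption): replace with the directory you want to search.
$somePath = (Get-Location).Path

# Coarse match via -Filter, then a character-range match that -Filter can't express.
$withDigit = @(Get-ChildItem -LiteralPath $somePath -Filter *STB* -Recurse -File |
                 Where-Object Name -like '*STB[0-9]*')

# Multiple (hypothetical) patterns: one -Filter pass each, de-duplicated by full path.
$patterns = '*STB*', '*.log'
$multi = @($patterns |
             ForEach-Object { Get-ChildItem -LiteralPath $somePath -Filter $_ -Recurse -File } |
             Sort-Object FullName -Unique)

"With digit after STB: $($withDigit.Count); matching any pattern: $($multi.Count)"
```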


Finally, for the sake of completeness, you can adapt MC ND's helpful answer based on cmd.exe's dir command for use in PowerShell, which simplifies matters:

(cmd /c dir /s /b /a-d "$somePath/*STB*").Count

PowerShell captures an external program's stdout output as an array of lines, whose element count you can simply query with the .Count (or .Length) property.
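One edge case is worth guarding against: if the command outputs exactly one line, PowerShell returns a single string rather than an array, and .Length then reports the number of characters in that string. Wrapping the call in @(...) keeps the count reliable for zero, one, or many lines. A minimal sketch (Windows only, with a placeholder $somePath):

```powershell
# Placeholder (assumption): replace with the directory you want to search.
$somePath = (Get-Location).Path

# @(...) guarantees an array, so .Count is the number of output lines
# even when dir reports exactly one file or none at all.
$lines = @(cmd /c dir /s /b /a-d "$somePath\*STB*")
$lines.Count
```

When nothing matches, dir writes File Not Found to stderr, which is not captured in $lines, so the count is 0.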

That said, this may or may not be faster than PowerShell's own Get-ChildItem -Filter, depending on the filtering scenario; also note that dir /s can only ever return path strings, whereas Get-ChildItem returns rich objects whose properties you can query.

Caveats re dir use:

  • /a-d excludes directories, i.e., only reports files, but then also includes hidden files, which dir doesn't do by default.

  • dir /s invariably descends into hidden directories too during the recursive enumeration; an /a (attribute-based) filter is only applied to the leaf items of the enumeration (only to files in this case).

  • dir /s invariably follows symlinks / junctions to other directories (assuming it has the requisite permissions - see next point).

  • dir /s quietly ignores directories or symlinks / junctions to directories if it cannot enumerate their contents due to lack of permissions. While this is helpful in the specific case of hidden system junctions (you can find them all with cmd /c dir C:\ /s /ashl), it can cause you to miss the content of directories that you do want to enumerate but can't for true lack of permissions, because dir /s gives no indication that such content may even exist. (If you directly target an inaccessible directory, you get a somewhat misleading File Not Found error message, and the exit code is set to 1.)


Performance comparison:

  • The following tests compare pure enumeration performance without filtering, for simplicity, using a sizable directory tree assumed to be present on all systems, c:\windows\winsxs; that said, it's easy to adapt the tests to also compare filtering performance.

  • The tests are run from PowerShell, which means that some overhead is introduced by creating a child process for cmd.exe in order to invoke dir /s, though (a) that overhead should be relatively low and (b) the larger point is that staying in the realm of PowerShell is well worthwhile, given its vastly superior capabilities compared to cmd.exe.

  • The tests use the Time-Command function, which can be downloaded from this Gist and which averages 10 runs by default.

# Warm up the filesystem cache for the target dir.,
# both from PowerShell and cmd.exe, to be safe.
gci 'c:\windows\winsxs' -rec >$null; cmd /c dir /s 'c:\windows\winsxs' >$null

Time-Command `
  { @([System.IO.Directory]::EnumerateFiles('c:\windows\winsxs', '*', 'AllDirectories')).Count },
  { (Get-ChildItem -Force -Recurse -File 'c:\windows\winsxs').Count },
  { (cmd /c dir /s /b /a-d 'c:\windows\winsxs').Count },
  { cmd /c 'dir /s /b /a-d c:\windows\winsxs | find /c /v """"' }
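If Time-Command is not at hand, a rough stand-in can be built on the built-in Measure-Command cmdlet. The sketch below is an assumption-laden substitute, not the Gist's implementation: Measure-Average is a hypothetical helper name, and it only averages wall-clock seconds over a number of runs.

```powershell
# Hypothetical helper: average the run time of a script block over several runs.
function Measure-Average {
  param(
    [Parameter(Mandatory)] [scriptblock] $ScriptBlock,
    [int] $Count = 10
  )
  $total = 0.0
  foreach ($i in 1..$Count) {
    # Measure-Command discards the script block's output and returns a TimeSpan.
    $total += (Measure-Command $ScriptBlock).TotalSeconds
  }
  $total / $Count   # average seconds per run
}

# Placeholder (assumption): replace with the directory tree you want to time.
$dir = (Get-Location).Path
$avg = Measure-Average -Count 3 {
  (Get-ChildItem -LiteralPath $dir -Recurse -File -Force -ErrorAction SilentlyContinue).Count
}
'{0:N3} seconds on average' -f $avg
```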

On my single-core VMware Fusion VM with Windows PowerShell v5.1.17134.407 on Microsoft Windows 10 Pro (64-bit; Version 1803, OS Build: 17134.523) I get the following timings, from fastest to slowest (scroll to the right to see the Factor column, which shows relative performance):

Command                                                                                    Secs (10-run avg.) TimeSpan         Factor
-------                                                                                    ------------------ --------         ------
@([System.IO.Directory]::EnumerateFiles('c:\windows\winsxs', '*', 'AllDirectories')).Count 11.016             00:00:11.0158660 1.00
(cmd /c dir /s /b /a-d 'c:\windows\winsxs').Count                                          15.128             00:00:15.1277635 1.37
cmd /c 'dir /s /b /a-d c:\windows\winsxs | find /c /v """"'                                16.334             00:00:16.3343607 1.48
(Get-ChildItem -Force -Recurse -File 'c:\windows\winsxs').Count                            24.525             00:00:24.5254979 2.23

Interestingly, both [System.IO.Directory]::EnumerateFiles() and the Get-ChildItem solution are significantly faster in PowerShell Core, which runs on top of .NET Core (as of PowerShell Core 6.2.0-preview.4, .NET Core 2.1):

Command                                                                                    Secs (10-run avg.) TimeSpan         Factor
-------                                                                                    ------------------ --------         ------
@([System.IO.Directory]::EnumerateFiles('c:\windows\winsxs', '*', 'AllDirectories')).Count 5.094              00:00:05.0940364 1.00
(cmd /c dir /s /b /a-d 'c:\windows\winsxs').Count                                          12.961             00:00:12.9613440 2.54
cmd /c 'dir /s /b /a-d c:\windows\winsxs | find /c /v """"'                                14.999             00:00:14.9992965 2.94
(Get-ChildItem -Force -Recurse -File 'c:\windows\winsxs').Count                            16.736             00:00:16.7357536 3.29

[1] [System.IO.Directory]::EnumerateFiles() is inherently and undoubtedly faster than a Get-ChildItem solution. In my tests (see section "Performance comparison:" above), [System.IO.Directory]::EnumerateFiles() beat out cmd /c dir /s as well, slightly in Windows PowerShell, and clearly so in PowerShell Core, but others report different findings. That said, finding the overall fastest solution is not the only consideration, especially if more than just counting files is needed and if the enumeration needs to be robust. This answer discusses the tradeoffs of the various solutions.

[2] In fact, due to an inefficient implementation as of Windows PowerShell v5.1 / PowerShell Core 6.2.0-preview.4, use of -Path and -Include is actually slower than using Get-ChildItem unfiltered and instead using an additional pipeline segment with ... | Where-Object Name -like *STB*, as in the OP - see this GitHub issue.

mklement0
  • 1
    Now THIS is truly the perfect answer and explanation to both this question as well as [this one](https://stackoverflow.com/questions/54219607/where-to-apply-erroraction-on-a-net-call?noredirect=1#comment95265631_54219607). +1 for sure! – Theo Jan 17 '19 at 08:40
  • The method written as fastest is slower than running the option 3, nearly twice as slow. I love Powershell, but CMD is still preferable at times. – Ben Personick Jan 18 '19 at 20:26
  • @BenPersonick: My tests show the opposite - please see my update. Also note that `dir /s` quietly ignores inaccessible directories, while the .NET-based enumeration throws an exception. – mklement0 Jan 18 '19 at 22:57
  • My testing shows the opposite of yours then I almost posted a big list of results and realised I was probably wasting my time, but there is a definite skew to CMD on my system using both Measure-command and using simpler methods. – Ben Personick Jan 18 '19 at 23:05
  • @BenPersonick: Without looking to antagonize: making unverified claims is a waste of _everyone's_ time. I'd be happy to draw general conclusions from your experience, but with the information given so far that's not possible. – mklement0 Jan 18 '19 at 23:09
  • I am not looking to antagonize either, however right now it is literal he said she said so i'm grabbing my results from testing and following through with hosting here after I'd decided it wasn't worth my time previously. – Ben Personick Jan 18 '19 at 23:11
  • You didn't do the needful, and I figured out what your mistake was, you didn't use the correct command in your testing. – Ben Personick Jan 18 '19 at 23:37
  • IE you are using Powershell to Count the lines, instead of CMD, but the FIND command is a fast counter and the /V discarding blank lines makes it even faster. :) – Ben Personick Jan 19 '19 at 00:12
  • @BenPersonick: I've updated the tests to perform counting, including via `find /c /v ""` - the latter is slightly _slower_ than letting PowerShell do the counting and, overall, the .NET solution is still the fastest. – mklement0 Jan 19 '19 at 19:12
  • How many files are you pushing on this single core VM, and what is the health of the VM? I did most of my testing on physical systems, but also on VMs, and never, ever, am I getting such incredibly LONG times as you have listed in your results. I suspect your esxi infrastructure is having trouble scheduling Numa requests or you're running on some really slow or overloaded disks, those would both be reasons why you might see really long File IOPs like this. But even your fastest time is twice as slow as my longest testing time for either SysIO or MCND. – Ben Personick Jan 24 '19 at 20:50
  • Also, in the majority of tests MCND's example is faster, and when SysIO is faster it has been negligibly so. Whats more the SysIO command bombs out if you run it on C:\Windows or other directories, and this seems to be a real failure as the command is reporting less files in such directories depending on where the command fails when you can get results out of it. – Ben Personick Jan 24 '19 at 22:26
  • @BenPersonick: It's a VMWare Fusion VM running on macOS 10.14, but I don't think we need to delve into this any further - readers can just run the benchmarks themselves and decide what's right for them. As for incomplete enumerations with SysIO: that limitation is clearly stated at the top of the answer. Also, on a general note, `Get-ChildItem` and using enumeration via the `[System.IO.DirectoryInfo]` type (whose `.Enumerate*()` methods parallel those of `[System.IO.Directory]`) offer enumeration of rich objects rather than just path strings, which provides much more versatility. – mklement0 Jan 25 '19 at 02:48
  • @mklement0 That limitation renders this basically of little value in getting proper file counts. If you only want to count files you can do so most quickly using the CMD method even from Powershell, without a problem when reaching links that are intended to be ignored. The Other method returns no results, and just errors. Writing code to work around that would surely only make the method slower. – Ben Personick Jan 29 '19 at 19:34
  • @BenPersonick: Whether that limitation is a problem depends on the use case; stating all limitations allows future readers to choose what's right for them. `dir /s` doesn't ignore _links_ (or junctions), specifically, it _quietly ignores any type of directory / symlink or junction to a directory that it cannot enumerate due to lack of permissions_, which is highly problematic. `Get-ChildItem` alerts you to inaccessible directories _and_ allows you to continue enumeration, but it's slower. If you additionally want to avoid automatic symlink following, you must use PowerShell _Core_. – mklement0 Jan 30 '19 at 03:33
3

One of the fastest ways to do it in cmd command line or batch file could be

dir "x:\some\where\*stb*" /s /b /a-d | find /c /v ""

This is just a recursive (/s) dir command that lists all files (no folders: /a-d) in bare format (/b), with all the output piped to the find command, which counts (/c) the lines that do not contain the empty string (/v ""); in practice that is every line of output, one per file.

But, in any case, you will need to enumerate the files and it requires time.

Edited to adapt to comments, but note: the approach below does not work for this case because, at least in Windows 10, the space padding in the summary lines of the dir command is set to five positions. File counts greater than 99999 are not correctly padded, so the sort /r output is not correct.

As pointed by Ben Personick, the dir command also outputs the number of files and we can retrieve this information:

@echo off
    setlocal enableextensions disabledelayedexpansion

    rem Configure where and what to search
    set "search=x:\some\where\*stb*"

    rem Retrieve the number of files
    set "numFiles=0"
    for /f %%a in ('
        dir "%search%" /s /a-d /w 2^>nul        %= get the list of the files        =%
        ^| findstr /r /c:"^  *[1-9]"            %= retrieve only summary lines      =%
        ^| sort /r 2^>nul                       %= reverse sort, greater line first =%
        ^| cmd /e /v /c"set /p .=&&echo(!.!"    %= retrieve only first line         =%
    ') do set "numFiles=%%a"

    echo File(s) found: %numFiles% 

The basic idea is to use a series of piped commands to handle different parts of the data retrieval:

  • Use a dir command to generate the list of files (/w is included just to generate fewer lines).
  • As we only want the summary lines with the number of files found, findstr is used to retrieve only those lines that start with spaces (the header/summary lines) followed by a number greater than 0 (the file count summary lines; as we are using /a-d, the directory count summary lines will have a value of 0).
  • Sort the lines in reverse order so that the greatest line comes first (summary lines start with a left-space-padded number). The greatest line (the final file count or equivalent) will be the first line.
  • Retrieve only this line using a set /p command in a separate cmd instance. As the full sequence is wrapped in a for /f, which has a performance problem when retrieving long lists from command execution, we try to retrieve as little as possible.

The for /f will tokenize the retrieved line, get the first token (number of files) and set the variable used to hold the data (variable has been initialized, it is possible that no file could be found).

MC ND
  • Came here to say this, nice. Also, just thought i'd share, if you have a single directory you can get the result even faster by not counting, and just looking for the Files output: `dir "x:\some\where\*stb*" | FIND /I "File(s)"` – Ben Personick Jan 16 '19 at 22:30
  • 2
    @BenPersonick, You are right. As you point your approach can not be used for recursive searchs, but will also fail in different locale configurations not using `File(s)` literal. I've updated the answer adding a safer version of your approach. – MC ND Jan 17 '19 at 07:59
  • I couldn't make your version of the command work because Sort overflows main memory on long listings. I did create a version of mine that works on subdirectories, but it is 2-3 times slower than the count method, which I had also come up with similarly before, which is what I expected. My hack can really only be taken advantage of on single directories. – Ben Personick Jan 18 '19 at 23:40
  • And for what it's worth, the accepted answer is wrong in its assertions, and I have been arguing with the guy, and just now realised he is testing the wrong command and that's why he's getting the wrong timing results. – Ben Personick Jan 18 '19 at 23:41
  • @BenPersonick, just for curiosity, as the `sort` command is processing only the summary lines, how many folders did you use in your tests? I tested with 150000 and had no problems. – MC ND Jan 19 '19 at 09:57
  • @BenPersonick, testing I have found a problem with the second part of the answer that makes it unusable. – MC ND Jan 19 '19 at 10:49
  • It was `114968 File(s) 8,742,866,124 bytes` From Winsxs, everyone's Winsxs is likely to be different. Anyway, I ran the new version and now it's giving me `File(s) found: 44074` on the same directory.. So something is off, but no more error :) – Ben Personick Jan 20 '19 at 04:15
  • Hey MCND, I took the time to figure out what is happening here, and I spent quite a bit of time working on a workaround to the issue. Essentially, once the number of files reaches 6 characters in length, the results begin to get shifted to the right, and sort will no longer be able to sort them as expected. I went through a bunch of iterations on that and figured out a trick that seems to work in every test scenario I've tried so far. I've changed it around too, since I prefer to work in CMD directly instead of a batch file, but it's working for me. I don't have timings on speed, but it seems similar. – Ben Personick Feb 01 '19 at 22:11
  • Hey there MCND lost what I was writing, but essentially I found a way to make SORT do a true order flip. This seem to be an undocumented feature (of sorts) in that there is nothing to indicate this woudl be the expected behavior, and while initiallY I thought I needed the /R, it's not required as no true sorting is done, the order is simply reversed.``DIR c:\windows\winsxs\* /a-d /s /w 2>NUL | FINDSTR /r /c:"^ *[1-9]" | sort /+99999999999999999 2>nul | cmd /e /v /c"set /p .=&&CALL echo(%.%"`` – Ben Personick Feb 01 '19 at 22:35
  • @MC_ND I always have to say, I really liked your trick for exiting on first value without needing a loop, that is so much more useful as you can just paste it into the cmd prompt and go. In the past I've had to resort to using a cmd script to have a loop that I can exit after the first iteration, but now that isn't necessary. Very cool. Also I determined that the cmd prompt always puts exactly 11 spaces in front of the 1st number position for the files print-out, which is why it's pushing the string larger after it hits 5 characters, so I tweaked that in the findstring – Ben Personick Feb 01 '19 at 22:38
  • @BenPersonick, In my environment (Versión 10.0.17134.523 64b) the `sort /+99999999999999999` does not show any difference in sorting from a simply `sort` (tested for this case and in controlled output scenarios) and the second method included in my answer keeps failing. Maybe it is a bug in your OS version implementation of `sort`, but I can not replicate your findings. – MC ND Feb 02 '19 at 09:43
  • strange works on three Win 2012 R2 systems and a win 7, will try on win2016 and 2k8R2 as well and follow up. – Ben Personick Feb 04 '19 at 04:02
  • Hey MCND, My results remain consistent across several OSs, and when sorting files, regardless of the number of lines, such that when using `DIR | sort /+99999999999` or `sort /+999999999 "C:\Admin\Unsorted_example.txt"` The order is always the exact reverse (IE Line 1 to Line X becomes line X to Line 1). On one system I have run into memory errors running this, but if I manually set it to 4 MB (I arbitrarily chose 4 mb) it's worked just fine. I also find the sort and sort r only return sorted lists while this one scenario returns the reversed list – Ben Personick Feb 04 '19 at 17:20
  • I've run this against multiple discrete systems and the behavior is consistent for every one of them I've tested: `OS: Microsoft Windows Server 2016 Standard` `Version: 10.0.14393 N/A Build 14393` - `OS: Microsoft Windows Server 2012 R2 Standard` `Version: 6.3.9600 N/A Build 9600` - `OS: Microsoft Windows Server 2008 R2 Standard` `Version: 6.1.7601 Service Pack 1 Build 7601` - `OS: Microsoft Windows 7 Professional` `Version: 6.1.7601 Service Pack 1 Build 7601` - `OS: Microsoft(R) Windows(R) Server 2003, Enterprise Edition` `Version: 5.2.3790 Service Pack 2 Build 3790` – Ben Personick Feb 04 '19 at 17:32
  • For clarity that is Win: `2016`, `2012 R2`, `2008 R2`, `7`, and `2003`. If the issue is in Windows 10 you would think it would show up in 2016, but I'll have to check my home Win10 system. – Ben Personick Feb 04 '19 at 18:16
  • @BenPersonick, I have downloaded a official windows 7 professional SP1 x64 ISO from Microsoft, extracted `sort.exe` and tested. I can confirm the behaviour you have exposed can be seen in windows 7 with simply `@(echo 1&echo 2&echo 3)|sort /+99999999999`. Unfortunately (or not) the "bug" is not present at least in windows 10 professional x64 making my second approach unusable. – MC ND Feb 04 '19 at 19:38
  • Hey @MCND I just ran a test in `Win 10 Build 10.0.15063` and the behavior is consistent with the other OSs. I believe the issue is your Windows 10 OS is using a non-english Build of Win 10 or have you Locale setting changed. (Notice how your Version you posted shows `Versión 10.0.17134.523 64b` - There should be no emphasis over the O in a default english locale.) Can you test on an english build or change locale to see if the behavior changes? I'll also find a newer Win 10 Pro/enterprise, probably my laptop at home, to see if it's that particular build you have instead. – Ben Personick Feb 04 '19 at 20:14
  • @BenPersonick, Yes, I have a spanish windows locale. I'm now downloading the latest w10 international english to check. Tested using the `/L C` switch in `sort` to change to binary encoding and there is no difference: correct sort in w10, reverse for w7. For something interesting (probably pointing to the source of the "bug") `@(echo 1&echo 2&echo 3)|sort /+2` outputs `2 1 3` – MC ND Feb 04 '19 at 20:30
  • I think what you've found is just that the CR and LF characters are being included in the sort buffer along with the NULL character. I added 4 and 5, and the 1st line always becomes last with the others in normal order unless the start is set to `/+3` – Ben Personick Feb 04 '19 at 21:24
  • @BenPersonick, I can not change my locale so the test has been done under the same spanish environment. But the `sort.exe` from a international english x64 windows 10 (Win10_1809Oct_EnglishInternational_x64.iso) works as intended correctly sorting the lines. – MC ND Feb 04 '19 at 21:50
  • FWIW< we only have NA copies of Windows 10, so I can't reproduce your issue so far, only finding it working the way I found so far, but we are also using the Long Term Stable Builds of the Enterprise version, and perhaps it's because of the more frequent updates that your ISO has the Sort behavior different. Can you DL an older International build that is closer to one of the LTSB Enterprise builds and see if it is doing the sort the way I find it? If so, then it may be fixed in the next LTSB, if not then it's likely the international English version region isn;t the same as NA English. – Ben Personick Feb 12 '19 at 14:55
2

You can speed things up in PowerShell if you include the filter directly in the command instead of filtering the command's result set afterwards, which is far bigger than a pre-filtered one.

Try this:

(Get-ChildItem -Path "Fill_in_path_here" -Recurse -File -Include "*STB*").Count
TobyU
  • 1
    Thanks Toby, I checked your code and it works as well. I just cant see it speeding up the execution. I stopped the time with both codes. Your option needed 70 seconds longer to get the result. Maybe I did something wrong?! Thanks you anyways :) – InPanic Jan 16 '19 at 13:00
  • 2
    It is only `-Filter` that significantly speeds up the enumeration, due to filtering at the source. Even though you mention `-Filter` in the description, your actual command currently uses `-Include`, which requires enumerating everything before applying the wildcard pattern. – mklement0 Jan 16 '19 at 14:28
  • As @InPanic's experience with `-Include` implies, `Get-ChildItem -Recurse -Include *STB*` is actually _significantly slower_ than `Get-ChildItem -Recurse | Where Name -like *STB*` (on Windows). This is surprising and certainly sounds like it should be fixed - see https://github.com/PowerShell/PowerShell/issues/8662 – mklement0 Jan 16 '19 at 17:05
2

My guess is this is a lot faster:

$path = 'Fill_in_path_here'
@([System.IO.Directory]::EnumerateFiles($path, '*STB*', 'AllDirectories')).Count

If you do not want to recurse subfolders, change 'AllDirectories' to 'TopDirectoryOnly'
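Because a single recursive EnumerateFiles() call aborts at the first inaccessible directory in Windows PowerShell, one possible workaround is to recurse manually and catch the exception per directory. The following is only a sketch: Get-AccessibleFileCount is a hypothetical helper name, it handles only UnauthorizedAccessException, it still follows symlinks / junctions, and the per-directory calls are likely slower than one recursive call. Pass a full path, since .NET does not use PowerShell's current location to resolve relative paths.

```powershell
# Hypothetical helper: count matching files, skipping directories that deny access.
function Get-AccessibleFileCount {
  param(
    [Parameter(Mandatory)] [string] $Path,
    [string] $Pattern = '*'
  )
  $count = 0
  $pending = [System.Collections.Generic.Stack[string]]::new()
  $pending.Push($Path)
  while ($pending.Count -gt 0) {
    $dir = $pending.Pop()
    try {
      # Count the matching files directly in this directory ...
      foreach ($file in [System.IO.Directory]::EnumerateFiles($dir, $Pattern)) { $count++ }
      # ... and queue its subdirectories for later processing.
      foreach ($sub in [System.IO.Directory]::EnumerateDirectories($dir)) { $pending.Push($sub) }
    }
    catch [System.UnauthorizedAccessException] {
      Write-Warning "Skipping inaccessible directory: $dir"
    }
  }
  $count
}

# Usage with a placeholder path (assumption): the current location.
Get-AccessibleFileCount -Path (Get-Location).Path -Pattern '*STB*'
```

The warnings make it visible which directories were skipped, so you can judge how much the resulting count can be trusted.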

Theo
  • This is slower than using CMD in my testing, About twice as slow, and doesn't work on hidden system folders. For example the CMD method enumerates Winsxs in about 4 seconds, while this method takes about 8 seconds. – Ben Personick Jan 16 '19 at 22:48
  • 1
    @BenPersonick: I see different results when I compare `Measure-Command { [System.IO.Directory]::EnumerateFiles('c:\windows\winsxs', '*', 'AllDirectories') }` to `Measure-Command { cmd /c dir /s /b /a-d 'c:\windows\winsxs' }`: The latter is about 1.3 times _slower_ in my tests - what do you see? Also, `[System.IO.Directory]::EnumerateFiles()` does appear to work on hidden and/or system folders: by default, and invariably; e.g., `[System.IO.Directory]::EnumerateFiles('c:\windows\BitLockerDiscoveryVolumeContents', '*', 'AllDirectories')` works just fine - what makes you think otherwise? – mklement0 Jan 17 '19 at 03:11
  • I am running the CMD directing in CMD. I just have a quick script that echos date time before and after, and I did the same in PS. Measure cmd should be just as effective, except I did run the text in CMD directly so that might be the key. As for dead directories try it in `C:\Windows\System32\` and you'll encounter errors even when running in an administrative ps session – Ben Personick Jan 17 '19 at 20:58
  • Yeah, I don't know what you're smoking I used measure command and it verified my by hand results, the MCND method is nearly twice as fast. Even my own original method I planned to post is faster than your method (by only about 1 second) – Ben Personick Jan 18 '19 at 20:26
  • @mklement0 I thought I was posting to Theo; I only informed you on the other comment to let you know you were on the wrong track, as I thought I was speaking with the OP here, having not looked closely. – Ben Personick Jan 18 '19 at 23:06
  • 1
    @BenPersonick What is it about my answer and the comment by mklement0 that makes you so angry? This is Stack Overflow, where people ask questions and people give answers. If you have an answer of your own, great then **POST IT** instead of commenting 13 times about it before you finally do so. Let the OP and others decide for themselves which approach they prefer. Peace man! – Theo Jan 19 '19 at 10:47
  • I think you're confused, I only posted to you once, and then MK twice. Are you the same person XD. Lol. Anyway, you also seem to have interpreted making some statement about the facts as angry, I'm sad that 's your interpretation, but not at all the case. :) Peace man! – Ben Personick Jan 20 '19 at 04:04
  • 1
    @BenPersonick I'm not the one that is confused.. This is already your 14th comment to the question and yes, your tone is not always pleasant: _Came here to say this, nice_, _I have been arguing with the guy_, _Yeah, I don't know what you're smoking_. And no, we are not the same person. By the way, I hate to burst your bubble, but I also did the test on my Windows 7 Pro machine (PowerShell version 5.1) and guess what: `[System.IO.Directory]::EnumerateFiles` came out fastest every time.. Seems different machines give different results then. – Theo Jan 20 '19 at 11:12
  • Hey @Theo, not on your comment, or to you, as I said, and the comments are not all about this specific issue. Regardless I don't feel any bubble bursting, as why should I? My issue has been with MK's jumping in to re-hash the other posts here and proclaim a winner on speed seemingly based on opinion (Although he later amended they are still showing values which don't make sense they are too disparate.). I still stand by this contention, I've run a couple thousand tests (1k with my test method, and 1k with MK's prefered method) Across a bunch of systems over the weekend, and will post them. – Ben Personick Jan 23 '19 at 20:55
  • Once I have the time to sweep through them and collect the results, but I've been busy with actual work. Currently I had only spot-checked a few and only the Desktop OS (Win 7 64 on a laptop), and even there it's only by a small factor. I also found that sometimes MK's prefered method was giving what appears to be wildly inaccurate results, given the method I use allows review of all data collected, and his does not I don;t know what the cause is there, but the .Count method is showing up as 16 seconds on windows 7 OS using his method over 1000 runs, while the others are less than 4 seconds. – Ben Personick Jan 23 '19 at 21:02
1

I would rather use PowerShell, as it is a far stronger tool. You might give this .bat file script a try.

@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION

SET /A "N=0"
FOR /F "delims=" %%f IN ('DIR /S /B /A:-D "C:\path\to\files\*STB*"') DO (SET /A "N=!N!+1")
ECHO There are !N! files.
lit
  • If you think that, then why did you use my answer in your [next question](https://stackoverflow.com/questions/54219607/where-to-apply-erroraction-on-a-net-call?noredirect=1#comment95265631_54219607)? – Theo Jan 16 '19 at 16:03
  • 2
    @Theo - Your answer works well. However, it fails if the current account does not have permission in a directory. I thought that exception handling was different enough that it should be another question. Many times comments about a different question suggest `create a new question`. – lit Jan 16 '19 at 18:29
  • No, it doesn't fail. It throws an exception if you do not have permission on a directory, and because of that you cannot do a file count. You can catch and ignore this, but then what is the count worth? It certainly won't be accurate. – Theo Jan 16 '19 at 19:11
  • @Theo: Opting to ignore permission problems while wanting to process all files that _are_ accessible is a legitimate use case; your answer is by far the fastest (+1), but it cannot handle this use case. – mklement0 Jan 16 '19 at 21:24
-1

General testing favors MCND's command when run against several systems

Results over 1000 Runs:

Summary


P/VM = physical or virtual machine; FPS = files per second.

P/VM - OS          - PS Ver -  Files - Winner - Faster by (%, seconds) - Winner FPS - Loser FPS
---- - ----------- - ------ - ------ - ------ - ---------------------- - ---------- - ----------
 PM  - Win 7       - 5.1.1  -  87894 - SysIO  - 9% (0.29s)             - 27,237     - 24,970
 PM  - Win 2012    - 5.1.1  - 114968 - MCND   - 8% (0.38s)             - 25,142     - 23,226
 VM  - Win 2012    - 5.1.1  -  99312 - MCND   - 34% (1.57s)            - 21,265     - 15,890
 PM  - Win 2016    - 5.1.1  - 102812 - SysIO  - 2% (0.12s)             - 20,142     - 19,658
 VM  - Win 2012 R2 - 4.0    -  98396 - MCND   - 29-34% (1.56-1.71s)    - 19,787     - 14,717
 PM  - Win 2008 R2 - 5.0.1  -  46557 - MCND   - 13-17% (0.33-0.44s)    - 18,926     - 16,088
 VM  - Win 2012 R2 - 4.0    -  90906 - MCND   - 23% (1.25s)            - 16,772     - 13,629
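
As a hedged illustration of how the summary columns relate to the detailed results, the sketch below recomputes the Win 7 row from the Time-Command averages reported further down (3.227 s for SysIO, 3.518 s for MCND1). The variable names are illustrative and not part of the original script, and the assumption that FPS is simply the file count divided by the average seconds per run is inferred from the numbers rather than stated in the answer.

```powershell
# Assumed inputs, copied from the Win 7 detailed results (87894 files).
$files      = 87894
$winnerSecs = 3.227   # SysIO, Time-Command 1000-run average
$loserSecs  = 3.518   # MCND1, Time-Command 1000-run average

# Files per second = file count / average seconds per run.
$winnerFps = [math]::Round($files / $winnerSecs)
$loserFps  = [math]::Round($files / $loserSecs)

# Absolute and relative gap between the two methods.
$deltaSecs = [math]::Round($loserSecs - $winnerSecs, 2)
$pctSlower = [math]::Round(($loserSecs / $winnerSecs - 1) * 100)

"Winner: $winnerFps FPS, loser: $loserFps FPS, gap: $pctSlower% ($deltaSecs s)"
```

With these inputs the winner works out to 27,237 FPS with a 9% (0.29 s) gap, which matches the summary row. The loser figure lands close to, but not exactly on, the posted 24,970, presumably because the posted value was computed from differently rounded inputs.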

Additionally, Theo's command throws an exception on C:\Windows, while MCND's works as expected.

-- I have explained to MK in the comments that the \cookies directory and other such directories are intentionally non-traversable so that you do not double-count the files contained within them.
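
For readers who want the .NET enumeration but need it to survive non-traversable directories, here is a minimal sketch under stated assumptions. Get-AccessibleFileCount is a hypothetical helper, not code from any answer here: it walks the tree one directory at a time so that an access-denied directory is skipped instead of aborting the whole count. It does no junction or symlink handling, so treat it as a starting point rather than a drop-in replacement, and note that it was not part of the timings above.

```powershell
# Hypothetical helper: counts files matching $Pattern under $Root,
# skipping any directory the current account is not allowed to list.
function Get-AccessibleFileCount {
    param(
        [Parameter(Mandatory)] [string] $Root,
        [string] $Pattern = '*'
    )

    $count   = 0
    $pending = New-Object 'System.Collections.Generic.Stack[string]'
    $pending.Push($Root)

    while ($pending.Count -gt 0) {
        $dir = $pending.Pop()
        try {
            # Count matching files directly inside this directory.
            foreach ($file in [System.IO.Directory]::EnumerateFiles($dir, $Pattern)) { $count++ }
            # Queue the subdirectories so they are visited later.
            foreach ($sub in [System.IO.Directory]::EnumerateDirectories($dir)) { $pending.Push($sub) }
        }
        catch {
            # Ignore access-denied errors only; rethrow anything else.
            if ($_.Exception.GetBaseException() -isnot [System.UnauthorizedAccessException]) { throw }
        }
    }

    $count
}

# Usage (path is an example):
# Get-AccessibleFileCount -Root 'C:\Windows' -Pattern '*STB*'
```

The resulting count covers only the files the account can see, which is the trade-off Theo and MK discuss in the comments on the other answer.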

The test MK ran in a VMware Fusion VM on top of macOS is far from conclusive, and its execution times are incredibly slow, which immediately tipped me off that the results were strange.

In addition, I could not execute the command as written by MK and get the number of files in the folder as a result, so my testing includes a snippet showing that all methods used do give the correct result.

Here is the code used for my runs. Note that I also ran 1000 runs using MK's preferred method to compare output.

Strangely, the .Count method for the MCND command seems to give very biased results on my Win 7 system: hugely slower (5x) in the initial runs I posted, very different from every other system, and varying the most on later runs I tried.

I think this is due to load, and I plan to drop those results if I ever bother to post more. Most of the remaining systems are so similar to the results already posted that adding them could seem redundant unless they come from very different systems.

$MaxRuns=1000
$Root_Dir="c:\windows\winsxs"
$Results_SysIO=@()
$Results_MCND1=@()
$Results_MCND2=@()
$Results_MCND3=@()
$Results_Meta=@()

FOR ($j=1; $j -le $MaxRuns; $j++) {

      Write-Progress -Activity "Testing Methods for $MaxRuns Runs" -Status "Progress: $($j/$MaxRuns*100)% -- Run $j of $MaxRuns" -PercentComplete ($j/$MaxRuns*100)

    # Tests  SysIO: @([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count
      $Results_SysIO+=Measure-Command { @([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count }
      sleep -milliseconds 500

    # Tests  MCND1 CMD Script:  DIR "%~1" /s /a-d ^| FIND /I /V "" | find /c /v ""
      $Results_MCND1+=Measure-Command {C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir}
      sleep -milliseconds 500

     # Tests MCND2 CMD Count: {cmd /c 'dir /s /b /a-d $Root_Dir | find /c /v """"'}
      $Results_MCND2+=Measure-Command {cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"}
      sleep -milliseconds 500
     
     # Tests MCND3 PS Count (cmd /c dir /s /b /a-d $Root_Dir).Count
      $Results_MCND3+=Measure-Command {(cmd /c dir /s /b /a-d $Root_Dir).Count}
      sleep -milliseconds 500


}

$CPU=Get-WmiObject Win32_Processor
""
"CPU: $($($CPU.name).count)x $($CPU.name | select -first 1) - Windows: $($(Get-WmiObject Win32_OperatingSystem).Version) - PS Version: $($PSVersionTable.PSVersion)"
ForEach ($Name in "SysIO","MCND1","MCND2","MCND3") {
    $Results_Meta+=[PSCustomObject]@{
      Method=$Name
      Min=$($($(Get-Variable -Name "Results_$Name" -valueOnly).TotalSeconds|Measure-Object -Minimum).Minimum)
      Max=$($($(Get-Variable -Name "Results_$Name" -valueOnly).TotalSeconds|Measure-Object -Maximum).Maximum)
      Avg=$($($(Get-Variable -Name "Results_$Name" -valueOnly).TotalSeconds|Measure-Object -Average).Average)
    }
}

$Results_Meta | sort Avg | select Method,Min,Max,Avg,@{N="Factor";e={("{0:f2}" -f (([math]::Round($_.Avg / $($Results_Meta | sort Avg | select Avg)[0].avg,2,1))))}}|FT

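# Time-Command is not defined in this script; it is the timing function used by MK's preferred method and must already be loaded in the session.
Time-Command `
{cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"},
{C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir},
{@([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count},
{(cmd /c dir /s /b /a-d $Root_Dir).Count} $MaxRuns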

""
"Results of Commands - (How many Files were in that Folder?):"

[PSCustomObject]@{
    SysIO=$(&{ @([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count })
    MCND1=$(&{C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir})
    MCND2=$(&{cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"})
    MCND3=$(&{(cmd /c dir /s /b /a-d $Root_Dir).Count})
}

I have additional runs from other systems that I haven't collected yet. The Win 7 results are inconsistent, though, so I'll probably strip them once I have more systems to add to the list.
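
Because the script keeps every individual run in the $Results_* arrays, one possible way to check whether a result is skewed by load (this is an assumption about how one might analyze the data, not something the runs above did) is to compare the median with the average: a handful of very slow runs pulls the average up but barely moves the median. Get-Median below is a hypothetical helper, not part of the benchmark script.

```powershell
# Hypothetical helper: returns the median of a list of numbers.
function Get-Median {
    param([Parameter(Mandatory)] [double[]] $Values)

    $sorted = @($Values | Sort-Object)
    $n      = $sorted.Count
    $mid    = [int][math]::Floor($n / 2.0)

    if ($n % 2 -eq 1) {
        # Odd count: the middle element is the median.
        $sorted[$mid]
    } else {
        # Even count: average the two middle elements.
        ($sorted[$mid - 1] + $sorted[$mid]) / 2
    }
}

# Usage, assuming the $Results_* arrays from the script above are populated:
# Get-Median $Results_MCND3.TotalSeconds
```

A median that sits well below the average for one method on one machine would suggest a few outlier runs rather than a consistently slower command.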

Detailed Findings


Physical Win 7 Laptop - 87894 Files - Loser: MCND is 9% (.29s) Slower - (Winning Method: 27,237 FPS) -- Results are not consistent on re-runs while other systems are.

CPU: 1x Intel(R) Core(TM) i5-4310U CPU @ 2.00GHz - Windows: 6.1.7601 - PS Version: 5.1.14409.1012



Method       Min       Max          Avg Factor
------       ---       ---          --- ------
SysIO  3.0190345 6.1287085 3.2174689013 1.00  
MCND1  3.3655209 5.9024033 3.5490564665 1.10  
MCND3  3.5865989 7.5816207 3.8515160528 1.20  
MCND2  3.7542295 7.5619913 3.9471552743 1.23  
3.2174689013
0.0000366062
Command                                                                          Secs (1000-run avg.) TimeSpan         Factor
-------                                                                          -------------------- --------         ------
@([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count 3.227                00:00:03.2271969 1.00  
C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir                                        3.518                00:00:03.5178810 1.09  
cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"                       3.911                00:00:03.9106284 1.21  
(cmd /c dir /s /b /a-d $Root_Dir).Count                                          16.338               00:00:16.3377823 5.06  

Results of Commands - (How many Files were in that Folder?):

SysIO MCND1 MCND2 MCND3
----- ----- ----- -----
87894 87894 87894 87894

Physical Win 2012 Desktop - 114968 Files - Loser: SysIO is 8% (.38s) Slower - (Winning Method: 25,142 FPS)

CPU: 1x Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz - Windows: 6.3.9600 - PS Version: 5.1.14409.1012



Method       Min        Max          Avg Factor
------       ---        ---          --- ------
MCND1  4.4957173  8.6672112 4.5726616326 1.00  
MCND3  4.6815509 18.6689706 4.7940769407 1.05  
SysIO  4.8789948  5.1625618 4.9476786004 1.08  
MCND2  5.0404912  7.2557797 5.0854683543 1.11  

Command                                                                          Secs (1000-run avg.) TimeSpan         Factor
-------                                                                          -------------------- --------         ------
C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir                                        4.542                00:00:04.5418653 1.00  
(cmd /c dir /s /b /a-d $Root_Dir).Count                                          4.772                00:00:04.7719769 1.05  
@([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count 4.933                00:00:04.9330404 1.09  
cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"                       5.086                00:00:05.0855891 1.12  

Results of Commands - (How many Files were in that Folder?):

 SysIO  MCND1  MCND2  MCND3
 -----  -----  -----  -----
114968 114968 114968 114968

VM Win 2012 Server - 99312 Files - Loser: SysIO is 34% (1.57s) Slower - (Winning Method: 21,265 FPS)

CPU: 4x Intel(R) Xeon(R) CPU E7- 2850 @ 2.00GHz - Windows: 6.3.9600 - PS Version: 5.1.14409.1005



Method       Min       Max              Avg Factor
------       ---       ---              --- ------
MCND1  4.5563908 5.2656374     4.6812307177 1.00  
MCND3  4.6696518 5.3846231     4.9064852835 1.05  
MCND2  5.0559205 5.5583717 5.15425442679999 1.10  
SysIO   6.036294 6.7952711      6.254027334 1.34  

Command                                                                          Secs (1000-run avg.) TimeSpan         Factor
-------                                                                          -------------------- --------         ------
C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir                                        4.669                00:00:04.6689048 1.00  
(cmd /c dir /s /b /a-d $Root_Dir).Count                                          4.934                00:00:04.9336925 1.06  
cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"                       5.153                00:00:05.1532386 1.10  
@([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count 6.239                00:00:06.2389727 1.34  

Results of Commands - (How many Files were in that Folder?):

SysIO MCND1 MCND2 MCND3
----- ----- ----- -----
99312 99312 99312 99312

Physical Win 2016 Server - 102812 Files - Loser: MCND is 2% (0.12s) Slower - (Winning Method: 20,142 FPS)

CPU: 2x Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz - Windows: 10.0.14393 - PS Version: 5.1.14393.2608



Method       Min       Max              Avg Factor
------       ---       ---              --- ------
SysIO  5.0414178 5.5279055     5.1043614001 1.00  
MCND3  5.0468476 5.4673033 5.23160342460001 1.02  
MCND1  5.1649438 5.6745749 5.26664923669999 1.03  
MCND2  5.3280266 5.7989287     5.3747728434 1.05  

Command                                                                          Secs (1000-run avg.) TimeSpan         Factor
-------                                                                          -------------------- --------         ------
@([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count 5.156                00:00:05.1559628 1.00  
(cmd /c dir /s /b /a-d $Root_Dir).Count                                          5.256                00:00:05.2556244 1.02  
C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir                                        5.272                00:00:05.2722298 1.02  
cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"                       5.375                00:00:05.3747287 1.04  

Results of Commands - (How many Files were in that Folder?):

 SysIO  MCND1  MCND2  MCND3
 -----  -----  -----  -----
102812 102812 102812 102812

VM Win 2012 R2 Server - 98396 Files - Loser: SysIO 29-34% (1.56-1.71s) Slower - (Winning Method: 19,787 FPS)

CPU: 2x Intel(R) Xeon(R) CPU E7- 2850 @ 2.00GHz - Windows: 6.3.9600 - PS Version: 4.0




Method       Min       Max              Avg Factor
------       ---       ---              --- ------
MCND1  4.7007419 5.9567352 4.97285509330001 1.00
MCND2  5.2086999 6.7678172     5.4849721167 1.10
MCND3  5.0116501 8.7416729 5.71391797679999 1.15
SysIO  6.2400687  7.414201     6.6862204345 1.34

Command                                  Secs (1000-run avg.) TimeSpan         Factor
-------                                  -------------------- --------         ------
C:\Admin\TestMCNDFindFiles1.cmd $Root... 5.359                00:00:05.3592304 1.00
cmd /c `"dir /s /b /a-d $Root_Dir `| ... 5.711                00:00:05.7107644 1.07
(cmd /c dir /s /b /a-d $Root_Dir).Count  6.173                00:00:06.1728413 1.15
@([System.IO.Directory]::EnumerateFil... 6.921                00:00:06.9213833 1.29

Results of Commands - (How many Files were in that Folder?):

SysIO MCND1 MCND2 MCND3
----- ----- ----- -----
98396 98396 98396 98396

Physical Win 2008 R2 Server - 46557 Files - Loser: SysIO 13-17% (0.33-0.44s) Slower - (Winning Method: 18,926 FPS)

CPU: 2x Intel(R) Xeon(R) CPU 5160 @ 3.00GHz - Windows: 6.1.7601 - PS Version: 5.0.10586.117



Method       Min       Max          Avg Factor
------       ---       ---          --- ------
MCND3  2.2370018 2.8176253 2.4653543378 1.00  
MCND1  2.4063578 2.8108379 2.5373719772 1.03  
MCND2  2.5953631 2.9085969 2.7312907064 1.11  
SysIO  2.7207865 30.335369 2.8940406601 1.17  

Command                                                                          Secs (1000-run avg.) TimeSpan         Factor
-------                                                                          -------------------- --------         ------
(cmd /c dir /s /b /a-d $Root_Dir).Count                                          2.500                00:00:02.5001477 1.00  
C:\Admin\TestMCNDFindFiles1.cmd $Root_Dir                                        2.528                00:00:02.5275259 1.01  
cmd /c `"dir /s /b /a-d $Root_Dir `| find /c /v `"`"`"`"`"                       2.726                00:00:02.7259539 1.09  
@([System.IO.Directory]::EnumerateFiles($Root_Dir, '*', 'AllDirectories')).Count 2.826                00:00:02.8259697 1.13  

Results of Commands - (How many Files were in that Folder?):

SysIO MCND1 MCND2 MCND3
----- ----- ----- -----
46557 46557 46557 46557

VMware Win 2012 R2 Server - 90906 Files - Loser: SysIO 23% (1.25s) Slower - (Winning Method: 16,772 FPS)

CPU: 4x Intel(R) Xeon(R) CPU E7- 2850 @ 2.00GHz - Windows: 6.3.9600 - PS Version: 4.0



Method       Min        Max          Avg Factor
------       ---        ---          --- ------
MCND1  5.0516057  6.4537866  5.423386317 1.00
MCND3  5.3297157  7.1722929 5.9030135773 1.09
MCND2  5.5460548  7.0356455  5.931334868 1.09
SysIO  6.2059999 19.5145373 6.6747122712 1.23

Command                                  Secs (1000-run avg.) TimeSpan         Factor
-------                                  -------------------- --------         ------
C:\Admin\TestMCNDFindFiles1.cmd $Root... 5.409                00:00:05.4092046 1.00
(cmd /c dir /s /b /a-d $Root_Dir).Count  5.936                00:00:05.9358832 1.10
cmd /c `"dir /s /b /a-d $Root_Dir `| ... 6.069                00:00:06.0689899 1.12
@([System.IO.Directory]::EnumerateFil... 6.557                00:00:06.5571859 1.21

Results of Commands - (How many Files were in that Folder?):

SysIO MCND1 MCND2 MCND3
----- ----- ----- -----
90906 90906 90906 90906

Ben Personick
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/187724/discussion-on-answer-by-ben-personick-efficiently-counting-files-in-directory-an). – Samuel Liew Feb 01 '19 at 09:48