
This question provides a fast way to use kernel32.dll to recursively find file attributes, e.g. file names. The problem is that progress reporting (such as in a Windows Forms app) is limited to showing which file or directory is currently being processed, because the total file count is not known upfront.

However, I know that in Windows 7, if you use the file explorer to search for a file, it shows a progress bar for the search.

So how do they do it? Is the total file count known ahead of time? Is it possible to mimic this kind of progress reporting in the answer from the question linked above? I'm not sure how to do it without a total file count upfront.

The closest question I could find was this one, but its approach seems to have problems with this recursive method: I don't have the folder count upfront, and the behavior would be very odd for a single directory containing many files.

Community
user17753
  • Because they're not limited to the .NET Framework and its constructs. You can do the same thing if you want to delve down into the Windows APIs and get your hands dirty. – CodingGorilla Aug 20 '12 at 17:28
  • Considering how terrible and useless the Windows 7 file search is, I would not try to emulate it. It has told me many times that it cannot find a file when I am looking right at it. You should be able to do a very fast recursive search to find the total file count. – Belmiris Aug 20 '12 at 17:34
  • @Belmiris Well, even the fastest approach using kernel interop took a couple of minutes on 57k files on a network share. So I want to report progress while finding the total file count, but I cannot without knowing it; hence the question. Also, obviously they are not limited to .NET -- what method are they using? A magical infinite bar? Or do they use some kind of indexing, or what? – user17753 Aug 20 '12 at 17:40
  • @user17753: through observation of that particular bar's behaviour you can conclude that it is indeed an infinite bar. As new directories are scanned it increases the bar's maximum value (there also appears to be some kind of time element involved) and when the bar reaches near the end it just stays there until the search is actually complete. I have observed that the bar reflects very little of reality. – Sam Axe Aug 20 '12 at 18:02
  • As Belmiris said, the bar is pretty awful. If you want something more usable, take a look at the file copy/move progress dialog. It does some precalculation of count and size and provides an infinitely better UX. By doing something similar, but with less depth, you can get pretty close (check out my answer). – ssube Aug 20 '12 at 18:40

2 Answers


Depending on how accurate you need to get, there may be a simple two-pass solution (not optimal for network drives, so you may need to tune there).

For the first n levels of directories (say 4, including the drive), count the number of subdirectories. This is typically a quick operation, although you can tweak it, for example to recurse only when more than, say, 5 subdirectories are present. Store this number.

As you perform the real search, keep track of the number of subdirectories you've completed that are within n steps of the root. Use this number and the stored count to estimate completion.

For example, with the basic structure:

C:\
    a\
        1\
            i\
            ii\
            iii\
        2\
        3\
    b\
    c\

Count a, 1, ignore i and siblings, count 2, etc. Then, while searching, increase the bar when you finish searching 3, 2, 1, a, etc.

Now, this is absolutely not fool-proof. There may be race conditions, it's not terribly accurate, and all sorts of other things.

However, for a low-granularity progress bar, it's close enough that it will appear pretty accurate. More importantly from a user-experience perspective, using a stored count and comparing progress against that tends to prevent the bar from growing halfway through the process.
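A minimal sketch of the two-pass idea, written in Python rather than .NET purely for illustration. The names `count_dirs`, `search`, `on_file`, and `on_progress` are hypothetical and not taken from any real code base; the root counts as level 1, and unreadable directories are simply skipped. For the example tree above with n = 3, the stored count would be 7 (C:\, a, 1, 2, 3, b, c).

```python
import os


def count_dirs(root, max_depth):
    """Pass 1: count directories within max_depth levels of root (root is level 1)."""
    total = 1  # the directory itself
    if max_depth <= 1:
        return total  # deeper levels are deliberately ignored
    try:
        with os.scandir(root) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    total += count_dirs(entry.path, max_depth - 1)
    except OSError:
        pass  # unreadable directory: still counted once, contents skipped
    return total


def search(root, max_depth, on_file, on_progress, _depth=1, _state=None):
    """Pass 2: the real search. Progress advances when a shallow directory is finished."""
    if _state is None:
        _state = {"done": 0, "total": count_dirs(root, max_depth)}
    try:
        with os.scandir(root) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    search(entry.path, max_depth, on_file, on_progress,
                           _depth + 1, _state)
                else:
                    on_file(entry.path)
    except OSError:
        pass
    if _depth <= max_depth:
        _state["done"] += 1
        # Clamp in case directories appeared between the two passes.
        on_progress(min(1.0, _state["done"] / _state["total"]))
```

Calling `search(root, 3, handle_file, update_bar)` would report one progress step per directory in the top three levels, where `handle_file` and `update_bar` stand in for whatever your application does with a match and with the bar. Because the denominator is fixed before the search starts, the bar only moves forward.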

I'm actually using this technique in some code here.

The initial build, which goes down 10 levels, is still pretty speedy. I don't remember just how much testing went into it, but the bar is visibly accurate without many pauses when searching through 2.5-3 million files (despite only checking 1/1000th of that ahead of time). Note that the shorter your progress bar, the more accurate it will appear. ;)

Joey Dumont
ssube

Start a thread in the background that counts the files under the directory you are searching. In the background, update the number of files counted so far. Use this count in your progress bar. Once this count is greater than one, it is safe to start the search. When computing progress, fudge the result if your search (miraculously) outpaces your background file counter, or if an occasional visible digression in the progress bar is unacceptable.

This then leaves figuring out the fastest way to count files (consider FindFirstFileEx). On a local drive, it might take 2 or 3 seconds to count a folder like Program Files. Over the network, it will take a lot longer because with FindFirstFileEx, file names are transmitted when you only want the count of files and a list of directories.

This all presumes you will spend a lot more time in the search than in just counting files.
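The background-counter idea can be sketched roughly as follows, again in Python for illustration only. `FileCounter` and `progress` are hypothetical names, `os.walk` stands in for a faster native enumeration such as FindFirstFileEx, and the 0.99 cap while counting is still running is just one possible way to "fudge" the result.

```python
import os
import threading


class FileCounter:
    """Counts files under root on a background thread; count grows as it runs."""

    def __init__(self, root):
        self.root = root
        self.count = 0                      # written only by the counter thread
        self.finished = threading.Event()   # set once the whole tree is counted
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # os.walk skips unreadable directories by default.
        for _dirpath, _dirnames, filenames in os.walk(self.root):
            self.count += len(filenames)
        self.finished.set()


def progress(searched, counter):
    """Fraction complete for `searched` files, fudged to stay within [0, 1]."""
    # If the search has outpaced the counter, use the searched count as the total.
    total = max(counter.count, searched, 1)
    fraction = searched / total
    if not counter.finished.is_set():
        # The total may still grow, so never show a full bar yet.
        fraction = min(fraction, 0.99)
    return fraction
```

The search loop would call `progress(files_searched_so_far, counter)` whenever it updates the bar. While the counter is still running, the fraction can move backwards as the total grows; if that is unacceptable, keep the largest value reported so far and never display less.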

Les