
I know there are duplicates of this question out there, but I have tried about ten methods, and parsing the file name out of each path returned by Directory.GetDirectories() takes between 6 ms and 29 ms per folder, with an average of about 10 ms.

The fastest I have found is System.IO.Path.GetFileName(fullPath), but only by a short margin over fullPath.Substring(fullPath.LastIndexOf('\\') + 1).
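For reference, here is a minimal sketch of how the two approaches can be compared (the path and iteration count are just illustrative; a single call is far below the resolution of Stopwatch.ElapsedMilliseconds, so each approach is timed over many iterations):

```csharp
using System;
using System.Diagnostics;
using System.IO;

class ParseBenchmark
{
    static void Main()
    {
        string fullPath = @"C:\Users\Example\Documents\SomeFolder";

        // Both calls return the final path segment ("SomeFolder").
        string viaPath = Path.GetFileName(fullPath);
        string viaSubstring = fullPath.Substring(fullPath.LastIndexOf('\\') + 1);
        Console.WriteLine("{0} / {1}", viaPath, viaSubstring);

        const int iterations = 1000000;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            Path.GetFileName(fullPath);
        sw.Stop();
        Console.WriteLine("Path.GetFileName:      {0} ms total", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < iterations; i++)
            fullPath.Substring(fullPath.LastIndexOf('\\') + 1);
        sw.Stop();
        Console.WriteLine("Substring/LastIndexOf: {0} ms total", sw.ElapsedMilliseconds);
    }
}
```

Either way, a million calls complete in well under a second on typical hardware, which suggests the per-folder milliseconds are coming from somewhere other than the string parsing.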

I am writing a better version of Windows Explorer, but it takes about half a second to expand C:\ and show all of its subfolders. Half a second may not seem like much, but Windows Explorer and another program I looked at do it apparently instantly.

The only thing I can think of is to index the file system and store the index as XML; I guess that could be done at startup or something.
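The startup-index idea could look roughly like this (a hedged sketch, not production code; the `FolderIndex` class and file name are invented for illustration, and a real version would also need to invalidate the index when the disk changes):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Sketch of a startup index: scan once, serialize the folder list to XML,
// and reload it on later runs instead of hitting the disk every time.
public class FolderIndex
{
    public List<string> Folders = new List<string>();

    public static FolderIndex Build(string root)
    {
        var index = new FolderIndex();
        index.Folders.AddRange(
            Directory.GetDirectories(root, "*", SearchOption.AllDirectories));
        return index;
    }

    public void Save(string file)
    {
        var serializer = new XmlSerializer(typeof(FolderIndex));
        using (var stream = File.Create(file))
            serializer.Serialize(stream, this);
    }

    public static FolderIndex Load(string file)
    {
        var serializer = new XmlSerializer(typeof(FolderIndex));
        using (var stream = File.OpenRead(file))
            return (FolderIndex)serializer.Deserialize(stream);
    }
}
```

The index goes stale as soon as another program creates or deletes a folder, so it would have to be paired with some form of change notification to stay trustworthy.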

I am just curious how Windows and another control can do it so much faster on the same PC. Is C# and managed code slower than unmanaged C++?

Update:

Here is a sample of the log entries. I realize using File.AppendAllText has to open and close the file, but many of the operations take only 1 millisecond, so the only slow step is setting the directory name, which is done like this:

public void LoadChildNodes()
{
    // clear the child nodes
    this.ChildPanel.Controls.Clear();

    // if this path exists
    if (Directory.Exists(this.Path))
    {
        // log the time (ElapsedMilliseconds is the total; Elapsed.Milliseconds
        // is only the component and wraps at 999)
        WriteLogEntry("After Path Exists: " + stopWatch.ElapsedMilliseconds.ToString());

        // get the directories for this node
        string[] tempDirectories = Directory.GetDirectories(this.Path);

        // log the time
        WriteLogEntry("After GetDirectories: " + stopWatch.ElapsedMilliseconds.ToString());

        // if there are one or more directories
        if ((tempDirectories != null) && (tempDirectories.Length > 0))
        {
            // copy to a list so the order can be reversed
            List<string> directories = new List<string>(tempDirectories);

            // reverse the list
            directories.Reverse();

            // log the time
            WriteLogEntry("After Reverse Directories: " + stopWatch.ElapsedMilliseconds.ToString());

            // iterate the directory names
            foreach (string directory in directories)
            {
                // log the time
                WriteLogEntry("After Start Iterate New Directory: " + stopWatch.ElapsedMilliseconds.ToString());

                // create the childNode
                ExplorerTreeNode childNode = new ExplorerTreeNode();

                // the name for folders is the last path segment
                string directoryName = System.IO.Path.GetFileName(directory);

                // log the time
                WriteLogEntry("After set directory name: " + stopWatch.ElapsedMilliseconds.ToString());

                // setup the node, passing the child's full path
                childNode.SetupNode(directoryName, NodeTypeEnum.Folder, this.IconManager, this.Font, directory);

                // log the time
                WriteLogEntry("After Setup Node: " + stopWatch.ElapsedMilliseconds.ToString());

                // add this node
                this.ChildPanel.Controls.Add(childNode);

                // log the time
                WriteLogEntry("After Add childNode to Controls: " + stopWatch.ElapsedMilliseconds.ToString());

                // dock to top
                childNode.Dock = DockStyle.Top;

                // log the time
                WriteLogEntry("After Dock: " + stopWatch.ElapsedMilliseconds.ToString());
            }

            // finished loading child nodes
            stopWatch.Stop();
            WriteLogEntry("Finished loading child nodes: " + stopWatch.ElapsedMilliseconds.ToString());
        }
    }
}

I was trying to avoid buying a control so I could make the project open source, but I guess I will just buy it and only give the executable away.

After Path Exists: 1
After GetDirectories: 2
After Reverse Directories: 3
After Start Iterate New Directory: 3
After set directory name: 20
After Setup Node21
After Add childNode to Controls: 21
After Dock: 22
After Start Iterate New Directory: 22
After set directory name: 29
After Setup Node29
After Add childNode to Controls: 30
After Dock: 30
After Start Iterate New Directory: 30
After set directory name: 37
After Setup Node38
After Add childNode to Controls: 38
After Dock: 39
After Start Iterate New Directory: 39

Corby Nichols
  • It is likely they use low-level code to retrieve this stuff. There may be some COM libraries you can tap into - but ultimately I reckon Microsoft will have got Explorer pretty well nailed down... – MoonKnight Nov 13 '12 at 16:01
  • "Is C# and managed code slower than unmanaged C++?" Yes, very much so. Other people have put it better, and the best answer I found (after a really quick Google for specific information) was http://stackoverflow.com/questions/4257659/c-sharp-versus-c-performance – Jamie Taylor Nov 13 '12 at 16:02
  • 1
    You might want background-workers to build a cache of what's in store one level down. – Johan Larsson Nov 13 '12 at 16:03
  • 1
    Wait, by "short margin" are you saying that `string.SubString` takes `10 ms`? Or the whole operation? Can you post your entire method? Also, why not use [EQUATEC profiler](http://eqatec.com/Profiler/) or something similat (that one's free, at least for now) and make sure you are not tweaking the wrong thing? I seriously doubt that this method is that slow: I would bet you are doing some additional work for each parsed file. – vgru Nov 13 '12 at 16:08
  • Use [`Directory.EnumerateDirectories`](http://msdn.microsoft.com/en-us/library/dd383304.aspx) (with `GetDirectories` you must wait for the whole array of names to be returned before you can access the array) and do this operation in a background [`Task`](http://msdn.microsoft.com/en-us/library/system.threading.tasks.task.aspx). – Paolo Moretti Nov 13 '12 at 16:15
  • @PaoloMoretti: I believe OP is more concerned with the total running time, which would be roughly the same in both cases. – vgru Nov 13 '12 at 16:34
  • @Groo Yes, you are right, but sometimes speed is related to responsiveness. For example, if you open a folder with a lot of files in Windows Explorer, it's going to take time, but you can see the first elements immediately. – Paolo Moretti Nov 13 '12 at 17:02
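The `Directory.EnumerateDirectories`-plus-background-`Task` approach suggested in the comments could be sketched roughly like this (a minimal console illustration; in the actual WinForms app each result would be marshalled back to the UI thread, e.g. via `Control.BeginInvoke`, to add a node as it arrives):

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

class StreamingLister
{
    static void Main()
    {
        string path = @"C:\";

        // EnumerateDirectories yields each subdirectory as it is found,
        // so the first results are available before the scan completes.
        var task = Task.Run(() =>
        {
            foreach (string dir in Directory.EnumerateDirectories(path))
            {
                // In a WinForms app this line would instead post the name
                // back to the UI thread and add a child node there.
                Console.WriteLine(Path.GetFileName(dir));
            }
        });

        task.Wait();
    }
}
```

The total scan time is about the same as with `GetDirectories`, but the UI can start showing folders immediately, which is the "appears instant" effect described above.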

2 Answers


First off, Windows Explorer implements direct access to the FAT table. This provides a fair amount of speed.

Second, it uses a combination of a cache and hooks into change notifications. This allows it to know when other applications/windows create files and directories.
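In managed code, the closest equivalent to that cache-plus-notifications combination is `FileSystemWatcher`. A minimal sketch (the `DirectoryCache` class is invented here for illustration, not part of any library): listings are cached, and the watcher evicts a cached entry when its folder changes, so repeat expansions are served from memory.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Cache directory listings and invalidate them on change notifications,
// so only the first expansion of a folder has to hit the disk.
public class DirectoryCache
{
    private readonly ConcurrentDictionary<string, string[]> cache =
        new ConcurrentDictionary<string, string[]>();
    private readonly FileSystemWatcher watcher;

    public DirectoryCache(string root)
    {
        watcher = new FileSystemWatcher(root)
        {
            IncludeSubdirectories = true,
            NotifyFilter = NotifyFilters.DirectoryName
        };
        watcher.Created += (s, e) => Invalidate(e.FullPath);
        watcher.Deleted += (s, e) => Invalidate(e.FullPath);
        watcher.Renamed += (s, e) => Invalidate(e.FullPath);
        watcher.EnableRaisingEvents = true;
    }

    public string[] GetSubdirectories(string path)
    {
        // Hit the disk only on a cache miss.
        return cache.GetOrAdd(path, p => Directory.GetDirectories(p));
    }

    private void Invalidate(string changedPath)
    {
        // Evict the parent's cached listing; it will be rebuilt on next access.
        string parent = Path.GetDirectoryName(changedPath);
        if (parent != null)
        {
            string[] removed;
            cache.TryRemove(parent, out removed);
        }
    }
}
```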

Because it starts on boot, it's able to get all of this information up front which makes it appear that everything runs near instantly.

You'll notice that it slows down when accessing network drives. This is because it only retrieves one level at a time, while caching the results and refreshing on additional access.

Finally, I think you have something else going on. 10 ms to parse a single file name is extreme.

NotMe
  • This seems like a good answer but doesn't answer the last question: "Is C# and managed code slower than unmanaged C++?" which is YES – emartel Nov 13 '12 at 16:15
  • 2
    @emartel: actually it's more complicated than a simple yes/no. See http://stackoverflow.com/questions/3016451/performance-of-managed-c-vs-unmanaged-native-c – NotMe Nov 13 '12 at 16:17
  • fair enough! My limited experience to managed code seemed to point towards it being slower but if it's not the case, this link should be part of the answer :) – emartel Nov 13 '12 at 16:19

Your access times are strange; I doubt that your real bottleneck is there. For example, I just ran a test app and got this (note: I have an SSD, so my test pretty much removes the influence of disk access speed):

Finding all files inside E:\ (recursive)...
Directory.GetFiles found 91731 files in 10,600 ms: 115.6 microseconds/file
Path.GetFileName parsed 91731 files in 134 ms: 1.5 microseconds/file

That's microseconds. And most of the time is spent fetching the files into the array, parsing the file name is trivial after that.

The bottom line is: I would recommend that you download a profiler (like EQUATEC), and check where your time is being spent.

Here's my code if you'd like to try it yourself:

using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var stopwatch = new Stopwatch();
        var path = @"E:\";

        Console.WriteLine("Finding all files inside {0} (recursive)", path);

        stopwatch.Restart();
        var allFiles = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories);
        stopwatch.Stop();
        Output("Directory.GetFiles found", allFiles.Length, stopwatch.ElapsedMilliseconds);

        stopwatch.Restart();
        var filenames = allFiles.Select(Path.GetFileName).ToArray();
        stopwatch.Stop();
        Output("Path.GetFileName parsed", filenames.Length, stopwatch.ElapsedMilliseconds);

        Console.Read();
    }

    private static void Output(string action, int len, long timeMs)
    {
        Console.WriteLine("{0} {1} files in {2:#,##0} ms: {3:0.0} microseconds/file", action, len, timeMs, timeMs * 1000.0 / len);
    }
}
vgru