I'm writing a little app to pull down a few valid samples of each particular type, from a much larger pile of samples.

The structure looks like:

ROOT->STATE->TYPE->SAMPLE

My program cruises through the states, and grabs each unique type, and the path to that type. Once all those are obtained, it goes through each type, and selects X random samples, with X supplied by the user.

The program works great locally, but over the network it's obviously much slower. I've taken measures to help with this, but the last part I'm hung up on is getting the random samples from the TYPE directory quickly.

Locally, I use

    List<String> directories = Directory.GetDirectories(kvp.Value).ToList();

That call is the bottleneck when running over the network. I have a feeling this may not be possible, but is there a way to grab, say, 5 random samples from the TYPE directory without first listing all the samples?

Hopefully I have been clear enough, thank you.

Jonesopolis

2 Answers

Perhaps try using DirectoryInfo; when making lots of calls against a specific directory it's faster, as security is not checked on every access.

3dd

You may find speed increases from using a DirectoryInfo object for the root and for the sub-folders you want, and listing directories that way. The gains will be minor: .NET's lazy initialisation strategy means the static Directory methods you use in your sample take more network round trips.
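As a rough sketch of what that could look like: the helper below lists a TYPE folder through a DirectoryInfo and picks up to X entries at random. The names `SamplePicker`, `PickRandomSamples` and `typePath` are made up for illustration, and it still lists the whole folder once, so measure it against your share before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class SamplePicker
{
    // Hypothetical helper: returns up to `count` random sample paths
    // from the TYPE directory at `typePath`.
    public static List<string> PickRandomSamples(string typePath, int count, Random rng)
    {
        // One DirectoryInfo for the TYPE folder, used for the listing.
        var typeDir = new DirectoryInfo(typePath);
        DirectoryInfo[] samples = typeDir.GetDirectories();

        // Partial Fisher-Yates shuffle: only the first `take` slots
        // need to be randomised, not the whole array.
        int take = Math.Min(count, samples.Length);
        var picked = new List<string>(take);
        for (int i = 0; i < take; i++)
        {
            int j = rng.Next(i, samples.Length); // i <= j < Length
            DirectoryInfo tmp = samples[i];
            samples[i] = samples[j];
            samples[j] = tmp;
            picked.Add(samples[i].FullName);
        }
        return picked;
    }
}
```

In your loop that would be something like `SamplePicker.PickRandomSamples(kvp.Value, 5, rng)`, with a single `Random` instance shared across all types.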

The next question, I suppose, is why speed is important. Have you considered maintaining an up-to-date index in a cache of your own design for speedy access, using a FileSystemWatcher, a regular poll, or both?
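A minimal sketch of such a cache for a single TYPE folder, assuming you can keep a process running next to the data. `SampleIndex` and its members are hypothetical names, and since watcher events can be missed, a periodic rescan would still be a sensible backstop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical in-memory index of the sample directories under one TYPE folder.
sealed class SampleIndex : IDisposable
{
    private readonly ConcurrentDictionary<string, byte> _paths =
        new ConcurrentDictionary<string, byte>(StringComparer.OrdinalIgnoreCase);
    private readonly FileSystemWatcher _watcher;

    public SampleIndex(string typePath)
    {
        // Subscribe before the initial scan so that directories created
        // while the scan is running are still picked up.
        _watcher = new FileSystemWatcher(typePath);
        _watcher.NotifyFilter = NotifyFilters.DirectoryName;
        _watcher.IncludeSubdirectories = false;
        _watcher.Created += (s, e) => _paths.TryAdd(e.FullPath, 0);
        _watcher.Deleted += (s, e) => { byte ignored; _paths.TryRemove(e.FullPath, out ignored); };
        _watcher.Renamed += (s, e) =>
        {
            byte ignored;
            _paths.TryRemove(e.OldFullPath, out ignored);
            _paths.TryAdd(e.FullPath, 0);
        };
        _watcher.EnableRaisingEvents = true;

        // Initial scan: the one full listing, paid once up front.
        foreach (string dir in Directory.EnumerateDirectories(typePath))
        {
            _paths.TryAdd(dir, 0);
        }
    }

    // Copy of the currently known sample paths; safe to sample from.
    public List<string> Snapshot()
    {
        return _paths.Keys.ToList();
    }

    public void Dispose()
    {
        _watcher.Dispose();
    }
}
```

Picking random samples then becomes an in-memory operation on `Snapshot()`, with no network listing per request.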

You may also be interested in this link: Checking if folder has files

It contains some information about limiting your network calls to the bare minimum by retrieving information about the entire directory structure in one call. This will no doubt increase your memory requirements, however.

Is the name of each kind of file predictable? Would you have better luck randomly predicting some sample names and reading them directly?
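If they are, something along these lines might work. It is purely a sketch: the `sample_0001` style naming, the known `maxIndex`, and the names `SampleGuesser` and `GuessSamples` are all assumptions of mine, since I don't know your actual layout. Each guess costs one existence check instead of a full listing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class SampleGuesser
{
    // Assumption: samples are named "sample_" plus a zero-padded number
    // between 1 and maxIndex, e.g. "sample_0042".
    public static List<string> GuessSamples(
        string typePath, int count, int maxIndex, int maxAttempts, Random rng)
    {
        var found = new HashSet<string>();
        // Stop after maxAttempts so a sparse folder cannot loop forever.
        for (int attempt = 0; attempt < maxAttempts && found.Count < count; attempt++)
        {
            int n = rng.Next(1, maxIndex + 1); // 1 <= n <= maxIndex
            string candidate = Path.Combine(typePath, "sample_" + n.ToString("D4"));
            // One round trip per guess; duplicates are ignored by the set.
            if (Directory.Exists(candidate))
            {
                found.Add(candidate);
            }
        }
        return found.ToList();
    }
}
```

This only pays off when most guesses hit; if the numbering has large gaps you may end up making more round trips than a single listing would.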

Henri Cook