I'm looking for a way to randomly select a file from a tree of directories such that every file has exactly the same probability of being chosen, regardless of where it sits in the tree. For example, in the following tree, each file should have a 25% chance of being chosen:
- /some/parent/dir/
    - Foo.jpg
    - sub_dir/
        - Bar.jpg
        - Baz.jpg
        - another_sub/
            - qux.png
My interim solution, which I'm using while I code the rest of the app, is a function like this:
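    import os
    import random

    def random_file(dir):
        # Pick one entry of this directory at random, descending into subdirectories.
        file = os.path.join(dir, random.choice(os.listdir(dir)))
        if os.path.isdir(file):
            return random_file(file)
        else:
            return file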
However, this obviously biases the results: a file's chance of being picked depends on how deep it sits in the tree and how many siblings share its directory. The files end up with the following probabilities of being selected:
- /some/parent/dir/
    - Foo.jpg - 50%
    - sub_dir/ (50%)
        - Bar.jpg - 16.6%
        - Baz.jpg - 16.6%
        - another_sub/ (16.6%)
            - qux.png - 16.6%
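To make the arithmetic behind those numbers explicit, here is a small sketch that models the example tree as nested dicts and splits each directory's probability evenly among its entries, the way the recursive function does. The names `EXAMPLE_TREE` and `selection_probabilities` are made up for this illustration, and the dict is only a stand-in for the real directory listing.

```python
from fractions import Fraction

# Stand-in for the example tree: dicts are directories, None marks a file.
EXAMPLE_TREE = {
    "Foo.jpg": None,
    "sub_dir": {
        "Bar.jpg": None,
        "Baz.jpg": None,
        "another_sub": {"qux.png": None},
    },
}

def selection_probabilities(tree, prefix="", weight=Fraction(1)):
    """Return {path: probability} for the pick-one-entry-per-level strategy."""
    probabilities = {}
    if not tree:
        return probabilities  # an empty directory contributes no files
    share = weight / len(tree)  # every entry in a directory is equally likely
    for name, child in tree.items():
        path = prefix + name
        if child is None:
            probabilities[path] = share  # a file keeps its whole share
        else:
            # a subdirectory passes its share down to its own entries
            probabilities.update(selection_probabilities(child, path + "/", share))
    return probabilities

for path, p in selection_probabilities(EXAMPLE_TREE).items():
    print(f"{path}: {p}")
```

Foo.jpg gets 1/2 because it competes only with sub_dir/, while the other three files each get 1/2 × 1/3 = 1/6.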
The context is a background rotation app I'm writing, so the ability to filter out unwanted file extensions would be a bonus. I could simply force that by choosing again whenever the result isn't a file type I want, but that gets messy if there's an abundance of files of the "wrong" type.
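For reference, the behaviour I'm after can be sketched by enumerating every file up front and choosing from the flat list. This assumes the tree is small enough to walk on every call, which may not hold for a large collection. The function name `uniform_random_file` and its `extensions` parameter are hypothetical, invented for this sketch.

```python
import os
import random

def uniform_random_file(root, extensions=None):
    """Pick one file under root, with every matching file equally likely.

    extensions is an optional collection such as {".jpg", ".png"};
    None accepts every file. Returns None if nothing matches.
    """
    if extensions is not None:
        extensions = {ext.lower() for ext in extensions}
    candidates = []
    # os.walk visits every directory below root, so depth no longer matters.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            suffix = os.path.splitext(name)[1].lower()
            if extensions is None or suffix in extensions:
                candidates.append(os.path.join(dirpath, name))
    # random.choice over the flat list gives each candidate the same chance.
    return random.choice(candidates) if candidates else None
```

Filtering before the choice avoids the retry loop entirely, at the cost of walking the whole tree each time.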