4

With os.listdir(some_dir), we can get all the files from some_dir, but sometimes, there would be 20M files(no sub-dirs) under some_dir, these would be a long time to return 20M strings from os.listdir().

(We don't think it's a wise option to put 20M files under a single directory, but it's really there and out of my control...)

Is it any other generator-like method to do the list operation like this: once find a file, yield it, we fetch it and then the next file.

I have tried os.walk(), it's really a generator-style tool, but it also call os.listdir() to do the list operation, and it can not handle unicode file names well (UTF-8 names along with GBK names).

coanor
  • 3,746
  • 4
  • 50
  • 67

1 Answers1

3

If you have python 3.5+ you can use os.scandir() see documentation for scandir

Saleem
  • 8,728
  • 2
  • 20
  • 34
  • 2
    And if you're on pre-3.5 Python, you can get the [`scandir` package](https://pypi.python.org/pypi/scandir) for earlier versions of Python (it's the code on which 3.5's `os.scandir` was based). – ShadowRanger Mar 12 '16 at 02:37
  • @ShadowRanger, very helpful comment. thumbs up! – Saleem Mar 12 '16 at 02:38