1

I have a directory with more than 100k files in it. I need to loop through them and perform operations on each one. I don't want to load the whole list of files into memory; instead, I want to traverse them synchronously, one at a time. What is the best way to achieve that in Python?

Edit:

This question isn't the same as my question, as I don't want to load all the filenames into memory at once.

snehanshu.js
  • 137
  • 1
  • 12

2 Answers

1

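`pathlib.Path.iterdir()` returns a generator that yields the directory's entries one at a time instead of returning a list, so the returned object itself is far smaller than the list from `os.listdir()`: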

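import sys
import pathlib
import os

path = '/cache/srtm'
pl = pathlib.Path(path).iterdir()  # generator, nothing consumed yet
oslb = os.listdir(path)            # full list of names, built immediately
print(type(pl))
print(type(oslb))

# sys.getsizeof reports the size of the object itself, not of its items
print('pathlib.iter: %s' % sys.getsizeof(pl))
print('os.listdir: %s' % sys.getsizeof(oslb))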

Prints:

<class 'generator'>
<class 'list'>
pathlib.iter: 88
os.listdir: 124920
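
One caveat on the numbers above: `sys.getsizeof` only reports the size of the generator or list object itself, so it is not a measure of peak memory during iteration. As far as I know, depending on the Python version, `iterdir()` may still collect the names internally before yielding them. If that matters, `os.scandir()` is the standard-library call documented to return an iterator of `os.DirEntry` objects. Below is a minimal sketch; the helper name `iter_file_paths` and the temporary demo directory are my own assumptions, not part of the question.

```python
import os
import tempfile
from typing import Iterator


def iter_file_paths(directory: str) -> Iterator[str]:
    """Yield the path of each regular file in `directory`, one at a time.

    `iter_file_paths` is a hypothetical helper name used for illustration.
    """
    # os.scandir returns an iterator of os.DirEntry objects; using it as a
    # context manager closes the directory handle when iteration finishes.
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():  # skip subdirectories and other non-files
                yield entry.path


if __name__ == "__main__":
    # Demo on a throwaway directory so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("a.txt", "b.txt"):
            open(os.path.join(demo_dir, name), "w").close()
        for file_path in iter_file_paths(demo_dir):
            print(file_path)  # do operations with file_path here
```

For the directory in the question you would call `iter_file_paths('/cache/srtm')` instead of using the demo directory. The order in which entries are yielded is arbitrary.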
Maurice Meyer
  • 17,279
  • 4
  • 30
  • 47
0

This is how you would loop through the files in a directory, assuming that you have the directory path as a str object in a variable called myDirectory:

import os

directory = os.fsencode(myDirectory)

for file in os.listdir(directory):
    filename = os.fsdecode(file)
    # do operations with filename

Alternatively, you could use pathlib:

from pathlib import Path

pathlist = Path(myDirectory)
# A Path object is not iterable by itself; iterdir() yields its entries
for path in pathlist.iterdir():
    filename = str(path)
    # do operations with filename
jaco
  • 43
  • 11
  • Thanks, but are you sure it won't load all 100K filenames into memory at once? – snehanshu.js Oct 06 '19 at 11:13
  • The pathlib module is commonly used and I'm sure it is optimised not to waste memory. However, to be on the safe side, I suggest checking your memory usage while running this code to see how much it increases. I'm unaware of any other way of doing this that does not load everything into memory – jaco Oct 06 '19 at 11:50
  • Tried to propose an edit here, but the queue is full: the pathlib example does not work as is. Instead, you need to do `for path in pathlist.iterdir():` – PirateNinjas Apr 08 '22 at 12:19
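
One way to act on the suggestion above to check memory usage is the standard library's `tracemalloc` module, which reports the peak amount of memory allocated by Python code while it is tracing. The sketch below is only an illustration: the helper names `peak_bytes`, `walk_listdir` and `walk_scandir` are hypothetical, and the actual numbers will depend on your Python version and directory.

```python
import os
import tracemalloc
from typing import Callable


def peak_bytes(walk: Callable[[str], int], directory: str) -> int:
    """Return the peak bytes allocated by Python while `walk(directory)` runs."""
    tracemalloc.start()
    try:
        walk(directory)
        # get_traced_memory() returns a (current, peak) tuple in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def walk_listdir(directory: str) -> int:
    """Count entries via os.listdir, which builds the full list first."""
    count = 0
    for _name in os.listdir(directory):
        count += 1
    return count


def walk_scandir(directory: str) -> int:
    """Count entries via os.scandir, which yields entries as an iterator."""
    count = 0
    with os.scandir(directory) as entries:
        for _entry in entries:
            count += 1
    return count
```

Calling `peak_bytes(walk_listdir, myDirectory)` and `peak_bytes(walk_scandir, myDirectory)` on the real directory should show how much each approach allocates. Note that `tracemalloc` only sees allocations made through Python's memory allocator.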