
My dataset has more than 1000 folders, and I am using os.walk to recursively access every image inside each folder. os.walk worked fine for a few folders, but with 1000+ folders it is very slow. I need an alternative, or anything that can handle this issue.
The code looks something like this:

import os

import numpy as np
from PIL import Image


def run(dirname, img):
    data = img.load()
    width, height = img.size
    output_img = Image.new("RGB", (100, 100))
    zero = np.zeros(shape=(100, 100), dtype=np.uint8)

    # `labels` and `uf` come from an earlier connected-component
    # labeling pass (union-find), omitted here for brevity.
    for (x, y) in labels:
        component = uf.find(labels[(x, y)])
        labels[(x, y)] = component
        if labels[(x, y)] == 0:
            zero[y][x] = 255

    # Save once, after the loop, instead of re-saving on every pixel.
    Image.fromarray(zero).save(os.path.join(dirname, 'Zero.png'), 'PNG')
    return labels, output_img


def main():
    path = "D:/Python36/Fold/"
    for root, dirs, files in os.walk(path):
        for file_ in files:
            img = Image.open(os.path.join(root, file_))
            img = img.point(lambda p: p > 190 and 255)
            img = img.convert('1')
            (labels, output_img) = run(root, img)


if __name__ == "__main__":
    main()
Mun Says

2 Answers


Your question is not clear, but Python has os.scandir, which doesn't call stat() on each file and is much faster. Related doc.

For old Python versions (<3.5) there is a PyPI package: https://pypi.python.org/pypi/scandir.
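A minimal sketch of a scandir-based recursive walk (the `scantree` helper name is my own, not from the standard library). Note that on Python 3.5+ os.walk already uses os.scandir internally, so the bigger win is usually avoiding extra per-file stat() calls in your own loop:

```python
import os

def scantree(path):
    """Recursively yield DirEntry objects for every file under `path`.

    DirEntry caches type information from the directory listing, so
    entry.is_dir() usually needs no extra stat() call per entry.
    """
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            yield from scantree(entry.path)
        else:
            yield entry

# Usage: iterate files lazily instead of building nested loops
# for entry in scantree("D:/Python36/Fold/"):
#     print(entry.path)
```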

Arman Ordookhani
  • @MunSays `os.scandir` can do roughly the same thing as `os.walk`; you'll have to read the docs (which Arman has provided) to see how it works. – Aaron Dec 21 '17 at 20:52

Debug your code by processing the files sequentially: you can use sorted(os.walk(path)) and check at which file your code slows down.
Check how os.walk works; it might help.
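A sketch of how that sequential timing could look (the `walk_with_timing` helper is my own name; plug your per-file image code in where indicated):

```python
import os
import time

def walk_with_timing(path):
    """Walk `path` in sorted order and record how long each
    directory's files take, to spot where things slow down.

    Returns a list of (directory, file_count, seconds) tuples.
    Note: sorted() consumes the whole os.walk generator first,
    which is fine for debugging but adds some up-front cost.
    """
    timings = []
    for root, dirs, files in sorted(os.walk(path)):
        start = time.perf_counter()
        for file_ in sorted(files):
            full = os.path.join(root, file_)
            # ... open/process the image here ...
        timings.append((root, len(files), time.perf_counter() - start))
    return timings

# Usage: print the slowest directories first
# for root, n, secs in sorted(walk_with_timing("D:/Python36/Fold/"),
#                             key=lambda t: -t[2]):
#     print(f"{root}: {n} files in {secs:.3f}s")
```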

user3768070