
I am trying to make a list containing all the images in my dataset. There will be 50000 elements in this list.

import os

images = []
for cls in classes:
    samples_per_class = len(os.listdir(PATH_FOR_EACH_CLASS))
    for i in range(samples_per_class):
        image_path = os.path.join(
            directory,
            cls,
            str(i + 1).zfill(4) + ".png"
        )
        images.append(image_path)

But I found that setting up this big list is very slow. Is there a more efficient way to initialize a big list like this?

livemyaerodream
  • What is 'very slow'? What would you expect? There is not much happening in your code that could take time, apart from disk access. – Thierry Lathuille May 12 '20 at 17:15
  • See [How can you profile a Python script?](https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script) This will tell you where your script is spending most of its time — which you ***may*** be able to optimize depending on where that is. – martineau May 12 '20 at 17:19
  • `os.listdir` is a call that can be quite slow on some machines, depending on your OS and hardware. How many classes are there on average, and how many samples per class? Is `PATH_FOR_EACH_CLASS` a constant or a hidden complex expression you did not want to put here? – Jérôme Richard May 12 '20 at 18:49

2 Answers


It is very likely that the inefficiency comes from disk access rather than from building the list itself. If you access the dataset from a cloud drive, every directory listing and file lookup goes over the network, which slows the program down considerably. Save your dataset on a local disk instead of a cloud drive. Hope this helps!
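As a minimal sketch (reusing the `directory` and `classes` variables from the question), you can also issue only one `os.listdir` call per class and reuse the names it returns, so slow storage is touched as little as possible:

import os

images = []
for cls in classes:
    class_dir = os.path.join(directory, cls)
    # One directory listing per class; reuse the returned names
    # instead of reconstructing them with zfill.
    for name in sorted(os.listdir(class_dir)):
        if name.endswith(".png"):
            images.append(os.path.join(class_dir, name))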

haching

You can time it yourself; see below. On my machine, the code below appended a million paths in 1.16 seconds, so building a 50,000-element list of paths is unlikely to be your bottleneck.

from timeit import default_timer as timer
import os

start = timer()
images = []
for i in range(1000000):
    # Build and append a path a million times.
    image_path = os.path.join("/home", "colinpaice")
    images.append(image_path)
end = timer()
print(end - start)
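If plain appends are that fast, the list itself is not the problem. As a rough next step, here is a sketch (assuming the `directory` and `classes` variables from the question) that times only the `os.listdir` calls, which the comments above suggest are the likely bottleneck:

import os
from timeit import default_timer as timer

start = timer()
total = 0
for cls in classes:
    # The disk (or network) access happens here, not in list.append.
    total += len(os.listdir(os.path.join(directory, cls)))
end = timer()
print(f"listed {total} entries in {end - start:.2f} seconds")

If this step dominates the runtime, the disk-access explanation in the other answer is where to look.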
colin paice