11

I want to load data from a directory that contains around 5000 PNG images, but I get an error saying that there are no images when there obviously are. This is the code:

width=int(wb-wa)
height=int(hb-ha)
directory = '/content/drive/My Drive/Colab Notebooks/Hair/Images'
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    directory, labels=densitat, label_mode='int',
    color_mode='rgb', batch_size=32, image_size=(width, height), shuffle=True, seed=1,
    validation_split=0.2, subset='training', follow_links = False)

Returns:

ValueError: Expected the lengths of `labels` to match the number of files in the target directory. len(labels) is 5588 while we found 0 files in /content/drive/My Drive/Colab Notebooks/Hair/Images.

I can see the images: [image: Colab view of the folder structure with the images]

Where is the problem? I need to use this function to load the data in batches, as I have a large dataset.

Lluis C

3 Answers

15

I have found the answer so I am posting in case it might help someone.

The problem is the path: I was passing the path to the folder that holds the images, whereas I should have passed its parent directory (one folder above).

directory = '/content/drive/My Drive/Colab Notebooks/Hair'

Note that 'Hair' is the folder that contains my 'Images' folder.
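To see where the files sit relative to the path being passed, a small check like the one below may help. It is only a diagnostic sketch, not TensorFlow's own lookup logic; `count_files` is a hypothetical helper, and the `.png` default is an assumption taken from the question.

```python
from pathlib import Path


def count_files(directory, suffix=".png"):
    """Return (files directly in `directory`, files in its immediate subfolders).

    Hypothetical helper for diagnosis only; it does not call TensorFlow.
    """
    root = Path(directory)
    entries = list(root.iterdir())
    # Files that live directly in the given directory.
    direct = sum(
        1 for p in entries if p.is_file() and p.suffix.lower() == suffix
    )
    # Files that live one level down, inside each immediate subfolder.
    nested = sum(
        1
        for sub in entries
        if sub.is_dir()
        for p in sub.iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )
    return direct, nested


# Example (paths from the question; uncomment to run in Colab):
# print(count_files('/content/drive/My Drive/Colab Notebooks/Hair/Images'))
# print(count_files('/content/drive/My Drive/Colab Notebooks/Hair'))
```

If the second number is zero for the path you pass, the images are not inside subfolders of that path, which matches the situation that the fix above resolves by moving one folder up.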

Lluis C
  • I have a similar problem using .tif files; however, moving to a directory above doesn't help. I used ```parent = ltifs_lib[0].parents[0].parents[0].parents[0].__str__()``` (where ```ltifs_lib[0]``` is the path to the first tif as given by ```.glob("**/*.tif")``` from ```pathlib```) applying ```.parents[0]``` between 0 and 3 times. Doesn't help. Moreover, I provided a list with the labels. – Manuel Popp May 13 '21 at 16:37
5

If the accepted solution above doesn't solve your problem, it could be because you are trying to load TIFF images with a .tif extension. It turns out the only allowed formats for image_dataset_from_directory are ('.bmp', '.gif', '.jpeg', '.jpg', '.png')
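A quick way to check whether this applies to you is to tally the file extensions under your directory and compare them with the list quoted in this answer. This is a sketch under that assumption: `extension_report` is a hypothetical helper, and `ALLOWED` simply copies the tuple above rather than reading it from TensorFlow.

```python
from collections import Counter
from pathlib import Path

# Copied from the list quoted in this answer (an assumption, not read from TensorFlow).
ALLOWED = (".bmp", ".gif", ".jpeg", ".jpg", ".png")


def extension_report(directory):
    """Count files per extension (recursively) and pick out those not in ALLOWED."""
    counts = Counter(
        p.suffix.lower() for p in Path(directory).rglob("*") if p.is_file()
    )
    unsupported = {ext: n for ext, n in counts.items() if ext not in ALLOWED}
    return counts, unsupported


# Example (path from the question; uncomment to run in Colab):
# counts, unsupported = extension_report('/content/drive/My Drive/Colab Notebooks/Hair')
# print(counts, unsupported)
```

A non-empty `unsupported` dictionary (for example one containing `.tif`) would suggest converting those files to one of the listed formats first.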

Vinay Pulim
0

When you specify the labels parameter (as in the question), you need to follow the directory structure specified in the documentation (https://www.tensorflow.org/api_docs/python/tf/keras/utils/image_dataset_from_directory#expandable-1). You must pass a parent directory that contains one subdirectory per class given in the labels parameter, and each subdirectory must contain the images that belong to that class.

When you pass None as the labels parameter value, you can give the directory that contains all the images directly.
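If your images currently sit in one flat folder and you have a list of labels, one way to get the class-per-subdirectory layout described above is to copy each file into a folder named after its label. The sketch below is illustrative only: `arrange_by_class` is a hypothetical helper, and it assumes that the labels list is ordered to match the alphabetically sorted file names.

```python
import shutil
from pathlib import Path


def arrange_by_class(src_dir, labels, dest_dir):
    """Copy files from a flat folder into dest_dir/<label>/ subfolders.

    Assumption: labels[i] belongs to the i-th file in sorted order.
    Returns the sorted list of class folder names that were created.
    """
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    if len(files) != len(labels):
        raise ValueError(
            f"{len(labels)} labels given but {len(files)} files found in {src_dir}"
        )
    for path, label in zip(files, labels):
        class_dir = Path(dest_dir) / str(label)
        class_dir.mkdir(parents=True, exist_ok=True)
        # Copy rather than move, so the original folder stays untouched.
        shutil.copy2(path, class_dir / path.name)
    return sorted(d.name for d in Path(dest_dir).iterdir() if d.is_dir())
```

After this, `dest_dir` can be passed as the parent directory, and the class of each image is determined by the name of the subfolder it sits in.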

Pablo Johnson