How do I get the images from a path to pass in model.fit method?

Question

I'm getting a list of all my training images from a path with this method:

def ReadImages(Path):
    ImageList = list()
    LabelList = list()

    # Get all subdirectories
    FolderList = os.listdir(Path)

    # Loop over each directory
    for File in FolderList:
        if(os.path.isdir(Path + os.path.sep + File)):
            for Image in os.listdir(Path + os.path.sep + File):
                # Add the image path to the list
                ImageList.append(Path + os.path.sep + File + os.path.sep + Image)

                # Add a label for each image and remove the file extension
                LabelList.append(File.split(".")[0])
        else:
            ImageList.append(Path + os.path.sep + File)

            # Add a label for each image and remove the file extension
            LabelList.append(File.split(".")[0])

    return ImageList, LabelList

Now I want to call the Keras method 'model.fit(data, labels, epochs, bs)' with that data:

model = Sequential()
model.add(Dense(32, activation='tanh', input_dim=1))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

data, labels = ReadImages(TRAIN_DIR)

# Train the model, iterating on the data in batches of 32 samples
model.fit(np.array(data), np.array(labels), epochs=10, batch_size=32)

But it shows this error:

return array(a, dtype, copy=False, order=order)

ValueError: could not convert string to float: 'train_data/\\non_pdr\\im0008.ppm'

How do I convert my list of paths into data that I can feed to my model for training?

My file list looks like this:

(['train_data/non_pdr\im0001.ppm', 'train_data/non_pdr\im0002.ppm', 'train_data/non_pdr\im0003.ppm', 'train_data/non_pdr\im0004.ppm', 'train_data/non_pdr\im0005.ppm', 'train_data/non_pdr\im0006.ppm', 'train_data/non_pdr\im0007.ppm', 'train_data/non_pdr\im0008.ppm', 'train_data/non_pdr\im0009.ppm', 'train_data/non_pdr\im0010.ppm', 'train_data/non_pdr\im0011.ppm', 'train_data/non_pdr\im0012.ppm', 'train_data/non_pdr\im0013.ppm', 'train_data/non_pdr\im0014.ppm', 'train_data/non_pdr\im0015.ppm', 'train_data/non_pdr\im0016.ppm', 'train_data/non_pdr\im0017.ppm', 'train_data/non_pdr\im0018.ppm', 'train_data/non_pdr\im0019.ppm', 'train_data/non_pdr\im0020.ppm', 'train_data/non_pdr\im0021.ppm', 'train_data/non_pdr\im0022.ppm', ...

you're not actually loading any images here, just listing directory entries. maybe you could use [Pillow](https://pillow.readthedocs.io/) or [CV2](https://pypi.org/project/opencv-python/) to load the images — Sam Mason, Aug 19 '19 at 17:53

it's-yer-boy-chet · Accepted Answer · 2019-08-19T18:36:41.430

Sorry, my old answer (below) still stands as best practice, but the real problem is more fundamental. You're passing an array of path strings to your Sequential model. What the model needs is an array of image data, so you have to load the images first (see "How to convert an RGB image to numpy array?").

See the docs for the model for the variable types: https://keras.io/models/sequential/
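
A minimal sketch of that loading step, assuming Pillow and NumPy are installed and that every image should be resized to one common size. The helper name `load_images` and the 64 x 64 default are illustrative assumptions, not something from the question:

```python
import numpy as np
from PIL import Image


def load_images(paths, size=(64, 64)):
    """Load each image file in `paths` into one float32 array.

    Returns an array of shape (n_images, height, width, 3) scaled to [0, 1].
    `size` is (width, height), which is the order Pillow expects.
    """
    arrays = []
    for path in paths:
        with Image.open(path) as img:
            # Force three channels and a common size so the arrays stack cleanly.
            resized = img.convert("RGB").resize(size)
            arrays.append(np.asarray(resized).astype(np.float32) / 255.0)
    return np.stack(arrays)
```

The result of `load_images(data)` would then take the place of `np.array(data)` in the call to `model.fit`.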

Old answer

Your ValueError traceback shows a mix of forward slashes and backslashes. I'm not sure whether that comes from the actual file names or from how the paths are concatenated with os.path.sep. It's better to use the os module's built-in path-joining function, os.path.join. That might be the solution to your issue. The question would also be more complete with a sample listing of the directory's files.

# Instead of this:
Path + os.path.sep + File 

# Do this:
os.path.join(Path, File)

Also, bonus:

# This might not work depending on the filename:
File.split('.')[0] # what if the file is named 2019.08.19_file.txt?

# Try:
os.path.splitext(File)[0]
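
Putting both suggestions together, the directory walk from the question could be sketched as follows. This is one possible rewrite, not the asker's code: the name `list_image_paths` is hypothetical, and it assumes that images in a subfolder are labelled with the folder name while files directly under the root are labelled with their name minus the extension.

```python
import os


def list_image_paths(root):
    """Return (paths, labels) for files under `root`, one level deep."""
    paths, labels = [], []
    for entry in sorted(os.listdir(root)):
        entry_path = os.path.join(root, entry)
        if os.path.isdir(entry_path):
            # Images inside a subfolder are labelled with the folder name.
            for name in sorted(os.listdir(entry_path)):
                paths.append(os.path.join(entry_path, name))
                labels.append(entry)
        else:
            # Top-level files are labelled with their name minus the extension.
            paths.append(entry_path)
            labels.append(os.path.splitext(entry)[0])
    return paths, labels
```

Because every path goes through `os.path.join`, the separators are consistent for the platform the code runs on.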

Now I'm facing this error: `'with shape ' + str(data_shape))` `ValueError: Error when checking input: expected dense_19_input to have 2 dimensions, but got array with shape (391, 605, 700, 3)` — 0nroth1, Aug 19 '19 at 19:18
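
The shape in that error suggests the images did load: 391 images of 605 rows by 700 columns with 3 channels. A `Dense` first layer expects a 2-D array of shape `(samples, features)`, which is what the error is complaining about. One way to get there, sketched with NumPy only and with tiny stand-in dimensions rather than the real images, is to flatten each image into a single vector. `flatten_images` is a hypothetical helper name:

```python
import numpy as np


def flatten_images(images):
    """Reshape (n, height, width, channels) into (n, height * width * channels)."""
    return images.reshape(len(images), -1)


# Stand-in batch: 4 tiny "images" instead of the real (391, 605, 700, 3) array.
batch = np.zeros((4, 6, 7, 3), dtype=np.float32)
flat = flatten_images(batch)
print(flat.shape)  # (4, 126)
```

With input like this, the first layer's `input_dim` would need to equal the flattened length (605 * 700 * 3 for the images in the comment) rather than 1. Keras also provides a `Flatten` layer that can be placed first in the model instead. Since `binary_crossentropy` expects numeric targets, string labels such as 'non_pdr' would also need to be mapped to 0 and 1.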
