
Apparently this is a common issue; I have already checked around ten similar questions and their answers (q1, q2, q3, etc.) and followed them, but the issue was not resolved.

I have a dataset consisting of 99 images of different sizes. They were read with cv2.imread and stored in ims, and the labels are stored in labels.
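For context, the images were loaded roughly like this (image_paths is a hypothetical placeholder for my 99 file paths):

import cv2
import numpy as np

image_paths = [...]  # 99 file paths, omitted here
# note: cv2.imread returns images in BGR channel order
ims = [cv2.imread(p) for p in image_paths]

Part of my code is as follows (imports shown for completeness):

import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle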

labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
          3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7,
          7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9]

first_sift = cv2.xfeatures2d.SIFT_create(contrastThreshold=0.02, sigma=1.35)

dictionarySize = 30
bow = cv2.BOWKMeansTrainer(dictionarySize)
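# BOWKMeansTrainer accumulates SIFT descriptors from all images and,
# on cluster(), runs k-means to produce dictionarySize visual words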

for i in range(len(ims)):
    gray = cv2.cvtColor(ims[i], cv2.COLOR_RGB2GRAY)
    kp, des = first_sift.detectAndCompute(gray, None)

    bow.add(des)

d = bow.cluster()
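# d is the vocabulary: a float32 array of shape (dictionarySize, 128),
# since SIFT descriptors are 128-dimensional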

FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
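# note: this FLANN matcher is set up but never used below; the BOW
# extractor is constructed with a BFMatcher instead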
second_sift = cv2.xfeatures2d.SIFT_create()

extractor = cv2.BOWImgDescriptorExtractor(second_sift, cv2.BFMatcher(cv2.NORM_L2))
extractor.setVocabulary(d)
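# for each image, the extractor assigns every keypoint's descriptor to its
# nearest visual word and returns a normalized histogram of word counts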

feats = []
for i in range(len(ims)):
    gray = cv2.cvtColor(ims[i], cv2.COLOR_RGB2GRAY)
    result = extractor.compute(gray, first_sift.detect(gray))
    feats.append(result)

num = len(ims)

X = np.array(feats).reshape(num, dictionarySize)
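# each extractor.compute call returns a (1, dictionarySize) histogram, so
# X is (99, 30): one flat 30-dimensional vector per image, not an image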
y = np.array(labels)

labels_num = np.max(y) + 1

X, y = shuffle(X, y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

X_train = (X_train - np.min(X)) / (np.max(X) - np.min(X))
X_test = (X_test - np.min(X)) / (np.max(X) - np.min(X))

y_train_encode = keras.utils.to_categorical(y_train, labels_num)
y_test_encode = keras.utils.to_categorical(y_test, labels_num)

model = Sequential()

model.add(Conv2D(32, (3, 3), padding="same", activation="relu", input_shape=(None, None, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
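# note: Conv2D expects 4-D input batches of shape (batch, height, width, channels)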

model.summary()

model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])

hist = model.fit(X_train, y_train_encode, batch_size=32, epochs=50, verbose=1,
                 validation_data=(X_test, y_test_encode))

Following is the model summary:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, None, None, 32)    320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, None, None, 32)    0         
=================================================================
Total params: 320
Trainable params: 320
Non-trainable params: 0
_________________________________________________________________

When I run the code I get the following error:

ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=4, found ndim=2. Full shape received: [None, 30]

I should also add:

X shape is (99, 30)

y shape is (99,)

Number of labels is 10

X_train.shape is (66, 30)

X_test.shape is (33, 30)

  • What does `30` in your input shape refer to? You have 99 images (n_samples), and every image should have the shape (height, width, channels), so your input shape should have 4 dims: (n_samples, height, width, channels), as shown in the error message. – Merna Mustafa Jan 15 '21 at 07:51
  • @MernaMustafa 30 refers to the `dictionarySize` that is the number of clusters of features extracted. – Bsh Jan 15 '21 at 07:59
  • Your inputs are not images (or 3-dimensional), so you can't use Conv2D with that kind of input. It's in the [documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D): `Input shape: 4+D tensor with shape: batch_shape + (channels, rows, cols)` – Lescurel Jan 15 '21 at 09:55
  • The Conv2D layer expects input of shape `4+D tensor with shape: batch_shape + (channels, rows, cols)`. Channels will be 3 if the input images are RGB, or 1 for grayscale images. Add a batch dimension to X_train. Thanks! –  Jul 06 '21 at 02:16
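As the last two comments point out, the BoW histograms are flat 30-dimensional vectors rather than images, so a model whose first layer accepts 2-D input of shape (batch, features) would match X. A minimal sketch of that direction, reusing dictionarySize and labels_num from the code above (the hidden width of 64 is an arbitrary choice):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# Dense layers accept input of shape (batch, features), matching X's (None, 30)
model.add(Dense(64, activation="relu", input_shape=(dictionarySize,)))
model.add(Dense(labels_num, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])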

0 Answers