I am new to deep learning and Keras. I have created a model that trains on the ASL (American Sign Language) dataset, with nearly 80,000 training images and 1,500 test images. I have also added some more classes, i.e. hand sign numbers 0-9. So, in total, I have 39 classes (0-9 and A-Z). My task is to train on this dataset and use the model for prediction. The input for prediction will be a frame from a webcam in which I display the hand sign.

My Keras Model

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

classifier = Sequential()

classifier.add(Conv2D(32, (3, 3), input_shape = (100, 100, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Flatten())

classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 39, activation = 'softmax'))

classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])



from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('train',
                                                 target_size = (100,100),
                                                 batch_size = 128,
                                                 class_mode = 'categorical')

test_set = test_datagen.flow_from_directory('test',
                                            target_size = (100, 100),
                                            batch_size = 128,
                                            class_mode = 'categorical')

classifier.fit_generator(training_set,
                         steps_per_epoch = 88534,
                         epochs = 10,
                         validation_data = test_set,
                         validation_steps = 1418)

The ASL dataset images are 200x200 and the number sign images are 64x64. After running for 5 epochs and reaching 96% validation accuracy, I still cannot get good predictions when I run the model on video.

Python program for video frames

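import traceback

import cv2
import numpy as np
from keras.models import load_model

# Load the trained model; compile settings are unchanged from the original script.
classifier = load_model('asl_original.h5')
classifier.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

cam = cv2.VideoCapture(0)

while True:
    try:
        ret, frame = cam.read()
        frame = cv2.flip(frame, 1)  # mirror the frame horizontally
        roi = frame[100:400, 200:500]  # region of interest inside the green box
        cv2.rectangle(frame, (200, 100), (500, 400), (0, 255, 0), 2)
        cv2.imshow('frame', frame)
        cv2.imshow('roi', roi)
        img = cv2.resize(roi, (100, 100))  # match the model's input size
        img = np.reshape(img, [1, 100, 100, 3])  # add the batch dimension
        classes = classifier.predict_classes(img)
        print(classes)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    except Exception:
        traceback.print_exc()

# Release the webcam and close the preview windows on exit.
cam.release()
cv2.destroyAllWindows()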

I don't understand why I am not able to get accurate predictions even after training on such a large dataset. What changes do I need to make to get accurate predictions for all 39 classes?

Links to the datasets: ASL dataset and hand signs for numbers.

LalaLand
  • This has nothing to do with the title: "unnecessary folders getting zipped during zipping file using python" You need to make your titles reflective of the general question: https://meta.stackexchange.com/questions/10647/how-do-i-write-a-good-title – tnknepp Feb 11 '20 at 16:47
  • sorry, Wrong title – LalaLand Feb 11 '20 at 16:59

1 Answer

In classifier.compile you use loss='binary_crossentropy', which is meant for binary labels (only two classes). For multiclass classification you must choose the loss function according to the number and encoding of your labels (e.g. 'sparse_categorical_crossentropy' for integer labels).

Try reading this useful blog post, which explains each loss function in detail.
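To make the choice of loss concrete, here is a minimal plain-Python sketch of the two label encodings involved, assuming the 39 classes from the question. The helper names mirror the Keras loss names but are illustrative only, not the Keras implementations (which, among other things, clip probabilities for numerical stability). As far as I know, flow_from_directory with class_mode = 'categorical' yields one-hot labels, which is the encoding 'categorical_crossentropy' expects.

```python
import math

NUM_CLASSES = 39  # class count taken from the question


def to_one_hot(index, num_classes=NUM_CLASSES):
    """Return a one-hot list with 1.0 at position `index`."""
    if not 0 <= index < num_classes:
        raise ValueError("class index out of range")
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


def categorical_crossentropy(one_hot, probs):
    """Cross-entropy for a one-hot label, shape (39,) per sample."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


def sparse_categorical_crossentropy(index, probs):
    """Cross-entropy for an integer label, shape (1,) per sample."""
    return -math.log(probs[index])


# Illustrative softmax output: a uniform distribution over all classes.
probs = [1.0 / NUM_CLASSES] * NUM_CLASSES

sparse_label = 12                         # integer encoding
one_hot_label = to_one_hot(sparse_label)  # one-hot encoding of the same class

loss_one_hot = categorical_crossentropy(one_hot_label, probs)
loss_sparse = sparse_categorical_crossentropy(sparse_label, probs)
print(loss_one_hot, loss_sparse)
```

Both functions compute the same quantity; they differ only in the label format they accept, so the loss name has to match what the data generator produces.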

  • Would that improve or make the predictions accurate? – LalaLand Feb 11 '20 at 17:29
  • You'll have to try it. The accuracy reported during training is biased; read [this](https://stackoverflow.com/questions/42081257/why-binary-crossentropy-and-categorical-crossentropy-give-different-performances) answer, which explains why you get a high accuracy in training despite the wrong loss. – Simone Coslovich Feb 11 '20 at 17:53
  • While using 'sparse_categorical_crossentropy' I get this error: `ValueError: Error when checking target: expected dense_6 to have shape (1,) but got array with shape (39,)` – LalaLand Feb 12 '20 at 17:09
  • Sorry, try "categorical_crossentropy". I've replicated your experiment, and after 10 epochs the validation accuracy, measured on a split of the train generator, is around 55-56%. This result is more realistic than the previous one. – Simone Coslovich Feb 13 '20 at 09:02
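The gap between the reported 96% and the roughly 55% above can be illustrated with a small plain-Python sketch. It assumes, as the linked answer describes, that with loss = 'binary_crossentropy' the 'accuracy' metric is computed element-wise over the 39 output units rather than per image. The function names and the example prediction are hypothetical, chosen only for illustration.

```python
NUM_CLASSES = 39  # class count taken from the question


def one_hot(index, n=NUM_CLASSES):
    """Return a one-hot list with 1.0 at position `index`."""
    vec = [0.0] * n
    vec[index] = 1.0
    return vec


def binary_accuracy(target, probs, threshold=0.5):
    """Fraction of output units whose thresholded value matches the target."""
    hits = sum(1 for t, p in zip(target, probs)
               if (p > threshold) == (t > threshold))
    return hits / len(target)


def categorical_accuracy(target, probs):
    """1.0 if the most probable class is the true class, else 0.0."""
    return float(target.index(max(target)) == probs.index(max(probs)))


true_class, predicted_class = 3, 7  # a deliberately wrong prediction
target = one_hot(true_class)

# Confident but wrong output: 0.9 on class 7, the rest spread evenly.
probs = [0.1 / (NUM_CLASSES - 1)] * NUM_CLASSES
probs[predicted_class] = 0.9

element_wise = binary_accuracy(target, probs)
per_image = categorical_accuracy(target, probs)
print(element_wise, per_image)
```

Only two of the 39 output units disagree with the target here, so the element-wise score is 37/39 even though the image is misclassified. Under that assumption, a model that is wrong on every image would still report roughly 95% accuracy.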