I am working on face recognition, which also involves object detection. I am using YOLOv5 for object detection and FaceNet for face recognition. I am getting a very low frame rate (~0.4 fps), which makes the application laggy. How do I limit the fps for the first N frames for a few preliminary tasks, and then process only 1 frame per second for the recognition task instead of 30 frames per second?

I tried using cap.set(cv2.CAP_PROP_FPS, 5), but I get an error saying 'Can't grab a frame.'

with tf.Graph().as_default():
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.6)
    sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options, log_device_placement=False))
    with sess.as_default():
        pnet, rnet, onet = detect_face.create_mtcnn(sess, './models/')

        minsize = 20  # minimum size of face
        threshold = [0.6, 0.7, 0.7]  # thresholds for the three MTCNN stages
        factor = 0.709  # scale factor
        margin = 44
        frame_interval = 3
        batch_size = 1000
        image_size = 182
        input_image_size = 160

        print('Loading feature extraction model')
        modeldir = './models/'
        facenet.load_model(modeldir)

        images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
        embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
        phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
        embedding_size = embeddings.get_shape()[1]

        classifier_filename = './myclassifier/my_classifier.pkl'
        classifier_filename_exp = os.path.expanduser(classifier_filename)
        with open(classifier_filename_exp, 'rb') as infile:
            (model, class_names) = pickle.load(infile)
            print('load classifier file-> %s' % type(class_names))
        HumanNames = class_names
        video_capture = cv2.VideoCapture(0)
        c = 0

        print('Start!')
        prevTime = 0
        
        FPSLimit = 10
        StartTime = time.time()
        
        
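        while True:
            ret, frame = video_capture.read()
            if not ret:    # stop if the camera did not return a frame
                break

            # frame = cv2.resize(frame, (0,0), fx=0.5, fy=0.5)    #resize frame (optional)

            curTime = time.time()    # used to calculate fps
            timeF = frame_interval
            #if int(curTime - StartTime) > FPSLimit:
            if (c % timeF == 0):    # process only every timeF-th frame
                # DETECTION TASK (placeholder; expected to set nrof_faces)
                if nrof_faces > 0:
                    # OBJECT DETECTION TASKS (placeholder)
                    # RECOGNITION TASK (placeholder)
                    pass
            c += 1    # advance the frame counter (not shown in the original snippet)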


harry r

1 Answer

I tried to add this as a comment, but it got too long. This isn't exactly an answer to your question, but it may suggest an approach. I would start by locating the delay: measure how many milliseconds each of the DETECTION TASK, OBJECT DETECTION TASKS, and RECOGNITION TASK steps in your code above takes.
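A minimal sketch of that measurement, using only the standard library. The names `timed`, `report`, and `stage_ms` are made up for illustration, and the stage labels are just strings you pick:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Wall-clock samples per stage, in milliseconds.
stage_ms = defaultdict(list)


@contextmanager
def timed(stage):
    """Record how long the enclosed block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_ms[stage].append((time.perf_counter() - start) * 1000.0)


def report():
    """Return the mean time per stage in milliseconds."""
    return {name: sum(samples) / len(samples) for name, samples in stage_ms.items()}


# Hypothetical usage inside the capture loop:
# with timed("detection"):
#     ...  # DETECTION TASK
# with timed("recognition"):
#     ...  # RECOGNITION TASK
# print(report())
```

Whichever stage dominates the report is the one worth throttling or moving off the main loop.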

You should be able to run detection in real time, or at least at 5-10 FPS, which may suit your needs. If recognition is your bottleneck, I would move it to another thread. That works because you don't need to recognize the same face over and over. If you capture at 30 FPS and the same face stays in the frame for 5 seconds, perform recognition on that face once, not 5 × 30 = 150 times.
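One way to move recognition off the capture loop is a worker thread fed by a bounded queue. This is only a sketch: `recognize` stands in for whatever slow call you have (for example the FaceNet embedding plus your classifier), and `start_recognition_worker` and `submit` are hypothetical helper names.

```python
import queue
import threading


def start_recognition_worker(recognize, results):
    """Run `recognize` on queued faces in a background thread.

    `recognize` is a placeholder for the slow recognition call;
    `results` is a dict that receives key -> label as jobs finish.
    """
    jobs = queue.Queue(maxsize=8)  # bounded so a slow recognizer cannot pile up work

    def worker():
        while True:
            item = jobs.get()
            if item is None:  # sentinel: stop the worker
                break
            key, face = item
            results[key] = recognize(face)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return jobs, thread


def submit(jobs, key, face):
    """Queue a face without blocking; return False if the queue is full."""
    try:
        jobs.put_nowait((key, face))
        return True
    except queue.Full:
        return False
```

The capture loop calls `submit` and keeps drawing frames; it looks up `results` whenever a label is available. Dropping jobs when the queue is full keeps the display responsive at the cost of a delayed label.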

Use multi-object tracking to follow objects (faces) across frames without running face recognition on every frame. A simple tracking algorithm is easy to implement and runs fast. Keep track of objects across frames, then submit each track for recognition only once, and do that on another thread.
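To illustrate the idea, here is a rough sketch of a greedy nearest-centroid tracker. It is a simplification I am assuming for the example, not necessarily the algorithm referred to above; the class name, the `max_distance` and `max_missed` defaults, and the `(x1, y1, x2, y2)` box format are all assumptions.

```python
import math


class CentroidTracker:
    """Greedy nearest-centroid tracker (illustrative simplification)."""

    def __init__(self, max_distance=50.0, max_missed=10):
        self.max_distance = max_distance  # pixels; assumed value, tune per camera
        self.max_missed = max_missed      # frames a track may go unseen
        self.next_id = 0
        self.tracks = {}   # track id -> (cx, cy) of the last matched box
        self.missed = {}   # track id -> consecutive frames without a match

    def update(self, boxes):
        """Match (x1, y1, x2, y2) boxes to tracks; return (assigned, new_ids)."""
        assigned = {}
        new_ids = []
        unmatched = set(self.tracks)
        for box in boxes:
            cx = (box[0] + box[2]) / 2.0
            cy = (box[1] + box[3]) / 2.0
            # Pick the closest still-unmatched track within max_distance.
            best_id, best_dist = None, self.max_distance
            for tid in unmatched:
                tx, ty = self.tracks[tid]
                dist = math.hypot(cx - tx, cy - ty)
                if dist <= best_dist:
                    best_id, best_dist = tid, dist
            if best_id is None:
                # No track is close enough: start a new one.
                best_id = self.next_id
                self.next_id += 1
                new_ids.append(best_id)
            else:
                unmatched.discard(best_id)
            self.tracks[best_id] = (cx, cy)
            self.missed[best_id] = 0
            assigned[best_id] = box
        # Age out tracks that were not seen in this frame.
        for tid in unmatched:
            self.missed[tid] += 1
            if self.missed[tid] > self.max_missed:
                del self.tracks[tid]
                del self.missed[tid]
        return assigned, new_ids
```

Only the ids returned in `new_ids` need to be sent for recognition; every later frame reuses the label already attached to that track id.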

j2abro