MTCNN_face_detection_alignment lagging in IP camera, Convention behind opencv cv2 videocapture frame matrix

Question

I am trying to detect and recognize faces in frames read through cv2.VideoCapture. For detection I am using a TensorFlow implementation of the face detection / alignment algorithm found at https://github.com/kpzhang93/MTCNN_face_detection_alignment. MTCNN face detection shows no lag with the built-in webcam or with an external camera connected over USB. With an IP camera, however, the detection algorithm lags considerably: it takes more time to process a single frame from the IP camera than a frame from the built-in camera. Parameters such as image resolution and image detail can have an impact. To understand this further, I would like to know which parameters, other than resolution and image detail, affect the processing time.

I noticed that the frame matrix values differ between the built-in webcam and the IP camera, and also between Linux and Windows. How are the frame matrix values calculated? Which parameters define them? I am also wondering why the frame matrix is always 0 for frames from the built-in webcam on Windows.

Built-in webcam (Windows), resolution 480.0 640.0. Frame matrix printed in Python:

import cv2
video_capture = cv2.VideoCapture(0)
ret, frame = video_capture.read()
print(frame)

[[[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 ...

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]]

IP camera, resolution 1080.0 1920.0. The IP camera frame matrix, printed the same way:

[[[ 85  81  64]
  [ 69  65  48]
  [ 61  57  40]
  ...
  [131  85  19]
  [131  85  19]
  [131  85  19]]

 [[ 74  70  53]
  [ 78  74  57]
  [ 70  66  49]
  ...
  [131  85  19]
  [131  85  19]
  [131  85  19]]

 [[ 72  68  51]
  [ 76  72  55]
  [ 73  69  52]
  ...
  [131  85  19]
  [131  85  19]
  [131  85  19]]

 ...

 [[ 74  74  67]
  [ 74  74  67]
  [ 75  75  68]
  ...
  [ 14  14  18]
  [ 21  21  25]
  [ 34  34  38]]

 [[ 74  74  67]
  [ 74  74  67]
  [ 75  75  68]
  ...
  [ 20  20  24]
  [ 27  27  31]
  [ 28  28  32]]

 [[ 74  74  67]
  [ 75  75  68]
  [ 75  75  68]
  ...
  [ 28  28  32]
  [ 28  28  32]
  [ 21  21  25]]]

score 2 · Answer 1 · answered Sep 06 '19 at 04:13

Your web camera might leave the first and last few lines empty for some technical reason. Try printing the average color of every line; you will probably end up with something like this:

np.mean( frame, axis=(1,2) )
[ 0, 0, 34, 42, .... 75, 129, 0, 0 ]
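A self-contained version of this check, as a minimal sketch: the frame here is synthetic (a uniform gray image with two black rows at the top and bottom, chosen purely for illustration), but the same calls apply to a frame read from a camera.

```python
import numpy as np

# Synthetic frame (assumption): uniform gray with black bars of 2 rows
# at the top and at the bottom, mimicking empty border lines.
frame = np.full((480, 640, 3), 90, dtype=np.uint8)
frame[:2] = 0
frame[-2:] = 0

# Average over columns and channels: one mean value per image row.
row_means = frame.mean(axis=(1, 2))

# Indices of the rows that contain any non-black content.
nonzero_rows = np.flatnonzero(row_means)
first_row, last_row = int(nonzero_rows[0]), int(nonzero_rows[-1])

# True if the frame is not entirely black, even when print(frame) shows only zeros.
has_content = bool(np.count_nonzero(frame))

print(first_row, last_row, has_content)  # 2 477 True
```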

answered Sep 06 '19 at 04:13

lenik


Thanks. Yes, np.mean(frame.. shows non-zero values in between for the webcam. Can you advise: 1) why is the processing time of MTCNN detectface considerably higher for a frame taken from the IP camera than for one from the webcam? 2) Which parameters determine the frame matrix values? – Jeyan Sep 06 '19 at 07:37
Your IP camera has a resolution of 1920x1080 (~2M pixels), the web camera 640x480 (0.3M); the number of pixels to process is about 7 times higher for the IP camera, hence the difference in processing speed. If you use OpenCV, you may try `half = cv2.pyrDown( frame )` to scale your image to half the width and height and see how this affects the processing speed. – lenik Sep 06 '19 at 09:31
I was trying the same thing: reducing the IP camera frame to 640*480 with cv2.resize to match the webcam resolution. That reduces the lag, but the processing time is still not the same as for the webcam image. The lag drops further if we reduce the resolution more, "but that affects face detection". I noticed that the webcam image is around 70KB versus 180KB for the IP camera image at the same (640*480) resolution. It seems other parameters such as embedding, color detail, and noise also affect the file size. I am looking to scale down the IP frame by file size without affecting the resolution. – Jeyan Sep 06 '19 at 14:44
Please record the frames from the IP camera into a file and measure the processing using that file. It may very well be that the problem is not with the actual file processing but with your network setup or something else. – lenik Sep 07 '19 at 07:39
OK, let me try that. However, can you please let me know what makes you think that a frame captured from the IP camera by cv2 would have an issue with the network setup? Just trying to understand. I captured both frames with cv2, I use the same setup for processing the webcam and IP camera frames, and I am testing just a single frame from each. – Jeyan Sep 07 '19 at 13:55
@noboundaries just a hunch and a bit of experience working with different cameras. sometimes it takes quite a while to capture a frame, and there's a good chance you include this into the "processing time" – lenik Sep 08 '19 at 03:38
Right, there was a difference of microseconds between capturing frames from the webcam and from the IP camera. I noted that separately and did not include it in the detection processing time. As you pointed out earlier, the face detection time alone is strongly affected by resolution. Reducing the resolution reduces the lag, but it hurts face detection once you scale down beyond a certain point. I noticed that the IP camera frame file is still bigger, around 100KB, even after reducing its resolution to match the webcam frame. – Jeyan Sep 08 '19 at 17:37
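Two points from the comments above can be checked with a short sketch: the pixel-count ratio between the two resolutions, and keeping capture time separate from detection time. The helper names `pixel_ratio` and `time_stages` are hypothetical, and the two stub callables stand in for the real camera read and the MTCNN call, which are not reproduced here.

```python
import time


def pixel_ratio(res_a, res_b):
    """Ratio of pixel counts between two (width, height) resolutions."""
    return (res_a[0] * res_a[1]) / (res_b[0] * res_b[1])


def time_stages(read_frame, detect, runs=5):
    """Time capture and detection separately; return mean seconds for each."""
    capture_s = 0.0
    detect_s = 0.0
    for _ in range(runs):
        t0 = time.perf_counter()
        frame = read_frame()          # capture only
        t1 = time.perf_counter()
        detect(frame)                 # detection only
        t2 = time.perf_counter()
        capture_s += t1 - t0
        detect_s += t2 - t1
    return capture_s / runs, detect_s / runs


# 1920x1080 versus 640x480: the "about 7 times" mentioned above.
ratio = pixel_ratio((1920, 1080), (640, 480))
print(ratio)  # 6.75

# Stubs (assumptions) in place of a real camera and a real detector.
mean_capture, mean_detect = time_stages(lambda: [0] * 1000, lambda f: sum(f))
print(mean_capture >= 0.0, mean_detect >= 0.0)  # True True
```

With a real camera, `read_frame` could be something like `lambda: cap.read()[1]` and `detect` the face detection call, so that a slow network read does not get counted as detection time.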

score 0 · Answer 2 · answered Oct 18 '19 at 20:34

Consider not using every frame that you receive from your IP camera, for example:

_, frame = cap.read()
_, frame = cap.read()
_, frame = cap.read()
processFrame(frame)

Or just add a delay so that you do not receive every frame.
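The frame-skipping idea can be wrapped in a small helper. This is a sketch under assumptions: `read_latest` is a hypothetical name, and `FakeCapture` is a stand-in that mimics only the `grab()` and `read()` methods of `cv2.VideoCapture` so the example runs without a camera. With OpenCV, `grab()` advances the stream without decoding the frame, which makes it a cheaper way to discard frames than repeated `read()` calls.

```python
class FakeCapture:
    """Hypothetical stand-in exposing the grab()/read() subset of cv2.VideoCapture."""

    def __init__(self, total_frames):
        self.total_frames = total_frames
        self.position = 0

    def grab(self):
        # Advance one frame without decoding it.
        if self.position >= self.total_frames:
            return False
        self.position += 1
        return True

    def read(self):
        # Advance one frame and return it; the frame index stands in for the image.
        if not self.grab():
            return False, None
        return True, self.position - 1


def read_latest(cap, skip=2):
    """Discard `skip` frames with grab(), then read and return the next one."""
    for _ in range(skip):
        if not cap.grab():
            return False, None
    return cap.read()


cap = FakeCapture(total_frames=10)
ok, frame = read_latest(cap, skip=2)
print(ok, frame)  # True 2
```

A real `cv2.VideoCapture` object can be passed in place of `FakeCapture`; the value of `skip` is a tuning choice that depends on the camera frame rate and on how long detection takes.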

answered Oct 18 '19 at 20:34

Reddy Tintaya

