Most often, people don't consider the data rate (bandwidth) required to get their data across the wire.
1920 x 1080 pixels at 30 FPS with 24 bits per pixel works out to roughly 1.5 Gbit/s. That won't fit through USB 2, which tops out at 480 Mbit/s.
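The arithmetic behind that figure, as a quick sanity check:

```python
# Uncompressed video bandwidth: width * height * bits-per-pixel * frames-per-second.
width, height = 1920, 1080
bits_per_pixel = 24           # 8 bits each for R, G, B
fps = 30

bits_per_second = width * height * bits_per_pixel * fps
print(f"{bits_per_second / 1e9:.2f} Gbit/s")   # 1.49 Gbit/s

usb2_bits_per_second = 480e6  # USB 2.0 high-speed signaling rate
print(bits_per_second > usb2_bits_per_second)  # True: it won't fit
```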
That is why many webcams implement compression. A common choice is MJPEG (Motion JPEG), which is simply a stream of JPEG-compressed frames.
You can tell OpenCV to tell the media API (V4L, DirectShow, MSMF, AVFoundation, ...) to request that from the camera. In my experience, it won't do so on its own.
import cv2 as cv

# cap = cv.VideoCapture(...)
cap.set(cv.CAP_PROP_FOURCC, cv.VideoWriter_fourcc(*"MJPG"))
Less data per frame means the transfer completes faster, allowing more frames per unit of time.
When combining this with other properties (CAP_PROP_FRAME_WIDTH etc.), order matters. In my experience the FOURCC setting needs to come first; if that doesn't help, try other orders.
When a webcam notices that there isn't enough free bandwidth left on the USB controller, it may decide to reduce frame size or frame rate, or to apply stronger compression (worse pictures). This will happen when you attach multiple cameras to the same USB hub. The negotiation may cause opening the camera to take several seconds, even a minute. So if opening the camera is slow, the USB controller may already have bandwidth reservations for other devices.
What does affect frame rate is when the other code in your loop simply takes a lot of time. There is no way around that, except to make that part of the code faster.
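To make that concrete, here is a tiny timing sketch; the 50 ms sleep is a stand-in for whatever per-frame work your loop does. If each iteration spends 50 ms on processing, you cannot exceed 20 FPS, no matter how fast the camera delivers:

```python
import time

def process(frame):
    """Stand-in for 50 ms of per-frame work (detection, encoding, ...)."""
    time.sleep(0.05)

t0 = time.perf_counter()
frames = 10
for _ in range(frames):
    process(None)
elapsed = time.perf_counter() - t0

print(f"effective rate: {frames / elapsed:.1f} FPS")  # a bit under 20 FPS
```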
What will not generally matter is moving the reading into its own thread. In the OpenCV case (with imshow), a thread makes no sense at all. A thread only makes sense if there is something else to do with the time ordinarily spent waiting for the next frame (e.g. running a GUI loop). Communication with consumers needs to be done properly; busy loops are not the solution.
In the case of actual GUIs (not OpenCV's imshow), the proper thing to do is a thread, but it should update the widget as each frame comes in. There must not be a spinning loop that repeatedly reads a variable and assigns it to a widget's image.
The camera will produce frames at its own pace (no matter what you do), and put them into a queue. Reading a frame costs fairly little (but not nothing).
You must read from the camera, or else the frames queue up. If you read slower than the camera's frame rate, you'll see increasing latency, i.e. movement in front of the camera takes seconds to show up on your screen.
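A back-of-envelope model of that buildup, idealized in that it ignores the driver's finite queue depth (a real driver eventually starts dropping frames instead of queuing forever): a 30 FPS camera read at only 10 FPS means each successive frame you read is older than the last.

```python
producer_fps = 30.0   # the camera's pace
consumer_fps = 10.0   # how fast you actually call read()

latencies = []
for n in (1, 10, 30):
    arrival  = (n - 1) / producer_fps   # when the n-th frame was captured
    consumed = (n - 1) / consumer_fps   # when you finally get around to it
    latencies.append(consumed - arrival)
    print(f"frame {n:2d}: latency {consumed - arrival:.2f} s")
```

The latency grows without bound (until the queue fills), which is exactly the "movement takes seconds to show up" symptom.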
If you try to read faster, the read() call will simply block until a frame is available. Then you'll "waste" time in that call. But what would you do with that time? You have no new data yet, so there's nothing to compute until that frame arrives. Might as well wait, right?