I’m working on a prototype application for a research project, written primarily in Python (2.7). The application receives an MPEG-TS video stream from a wireless device over UDP and then uses OpenCV to process and analyze the video frames. The MPEG decompression is done in my own C library, which is basically just a Python-bound wrapper around libavcodec. The application is a Python GLUT app that uses GLUT for drawing and some basic event handling.

My problem is that when my OpenCV processing or drawing code taxes the CPU, I get dropped UDP packets, resulting in corrupted video frames. I know that UDP is an ‘unreliable’ protocol and that dropped packets are to be expected, but unfortunately it’s what I have to work with.

Fortunately, I’m running this on a decently fast machine (MacBookPro11,3, 2.8GHz quad-core i7). My aim is to create a system using threads and queues in which the consumption of UDP packets and video decompression is prioritized, so that every decompressed frame is received intact (barring actual network errors). If my main-thread drawing or processing is unable to keep up with the video stream’s frame rate, I’d like to drop entire decompressed frames so that the MPEG stream remains coherent. The alternative is to have individual UDP packets dropped, but this results in a corrupted image stream that is not restored for some period of time, i.e. until the next I-frame is received, I believe. This is the situation that I’m trying to avoid.

I’ve already created such a system: a background thread is spawned that does all of the work of creating the video decompression context and the UDP socket. The background thread loops forever asking the decompressor for decoded frames, and the decompressor in turn calls a callback, as necessary, which waits on more data from the socket (using select, poll or a blocking recv; I’ve tried all three). Each time a new decompressed frame arrives, the background thread adds it to a queue, which the main thread consumes as fast as it can. If the queue is full because the main thread can’t keep up, the newly decompressed frame is simply discarded and the process continues.
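For readers following along, the drop-when-full behaviour described above can be sketched roughly as follows. This is only an illustration, not the code from the question: `offer_frame`, the queue size of 2 and the string "frames" are all hypothetical stand-ins.

```python
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7, as used in the question


def offer_frame(frame_queue, frame):
    """Try to enqueue a decoded frame without blocking the producer.

    Returns True if the frame was queued, False if the queue was full
    and the frame was dropped. (Hypothetical helper for illustration.)
    """
    try:
        frame_queue.put_nowait(frame)
        return True
    except queue.Full:
        return False


# Hypothetical usage: a queue holding at most two decoded frames,
# with no consumer running, so later frames are dropped.
frames = queue.Queue(maxsize=2)
results = [offer_frame(frames, "frame-%d" % i) for i in range(4)]
```

Because `put_nowait` never blocks, the producer gets back to the socket immediately whether or not the consumer is keeping up.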

The problem that I’m having is that even with this system, a heavy load on the main thread still causes UDP packets to be dropped. It’s almost as if the kernel packet scheduler, which receives and buffers the incoming UDP packets, is running on my main thread, which is doing all of the drawing and processing of video frames (I only half know what I’m talking about here, re. packet scheduler). If I comment out all of the heavy processing and drawing code running on the main thread, then I never get any dropped packets or MPEG decode errors.

I’ve already tried maxing out my socket receive buffer, which helps somewhat, but it also increases latency, which is undesirable, and it ultimately just delays the problem.
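For reference, enlarging the receive buffer from Python looks roughly like this. It is a minimal sketch, not the question's code: `make_udp_socket` is a hypothetical helper and the 64 KiB request is an arbitrary example value. The operating system may cap or adjust the request, so the sketch reads the value back to see what was actually granted.

```python
import socket


def make_udp_socket(requested_bytes):
    """Create a UDP socket and ask the kernel for a larger receive buffer.

    Returns (sock, granted_bytes). The granted size is whatever the OS
    reports afterwards, which may differ from the requested size.
    (Hypothetical helper for illustration.)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted


# Example request of 64 KiB (an arbitrary size chosen for the sketch).
sock, granted = make_udp_socket(64 * 1024)
sock.close()
```

A larger buffer only absorbs bursts; if the reader is stalled for longer than the buffer can cover, packets are still lost, which matches the "just delays the problem" observation.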

So my question is, what can I do to ensure that all of my UDP packets are being consumed and passed to the decompressor as rapidly as possible, independent of the main thread cpu load?

I’ve tried setting the background thread’s priority to 1.0, but that doesn’t help. By default libavcodec spawns 9 threads to handle the decompression, but I can optionally restrict it to 1, which I’ve tried, to ensure that all of the decompression happens on the same (high-priority) thread. Looking at my CPU monitor, I’ve got plenty of headroom on my quad-core processor (8 logical cores with hyperthreading, which I’ve tried turning on and off too).

I’d be happy to make kernel tweaks as root if necessary, as this is just a research project and not a shipping application.

Any advice?

TIA

hyperspasm
  • So how many threads are there, and how are the queues structured? – Nikolai Fetissov Aug 09 '15 at 20:14
  • There are two scenarios I’ve tried: a) main thread + one high priority thread for the socket and decompression b) main thread + one high priority thread for the socket and decompression + allowing the decompression to spawn 9 additional threads (with some unknown priority). – hyperspasm Aug 09 '15 at 22:07

1 Answer

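Python doesn't directly support setting thread priorities, but some people have had luck doing that with ctypes. BUT, due to the way the GIL works, that probably won't give you the best results anyway: only one thread can execute Python bytecode at a time, so a busy main thread can hold up your UDP thread regardless of its priority.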

The best thing to do is probably to use multiprocessing to place your UDP thread in a separate process, and use a queue to transport the video from that process to your main process.
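A rough sketch of that layout, under some assumptions: `receiver_main`, `start_receiver` and `make_source` are hypothetical names, and `make_source` stands in for whatever opens the socket and decoder and yields decoded frames. It is called inside the child so that those resources are created in the receiver process rather than inherited.

```python
import multiprocessing

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7


def receiver_main(frame_queue, make_source):
    """Body of the receiver process.

    make_source is a hypothetical module-level callable that returns an
    iterable of decoded frames (socket + decoder in a real application).
    Frames are dropped, not waited on, when the consumer falls behind.
    Returns the number of dropped frames once the source is exhausted.
    """
    dropped = 0
    for frame in make_source():
        try:
            frame_queue.put_nowait(frame)
        except queue.Full:
            dropped += 1
    return dropped


def start_receiver(make_source, max_frames=4):
    """Start the receiver in its own process; return (process, queue)."""
    frame_queue = multiprocessing.Queue(maxsize=max_frames)
    proc = multiprocessing.Process(target=receiver_main,
                                   args=(frame_queue, make_source))
    proc.daemon = True      # do not keep the application alive on exit
    proc.start()
    return proc, frame_queue
```

The main process would then call `start_receiver(...)` (under an `if __name__ == "__main__":` guard) and pull frames with `frame_queue.get()`. Since the receiver has its own interpreter and its own GIL, heavy work in the main process no longer stops it from reading the socket. Note that frames put on a `multiprocessing.Queue` are pickled and copied between processes, which can be costly for large frames.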

To avoid deadlocks, you should start the UDP process before starting any threads (starting a process while threads are already running is problematic because the thread and IPC state is not copied properly to the subprocess). Then, after starting the UDP process, you should lower the priority of the main process before starting any threads in it.
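The ordering could be sketched like this. It is an assumption-laden illustration: the helper names and the niceness increment of 10 are made up, and `os.nice` is used as one Unix-only way to lower a process's priority.

```python
import multiprocessing
import os
import threading


def lower_own_priority(increment):
    """Raise this process's niceness (i.e. lower its priority). Unix only.

    Returns the new niceness value reported by os.nice.
    """
    return os.nice(increment)


def start_threads(targets):
    """Start one daemon thread per callable and return the Thread objects."""
    threads = []
    for target in targets:
        t = threading.Thread(target=target)
        t.daemon = True
        t.start()
        threads.append(t)
    return threads


def start_in_order(receiver_target, worker_targets, nice_increment=10):
    """Hypothetical startup sequence following the order described above."""
    # 1. Start the receiver process while this process has no extra threads.
    receiver = multiprocessing.Process(target=receiver_target)
    receiver.daemon = True
    receiver.start()
    # 2. Lower the main process's priority; the already-started receiver
    #    keeps the niceness it had when it was created.
    new_nice = lower_own_priority(nice_increment)
    # 3. Only now start threads in the main process.
    workers = start_threads(worker_targets)
    return receiver, workers, new_nice
```

As with any multiprocessing program, `start_in_order` would be called from under an `if __name__ == "__main__":` guard, with `receiver_target` being a module-level function.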

Here is an interesting paper that is not directly on-point, but gives some good information on Python (3, unfortunately) and thread priorities.

Patrick Maupin
  • Terrific! this answer has some great leads. I didn’t know about the GIL, but that’s a good piece of info. I’ve seen other Python apps that spawn separate processes but didn’t know why. I’ll investigate further and come back later to mark this as correct if I can solve my problem using this info. Thanks! – hyperspasm Aug 09 '15 at 22:07
  • BTW, I'm using threading2 to set thread priorities https://pypi.python.org/pypi/threading2 – hyperspasm Aug 09 '15 at 22:10
  • You mean you were already doing that, or you are doing that now? (The reason I ask is that a regular program can only decrease its priority, not increase it, so if you were doing it before and it wasn't working, that may be why.) – Patrick Maupin Aug 10 '15 at 00:22
  • Yes, I was already doing that (priority=1.0) to the new thread that I created for the socket/decompression, but it didn't seem to make a difference. I tried decreasing the main thread's priority, which failed, I don't remember why, probably because it was created for me rather than by threading2. I could try decreasing the main thread's priority with ctypes, but it really seems like the GIL is probably my main problem. I'm working on adapting my strategy to use multiprocessing, I'm hopeful that this will be the ticket. – hyperspasm Aug 10 '15 at 02:58
  • Wonderful! Thanks so much for your advice. I was able to restructure my implementation to use multiprocessing as you described, and that solved my problem. It was in fact (I believe) the GIL that was causing me trouble. Another key piece is sharedmem (https://pypi.python.org/pypi/sharedmem), which I'm using to pass video frames between processes. – hyperspasm Aug 16 '15 at 02:20