4

The software in question is a native C++/MFC application that receives a large amount of data over UDP and then processes the data for display, sound output, and writing to disk among other things. I first encountered the problem when the application's CHM help document was launched from its help menu and then I clicked around the help document while gathering data from the hardware. To replicate this, an AutoHotkey script was used to rapidly click around in the help document while the application was running. As soon as any sound occurred on the system, I started getting errors.

If I have the sound card completely disabled, everything processes fine with no errors, though sound output is obviously disabled. However, if I have sound playing (in this application, a different application or even just the beep from a message box) I get thousands of dropped packets (we know this because each packet is timestamped). As a second test, I didn't use my application at all and just used Wireshark to monitor incoming packets from the hardware. Sure enough, whenever a sound played in Windows, we had dropped packets. In fact, sound doesn't even have to be actively playing to cause the error. If I simply create a buffer (using DirectSound8) and never start playing, I still get these errors.

This occurs on multiple PCs with multiple combinations of network cards (both fiber optic and RJ45) and sound cards (both integrated and separate cards). I've also tried different driver versions for each NIC and sound card. All tests have been on 32-bit Windows 7. Since my application uses DirectSound for audio, I've tried different cooperative levels (normal operation is DSSCL_PRIORITY) with no success.

At this point, I'm pretty convinced it has nothing to do with my application and was wondering if anyone had any idea what could be causing this problem before I started dealing with the hardware vendors and/or Microsoft.

bsruth
  • It is known that Microsoft built some weird anti-feature into the Windows Vista kernel that will degrade I/O performance preventatively to make sure that multimedia applications (Windows Media Player, DirectX) get 100% responsiveness. I don't know if that also means packet loss with UDP. Read this lame justification for the method: http://blogs.technet.com/b/markrussinovich/archive/2007/08/27/1833290.aspx One of the comments summarizes this quite well: "Seems to me Microsoft tried to 'fix' something that wasn't broken." – ypnos Jan 11 '11 at 18:59
  • You are aware that this is the problem with UDP, correct? It is an unreliable delivery method, so error checking must be included in the protocol layered on top of UDP. Or, just use TCP. (In addition, you have the information presented by ypnos, to which I have no exposure/visibility.) – KevinDTimm Jan 11 '11 at 19:12
  • @KevinDTimm - I am aware that UDP is, by definition, unreliable. Since it operates without errors indefinitely in the absence of sound (and I don't have any way to change the hardware), I'm looking for solutions to the problem caused by the sound. ypnos has given a good starting point, now to find a workaround. – bsruth Jan 11 '11 at 20:32
  • @ypnos - Your comment led me down the right path and I think solved the problem. Turn your comment into an answer and I'll mark it accepted. The key was to disable network throttling by setting the NetworkThrottlingIndex to 0xFFFFFFFF. – bsruth Jan 11 '11 at 22:28
  • haha took me a while to realise the other answer was from yourself. :) thumbs up for taking the time to give a good help to future readers! – ypnos Jan 12 '11 at 11:34

2 Answers

6

It turns out that this behavior is by design. Windows Vista and later implemented something called the Multimedia Class Scheduler service (MMCSS) that is intended to make all multimedia playback as smooth as possible. Since multimedia playback relies on hardware interrupts to ensure smooth playback, any competing interrupts will cause problems. One of the major hardware interrupt sources is network traffic. Because of this, Microsoft decided to throttle the network traffic when a program was running under MMCSS.

I guess this was a big deal back in 2007 when Vista came out, but I missed it. There was an article by Mark Russinovich (thanks ypnos) describing MMCSS. It seems that my entire problem boiled down to this:

Because the standard Ethernet frame size is about 1500 bytes, a limit of 10,000 packets per second equals a maximum throughput of roughly 15MB/s. 100Mb networks can handle at most 12MB/s, so if your system is on a 100Mb network, you typically won’t see any slowdown. However, if you have a 1Gb network infrastructure and both the sending system and your Vista receiving system have 1Gb network adapters, you’ll see throughput drop to roughly 15%. Further, there’s an unfortunate bug in the NDIS throttling code that magnifies throttling if you have multiple NICs. If you have a system with both wireless and wired adapters, for instance, NDIS will process at most 8000 packets per second, and with three adapters it will process a maximum of 6000 packets per second. 6000 packets per second equals 9MB/s, a limit that’s visible even on 100Mb networks.
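As a quick sanity check on the figures in that passage, here is a minimal C++ sketch of the arithmetic. It assumes 1500-byte frames and decimal megabytes (1 MB = 1,000,000 bytes), which is how the quoted numbers work out; the function and constant names are made up for illustration.

```cpp
#include <cstdint>

// Assumption: ~1500-byte Ethernet frames, as stated in the quoted passage.
constexpr std::uint64_t kFrameBytes = 1500;

// Throughput ceiling in bytes per second for a given packets-per-second cap.
constexpr std::uint64_t maxBytesPerSecond(std::uint64_t packetsPerSecond,
                                          std::uint64_t frameBytes = kFrameBytes) {
    return packetsPerSecond * frameBytes;
}

// Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes).
constexpr double toMegabytes(std::uint64_t bytes) {
    return bytes / 1000000.0;
}

// Compile-time checks against the three caps mentioned in the quote.
static_assert(maxBytesPerSecond(10000) == 15000000, "one adapter: 15 MB/s");
static_assert(maxBytesPerSecond(8000) == 12000000, "two adapters: 12 MB/s");
static_assert(maxBytesPerSecond(6000) == 9000000, "three adapters: 9 MB/s");
```

The three-adapter cap of 9 MB/s falls below the 12 MB/s that the quote gives for a 100Mb network, which is why that case is visible even there.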

I haven't verified that the multiple adapter bug still exists in Windows 7 or Vista SP1, but it is something to look for if you are running into problems.

From the comments on Russinovich's post, I found that Vista SP1 introduced some registry settings for adjusting how MMCSS affects Windows, specifically the NetworkThrottlingIndex value.

The solution to my issue was to set NetworkThrottlingIndex under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Multimedia\SystemProfile to 0xFFFFFFFF and then reboot. This completely disables the network throttling portion of MMCSS. I had first tried simply raising the value to 70, but the errors did not stop until I disabled throttling completely.
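For anyone who wants to confirm the setting from code after the reboot, something along these lines should work. This is only a sketch, not what my application does: `isThrottlingDisabled` and `readNetworkThrottlingIndex` are hypothetical helper names, the Windows part relies on `RegGetValueW` (available on Vista and later), and it only reads the value rather than writing it.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// The value used above to switch network throttling off entirely.
constexpr std::uint32_t kThrottlingDisabled = 0xFFFFFFFFu;

// Hypothetical helper: true only for the "disabled" sentinel value.
inline bool isThrottlingDisabled(std::uint32_t index) {
    return index == kThrottlingDisabled;
}

#ifdef _WIN32
// Hypothetical helper: reads NetworkThrottlingIndex into 'out'.
// Returns false if the value is missing, is not a DWORD, or cannot be read.
inline bool readNetworkThrottlingIndex(std::uint32_t& out) {
    DWORD data = 0;
    DWORD size = sizeof(data);
    const LSTATUS status = RegGetValueW(
        HKEY_LOCAL_MACHINE,
        L"SOFTWARE\\Microsoft\\Windows NT\\CurrentVersion\\Multimedia\\SystemProfile",
        L"NetworkThrottlingIndex",
        RRF_RT_REG_DWORD,  // accept only a REG_DWORD value
        nullptr,
        &data,
        &size);
    if (status != ERROR_SUCCESS) {
        return false;
    }
    out = static_cast<std::uint32_t>(data);
    return true;
}
#endif
```

On Windows, link against Advapi32.lib. A caller would read the value with `readNetworkThrottlingIndex` and pass it to `isThrottlingDisabled`; anything other than 0xFFFFFFFF means throttling is still active in some form.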

Thus far I have not seen any adverse effects on other multimedia applications (nor the video capture and audio output portions of my own application) from this change. I will report back here if that changes.

bsruth
1

It is known that Microsoft built some weird anti-feature into the Windows Vista kernel that will degrade I/O performance preventatively to make sure that multimedia applications (Windows Media Player, DirectX) get 100% responsiveness. I don't know if that also means packet loss with UDP. Read this lame justification for the method: http://blogs.technet.com/b/markrussinovich/archive/2007/08/27/1833290.aspx

One of the comments there summarizes this quite well: "Seems to me Microsoft tried to 'fix' something that wasn't broken."

ypnos
  • Actually it was quite broken before the change: for certain network adapters, it was impossible to watch a hi-def video over the network without having the video glitch. For Vista SP1, the fix was modified to remove the impact to network performance. – Larry Osterman Jan 12 '11 at 01:52
  • @Larry Osterman - So is what I'm doing the proper way to solve this problem? Is the network throttling still necessary given current hardware? – bsruth Jan 12 '11 at 14:30
  • @bsruth: I'm not sure. I was just commenting on ypnos's comment that something wasn't broken. A lot of this depends on your network card: some network cards (especially gigabit cards) parse the protocol elements in the card before passing them to the OS. Others rely on the OS to parse the entire protocol. The network stack tries very hard not to consume too much wall time while processing incoming packets; if it sees that it's taking up too much time, it may drop them on the floor. One thing you didn't mention was how large the UDP packets are. Do they require more than one TSDU? – Larry Osterman Jan 13 '11 at 03:28
  • @Larry Osterman - If you feel it more appropriate to discuss this over email, please let me know. From the person who designed the hardware: "We normally send standard UDP packets. For large systems (channel count up to 512), we must enable jumbo packets for the network card. I believe, the definition of a single TSDU still includes a jumbo packet if that’s how the application defines it. So, we are only using one TSDU." Hope that helps. – bsruth Jan 13 '11 at 17:55