
Let's say my Windows Server 2012 R2 machine has 8 logical cores. Using thread/process affinity, the process priority class, and thread priority, I can set 7 application threads to run on cores 1-7 and set their priority to real-time/time-critical so they preempt all OS threads and run uninterrupted on those cores. The result should be that the OS can only schedule threads on core 0, and does so without any application threads getting in the way.

If my understanding of affinity and priority is correct and this scenario is possible, would that be a problem for the OS? Would any system behavior be affected? Is one core enough for the OS?

The reason for doing this would be to eliminate context switches and ensure the environment always has the same 7 worker threads running in parallel without interruption and without cache conflicts.

Michael220
  • This would be a very bad idea. All you're doing is taking choices away from the scheduler. That might make sense if you know something unusual about your application that the scheduler doesn't. But that doesn't seem to be the case here. For just one example of how this could suck, if those 8 logical cores are implemented on 4 physical cores, you could wind up forcing the only two threads that have work to do to be stuck on a single physical core. – David Schwartz Sep 03 '15 at 20:25
  • Have you actually benchmarked the performance when you do something like that? And compared it to how fast things run when you don't? – Andrew Henle Sep 03 '15 at 20:34
  • @AndrewHenle, I've only set up a test program to ensure I can limit the OS to one core. I don't know enough about OS internals to know whether or not I can run my server like this safely over the long run, so I haven't actually done any testing. My app is a server that runs 24/7 and needs to handle a high and constant load. All threads will always be working. – Michael220 Sep 03 '15 at 20:38
  • @DavidSchwartz, good point. I would then change the scenario to 8 physical cores, or whatever the machine has. My server runs alone, there are no other programs running except what the OS brings with it. – Michael220 Sep 03 '15 at 20:40
  • @Michael220 - Then do some benchmarking and see if you can beat the OS scheduler. – Andrew Henle Sep 03 '15 at 20:41
  • @AndrewHenle, my question is about the stability of the OS in this scenario, not necessarily my app's performance. I want to make sure I don't starve OS services of any necessary cpu time. My assumption is that one core is enough for all the work the OS does behind the scenes, especially if no application threads are running on that core. – Michael220 Sep 03 '15 at 20:47
  • 1
    I believe that the OS should in principle be stable, because kernel interrupt routines, APCs, etc., will still run - they always run in the context of whichever user-mode thread happens to be live, it doesn't matter whether that's yours or someone else's. As far as I know there is no way to lock system threads to a particular core, so that shouldn't be a problem. (Important system threads should be higher priority than you anyway.) And Windows still runs perfectly well on single-core machines, so that aspect isn't an issue. – Harry Johnston Sep 03 '15 at 21:49
  • 1
    However, you might conceivably run into trouble with buggy third-party device drivers. I can't offhand think of any sort of bug you could program into a driver that would trigger only in this scenario, but that doesn't mean there isn't one. Programmers are so very ingenious when it comes to new and unexpected bugs. :-) – Harry Johnston Sep 03 '15 at 21:50
  • What are the consequences of a context switch or a cache miss in your application? – Jason Sep 03 '15 at 22:40
  • If these are the only threads, why do you think the OS won't do this for you? Why would the OS move your threads around if they are the only busy ones? The OS is not stupid. What you are proposing to do is likely to be worse. – David Heffernan Sep 03 '15 at 22:46
  • @DavidHeffernan Realtime priority should be a sufficient hint to the scheduler, but a cache miss may still incur sub-optimal NUMA costs or sub-optimal inter-core communication. etc... Measuring is the *only* way to be certain. – Jason Sep 03 '15 at 23:35
  • @Jason, merely latency, making this an optimization play. – Michael220 Sep 03 '15 at 23:36
  • @DavidHeffernan, good point, but if that will happen anyway then forcing it should have no effect. This would ensure trivial OS background tasks don't interfere with the app code at the random times when two OS threads were scheduled concurrently. – Michael220 Sep 03 '15 at 23:40
  • David Schwartz and David Heffernan are right. You'll most likely make latency for your application worse (and pessimize overall throughput). I'm not familiar with the windows scheduler implementation though, so I would recommend testing and measuring the specific things you need to optimize. I think `xperf` or `etw` are supposed to be good resources for windows. Here's [another](http://stackoverflow.com/questions/25933912/should-i-bind-spinning-thread-to-the-certain-core) question that might be worth reading. – Jason Sep 03 '15 at 23:48
  • @Jason, thanks for the link to that question, I couldn't find much on my specific question when I initially looked. Surt's answer still seems to suggest there are limited use cases where locking to a core could benefit latency. I guess I'll just need to test and tweak for my specific app. – Michael220 Sep 04 '15 at 17:14

1 Answer


The whole point of schedulers in OSes - which is a highly active field of research - is to create the illusion for each thread/process that it gets all of the CPU's time. As @David Schwartz pointed out, you're denying the scheduler the ability to do that.

So yes, it would likely be problematic for the OS and system behavior - the system might be unable to respond to interrupts in a timely fashion, and certain kernel-related tasks - such as writing to disk - would be delayed. This could in turn lead to potential loss of data (in the event that an application/the system crashes).

tonysdg
  • ya, disk activity is what mainly comes to mind for me. I can say that my app will be the only user program running and that I will be doing my own file buffering (FILE_FLAG_NO_BUFFERING). – Michael220 Sep 03 '15 at 20:50
  • So is one core not enough for the OS to handle interrupts and other kernel-related tasks? – Michael220 Sep 04 '15 at 17:15
  • @Michael220 In theory? If the OS can run on a single-core uniprocessor, then yes, it should be. In practice? I'd guess that most OSes today - Windows included - would be sluggish at best, unresponsive on average, and irrecoverable at worst. But the only advice I can offer for sure is the advice a professor once offered me - DTFE haha :) – tonysdg Sep 04 '15 at 18:15
  • To get better responsiveness, run all connections to the server in threads drawn from a thread pool and let the scheduler worry about which CPU to use and how to handle context switches. That way the server can handle thousands of connections, whereas the proposed method can only handle 7. – user3629249 Sep 04 '15 at 19:33