
I am writing a program that uses the Sleep function from Windows.h, and I am seeing a frustrating difference between running it on Windows 7 and on Windows 10. I reduced my program to the one below, which exhibits the same behavior as my more complicated program.

On Windows 7, this loop of 5000 calls to Sleep(1) takes about 5 seconds to complete.

On Windows 10, the exact same program (the exact same binary executable file) takes almost a minute to complete.

For my application this is completely unacceptable as I need to have the 1ms timing delay in order to interact with hardware I am using.

I also tried a suggestion from another post to use the select() function (via winsock2), but that did not produce a 1ms delay either. I have tried this program on multiple Windows 7 and Windows 10 PCs, and the issue always tracks the operating system: the program consistently finishes in ~5 seconds on the Windows 7 PCs and takes much longer, ~60 seconds, on the Windows 10 PCs.

I have compiled the program with both Microsoft Visual Studio Express 2010 (C/C++) and Microsoft Visual Studio Express 2017 (C/C++). The version of Visual Studio does not influence the results.

I have also changed the build configuration from 'Debug' to 'Release' and enabled compiler optimizations, but this does not help either.

Any suggestions would be greatly appreciated.

#include <stdio.h>
#include <Windows.h>

#define LOOP_COUNT      5000

int main()
{
    int i = 0;

    for (i = 0; i < LOOP_COUNT; i++){
        Sleep(1);
    }

    return 0;
}
  • Does this answer your question? [WinAPI Sleep() function call sleeps for longer than expected](https://stackoverflow.com/questions/9518106/winapi-sleep-function-call-sleeps-for-longer-than-expected) – Retired Ninja Jan 07 '22 at 19:31
  • Windows is not a real-time OS, and `Sleep()` does not have 1ms precision. For that kind of precision, use a high-performance counter (see the `QueryPerformance(Counter|Frequency)` functions), or even just a spin counter. – Remy Lebeau Jan 07 '22 at 19:31
  • There will also be a marked difference in runtimes between running a release executable as opposed to a debug executable. On my system, it's typically a 10x difference. – ryyker Jan 07 '22 at 19:32
  • _[Any suggestions...](https://stackoverflow.com/questions/14812233/sleeping-for-milliseconds-on-windows-linux-solaris-hp-ux-ibm-aix-vxworks-w)_ – ryyker Jan 07 '22 at 19:37
  • Thank you for the suggestions, I will try and see if the hardware I am using has some type of buffer I can use to grab the data at a slower rate. – Zachariah Rabatah Jan 07 '22 at 19:44
  • `release executable as opposed to a debug executable` It does not matter what they are called. The optimizations matter (and you can set a higher optimization level on a debug build than on a release build). – 0___________ Jan 07 '22 at 20:09
  • On my Windows 7 machine it took 50 seconds to run. I suggest looking at the *running* time tick count returned by `clock()` instead of trying to time an *interval*. – Weather Vane Jan 07 '22 at 20:16
  • @0___________ yes I have built it as a release executable instead of debug but this does not help much. – Zachariah Rabatah Jan 07 '22 at 20:33
  • @ZachariahRabatah: I have to wonder if you even read [the documentation](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-sleep) because it specifically says "To increase the accuracy of the sleep interval, call the `timeGetDevCaps` function to determine the supported minimum timer resolution and the `timeBeginPeriod` function to set the timer resolution to its minimum. Use caution when calling `timeBeginPeriod`, as frequent calls can significantly affect the system clock, system power usage, and the scheduler..." – Ben Voigt Jan 07 '22 at 22:01
  • Any reason you ain't using https://en.cppreference.com/w/cpp/thread/sleep_for ? – JVApen Jan 08 '22 at 10:33

2 Answers


> I need to have the 1ms timing delay in order to interact with hardware I am using

Windows is the wrong tool for this job.

If you insist on using this wrong tool, you are going to have to make compromises (such as using a busy-wait and accepting the corresponding poor battery life).

You can make Sleep() more accurate using timeBeginPeriod(1) but depending on your hardware peripheral's limits on the "one millisecond" delay -- is that a minimum, maximum, or the middle of some range? -- it still will fail to meet your timing requirement with some non-zero probability.

The timeBeginPeriod function requests a minimum resolution for periodic timers.
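As a rough illustration of that suggestion, here is a minimal sketch that raises the timer resolution around a loop of Sleep(1) calls and restores it afterwards. The names `clamp_period` and `sleep_loop_with_resolution` are hypothetical and not from the question. The Windows-only part is guarded by `_WIN32`, and the program must be linked against winmm.lib.

```c
#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>   /* TIMECAPS, timeGetDevCaps, timeBeginPeriod, timeEndPeriod */
#endif

/* Hypothetical helper: clamp the requested period (in ms) to the range
   the timer device reports as supported. */
unsigned int clamp_period(unsigned int requested_ms,
                          unsigned int min_ms, unsigned int max_ms)
{
    if (requested_ms < min_ms) return min_ms;
    if (requested_ms > max_ms) return max_ms;
    return requested_ms;
}

#ifdef _WIN32
/* Hypothetical helper: run `count` calls to Sleep(1) with the timer
   resolution raised. Returns 0 on success, -1 if the resolution could
   not be queried or changed. */
int sleep_loop_with_resolution(int count)
{
    TIMECAPS caps;
    UINT period;
    int i;

    if (timeGetDevCaps(&caps, sizeof(caps)) != MMSYSERR_NOERROR)
        return -1;

    period = clamp_period(1, caps.wPeriodMin, caps.wPeriodMax);
    if (timeBeginPeriod(period) != TIMERR_NOERROR)
        return -1;

    for (i = 0; i < count; i++)
        Sleep(1);

    /* Every timeBeginPeriod call should be paired with a timeEndPeriod
       call that passes the same value. */
    timeEndPeriod(period);
    return 0;
}
#endif
```

Calling `sleep_loop_with_resolution(5000)` corresponds to the loop in the question. It only narrows the scheduling granularity; it does not turn Sleep(1) into a guaranteed 1ms delay.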

The right solution for talking to hardware with tight timing tolerances is an embedded microcontroller which talks to the Windows PC through some very flexible interface such as UART or Ethernet, buffers data, and uses hardware timers to generate signals with very well-defined timing.

In some cases, you might be able to use embedded circuitry already existing within your Windows PC, such as "sound card" functionality.

Ben Voigt
  • Thank you for the feedback. I do understand that Windows is not supposed to be the right tool for the job. However, I have been able to make this work correctly using many different Windows 7 machines. My guess is that there is some default difference between Windows 7 and Windows 10. Maybe this difference on Windows 10 has to do with saving power/battery from being overused, so that random programs do not take up battery life or overload the processor. I will try to use the timeBeginPeriod function and see how that works. – Zachariah Rabatah Jan 07 '22 at 19:55
  • Running with timeBeginPeriod(1); does increase the program speed but still about 2 times slower than expected (10 seconds instead of 5 seconds). I will try and see if the hardware I am interacting with can buffer so I don't have to make a call as quickly as I used to. – Zachariah Rabatah Jan 07 '22 at 19:58
  • Ben's comment is spot-on about Windows not being the best platform for programs that need this granularity of timing. That said, it doesn't account for the performance difference you're seeing between 7 and 10. Have you tried this on more than one system? Something just doesn't seem right here... – mzimmers Jan 07 '22 at 20:43
  • @mzimmers: The OS version-dependent performance difference is due to the following note found in the [documentation page of `timeBeginPeriod`](https://learn.microsoft.com/en-us/windows/win32/api/timeapi/nf-timeapi-timebeginperiod): "Starting with Windows 10, version 2004, this function no longer affects global timer resolution. For processes which call this function, Windows uses the lowest value (that is, highest resolution) requested by any process. For processes which have not called this function, Windows does not guarantee a higher resolution than the default system resolution." – Ben Voigt Jan 07 '22 at 21:50
  • @mzimmers: Apparently, on OP's Windows 7 deployments, **some other process has globally adjusted the timer resolution, and OP's code benefited**. On Windows 10 starting with version 2004, OP's code sees the default quantum of 10ms or 15ms regardless of what other processes are doing. Independence of accidental effects from other processes is generally considered a good thing -- **OP's code can start failing on Windows 7 as well if the set of unrelated background processes changes**. – Ben Voigt Jan 07 '22 at 21:52
  • There could also be a change in the OS to guarantee that the sleep period is at least the requested interval. 10 seconds is the correct amount of time for a sleep function that always rounds up -- no matter how efficient the code between calls to `Sleep(1)`, it takes some time, and sleeping for 0.9999ms is too short even though 0.9999ms is clearly closer to 1.0000 than 1.9999ms is. However, too short behavior IS described in the documentation of `Sleep()`: "If dwMilliseconds is less than the resolution of the system clock, the thread may sleep for less than the specified length of time." – Ben Voigt Jan 07 '22 at 21:56
  • @mzimmers yes this was tested on multiple Windows 7 and multiple Windows 10 machines. The problems always pointed to Windows 10 being the slowdown. It seems that for Windows 11 they are limiting it even more than before by not guaranteeing a higher resolution than the default system resolution... Not sure if this is to save battery life or to reduce power consumption, but that is my guess as to why they would limit programs. – Zachariah Rabatah Jan 11 '22 at 20:25
  • @BenVoigt I did find out a unique solution to this issue, and while it isn't guaranteed to work perfectly it allows for ~1ms resolution. Since I am polling hardware at that rate to get data my method is more reliable than having the 10ms+ delay that I see with the normal Sleep(1) option. Please see my answer once it is posted for what I was able to figure out to get it working for my situation. – Zachariah Rabatah Jan 11 '22 at 20:30
  • @BenVoigt please see my answer, it should be posted. – Zachariah Rabatah Jan 11 '22 at 21:13
  • Quite a biased against Windows answer and in any case, the OP works on Windows. What would be the "correct" tool for this job? Plus, Windows already has functions better than time* for accuracy and ASIO for low latency sound engineering (Since you mentioned sound card without actually suggesting a solution). – Michael Chourdakis Aug 16 '22 at 07:38
  • @MichaelChourdakis: My answer (which is not "biased against Windows", everything I said is a fact backed up by knowledge and experience) already described the correct tool for the job. You clearly found the mention of sound card circuitry in the final paragraph; try also reading the paragraph before. The question only said "interact with hardware", so of course it's impossible to say whether he actually can use the sound card, and if so, whether to transmit an audio waveform or record one. – Ben Voigt Aug 16 '22 at 15:04
  • Even with timeBeginPeriod(1) you can expect ~500Hz behavior since you do a sleep on a scheduler timeslice, then do some work in the next timeslice. Then you sleep again which will wake up on the next timeslice. So 2 timeslices per piece of work. On Windows 11 indeed, they're downsizing the resolution when your window is not visible (say a fullscreen window is on top). GetProcessInformation() seems related, looking into that. – Ruud van Gaal Feb 03 '23 at 10:19
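The run times reported in this thread are roughly what a simple tick model predicts. The sketch below assumes, as a simplification rather than documented Windows behavior, that a sleeping thread is only woken on a timer tick and that each Sleep call is issued just after a tick, so it lasts floor(requested / tick) + 1 ticks. `modeled_loop_seconds` is a hypothetical helper, and the tick periods used in the note after it are assumed values, not measurements.

```c
/* Hypothetical helper: models how long a loop of Sleep(requested) calls
   takes when wakeups happen only on timer ticks.
   All times are given in microseconds; the result is in seconds. */
double modeled_loop_seconds(long iterations, long requested_us, long tick_us)
{
    /* A call issued just after a tick has to wait past the requested
       time, so it ends on the first tick boundary beyond it. */
    long ticks_per_sleep = requested_us / tick_us + 1;

    return (double)iterations * (double)ticks_per_sleep * (double)tick_us
           / 1000000.0;
}
```

For 5000 iterations of a 1ms sleep, this model gives 10 seconds with a 1ms tick, 50 seconds with a 10ms tick, and 78.125 seconds with a 15.625ms tick. The first two match the 10 seconds and 50 seconds reported in the comments, and the ~60 seconds seen on Windows 10 falls between the last two.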

@BenVoigt & @mzimmers thank you for your responses and suggestions. I did find a unique solution to this question and the solution was inspired by the post I have linked directly below.

Units of QueryPerformanceFrequency

In this post BrianP007 writes a function to measure how long a Sleep(1000) call takes. While playing around with it, I realized that Sleep() accepts 0. I therefore used a structure similar to the linked post to find how many Sleep(0) calls it takes to reach a delta t of 1ms.

For my purposes I increased i by 100, however it can be increased by 10 or by 1 in order to get a more accurate estimate as to what i should be.

Once you have a value for i, you can use it to get an approximate 1ms delay on your machine. Running this function in a loop (I ran it 100 times), I got anywhere from i = 3000 to i = 6000, though my machine averages around 5500. This spread is probably due to jitter and changes in the processor's clock frequency over time.

The processor_check() function below only determines the loop count; the actual 'timer' is just a for loop that calls Sleep(0) that many times, which gives ~1ms resolution on the machine.

While this method is not perfect, it gets much closer to 1ms than Sleep(1) does. I have to test this more thoroughly, but please let me know if this works for you as well. Please feel free to use the code below if you need it for your own applications. It should be possible to copy and paste this code into an empty console C project in Visual Studio without modification.

/*ZKR Sleep_ZR()*/

#include <stdio.h>
#include <windows.h>

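/*Gets for loop value: the number of Sleep(0) calls that take at least ~1ms*/
int processor_check()
{
    double delta_time = 0;
    int i = 0;
    int n = 0;
    LARGE_INTEGER sklick, eklick, cpu_khz;

    /*Despite its name, cpu_khz holds counter ticks per second; it is fixed at system boot*/
    QueryPerformanceFrequency(&cpu_khz);

    while(delta_time < 0.001){
        i = i + 100;    /*Step size; use 10 or 1 for a finer estimate*/

        QueryPerformanceCounter(&sklick);
        for(n = 0; n < i; n++){
            Sleep(0);
        }
        QueryPerformanceCounter(&eklick);
        delta_time = (eklick.QuadPart-sklick.QuadPart) / (double)cpu_khz.QuadPart;
    }
    return i;    /*The count that was actually measured*/
}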

/*Timer*/    
void Sleep_ZR(int cnt)
{
    int i;
    for(i = 0; i < cnt; i++){
        Sleep(0);
    }
}


/*Main*/
int main(int argc, char** argv)
{
    double average = 0;
    int i = 0;
    
    /*Single use*/
    int loop_count = processor_check();
    Sleep_ZR(loop_count);
    
    /*Average based on processor to get more accurate Sleep_ZR*/
    for(i = 0; i < 100; i++){
        loop_count = processor_check();
        average = average + loop_count;
    }
    average = average / 100;
    printf("Average: %f\n", average);

    /*10 second test*/
    for (i = 0; i < 10000; i++){
        Sleep_ZR((int)average);
    }

    return 0;
}
  • This is going to be extremely sensitive to other tasks running on the system. – Ben Voigt Jan 11 '22 at 21:18
  • You'd be better off writing a busy-wait loop until the desired time has elapsed (as measured by `QueryPerformanceCounter`) and then throw `Sleep(0)` into the body of that busy loop. It would have the same observable behavior (calling `Sleep(0)` "enough" times) but be far more robust to variations in load. – Ben Voigt Jan 11 '22 at 21:22
  • Windows OS does a lot of things in the background even if there are no other user activities like running a web browser. For example, downloading Windows Updates or pinging Microsoft servers to detect changes in the network. This is one of the reasons that Windows is a poor choice for this. – Ben Voigt Jan 11 '22 at 21:26
  • @BenVoigt Yes I agree, however for work I am required to use Windows for this specific application. The hardware I am using should really have a buffer inside of it so that I don't have to access it each millisecond, but this hardware is unique and does not have that functionality available. All the other hardware that I use has the capability to store buffered data, so that I don't have to run so quickly to access the data. – Zachariah Rabatah Jan 11 '22 at 21:32
  • So use a busy-wait loop, with a termination condition based on `QueryPerformanceCounter`. Just don't expect the number of loop iterations to stay constant -- it will vary based on CPU frequency scaling and also on background tasks that Windows decides to wake up at that moment. Take your equation for `delta_time`, solve it for `eklick`, and use that as the loop termination condition. – Ben Voigt Jan 11 '22 at 21:35
  • And thank you for linking to that QueryPerformanceFrequency question, I wouldn't otherwise have found and down-voted the horribly wrong answer BrianP007 wrote. – Ben Voigt Jan 11 '22 at 21:40
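The busy-wait suggested in the comments above could look roughly like the sketch below. Solving delta_time = (eklick - sklick) / frequency for eklick gives eklick = sklick + delta_time * frequency, which becomes the loop's termination condition. The names `deadline_ticks` and `wait_until_elapsed` are hypothetical, and the Windows-only part is guarded by `_WIN32`. This is one possible reading of the suggestion, not a tested replacement for the code in the answer.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: the counter value at which `microseconds` have
   elapsed since `start_ticks`, for a counter that advances
   `ticks_per_second` times per second. */
long long deadline_ticks(long long start_ticks, long long ticks_per_second,
                         long long microseconds)
{
    return start_ticks + (ticks_per_second * microseconds) / 1000000LL;
}

#ifdef _WIN32
/* Hypothetical helper: wait roughly `microseconds` by polling the
   performance counter instead of counting loop iterations. */
void wait_until_elapsed(long long microseconds)
{
    LARGE_INTEGER freq, now;
    long long deadline;

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&now);
    deadline = deadline_ticks(now.QuadPart, freq.QuadPart, microseconds);

    do {
        Sleep(0);    /* yield the rest of the time slice if another thread is ready */
        QueryPerformanceCounter(&now);
    } while (now.QuadPart < deadline);
}
#endif
```

`wait_until_elapsed(1000)` would stand in for `Sleep_ZR(loop_count)`. The number of Sleep(0) calls is no longer fixed, so the delay adapts to CPU frequency changes and background load, although the thread can still be preempted for longer than the requested time.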