13

I'm trying to measure the execution time of some bits of code as accurately as possible on a number of threads, taking context switching and thread downtime into account. The application is implemented in C# (VS 2008). Example:

public void ThreadFunc ()
{
    // Some code here

    // Critical block #1 begins here
    long lTimestamp1 = Stopwatch.GetTimestamp ();

    CallComplex3rdPartyFunc (); // A

    long lTimestamp2 = Stopwatch.GetTimestamp ();
    // Critical block #1 ends here

    // Some code here

    // Critical block #2 begins here
    long lTimestamp3 = Stopwatch.GetTimestamp ();

    CallOtherComplex3rdPartyFunc (); // B

    long lTimestamp4 = Stopwatch.GetTimestamp ();
    // Critical block #2 ends here

    // Save timestamps for future analysis.
}

public int Main ( string[] sArgs )
{
    // Some code here

    int nCount = SomeFunc ();

    for ( int i = 0; i < nCount; i++ )
    {
        Thread oThread = new Thread ( ThreadFunc );
        oThread.Start ();
    }

    // Some code here

    return ( 0 );
}

I'd like to measure the execution time of the above two critical code blocks as accurately as possible. The two calls marked as A and B are potentially long function calls that may sometimes take several seconds to execute but in some cases they may complete in a few milliseconds.

I'm running the above code on a number of threads - somewhere between 1 to 200 threads, depending on user input. The computers running this code have 2-16 cores - users use lower thread counts on the weaker machines.

The problem is that A and B are both potenitally long functions so it's very likely that at least one context switch will happen during their execution - possibly more than one. So the code gets lTimestamp1, then another thread starts executing (and the current thread waits). Eventually the current thread gets back control and retrieves lTimestamp2.

This means that the duration between lTimestamp1 and lTimestamp2 includes time when the thread was not actually running - it was waiting to be scheduled again while other threads executed. The tick count, however, increases anyway, so the duration is now really

Code block time = A + B + some time spent in other threads

while I want it to be only

Code block time = A + B

This is especially an issue with a larger number of threads, since they'll all get a chance to run, so the above timings will be higher while all other threads run before the thread in question gets another chance to run.

So my question is: is it possible to somehow calculate the time when the thread is not running and then adjust the above timings accordingly? I'd like to eliminate (subtract) that 3rd term entirely or at least as much of it as possible. The code runs millions of times, so final timings are calculated from a lot of samples and then averaged out.

I'm not looking for profiler products, etc. - the application needs to time these the marked parts as accurately as possible. The functions A and B are 3rd party functions, I cannot change them in any way. I'm also aware of the possible fluctuations when measuring time with nanosecond precision and possible overhead inside those 3rd-party functions, but I still need to do this measurement.

Any advice would be greatly appreciated - C++ or x86 assembly code would work as well.

Edit: seems to be impossible to implement this. Scott's idea below (using GetThreadTimes) is good but unfortunately GetThreadTimes() is a flawed API and it almost never returns correct data. Thanks for all the replies!

xxbbcc
  • 16,930
  • 5
  • 50
  • 83
  • you can't do that except modifying the .NET runtime and collecting the timing there (even then it would be inaccurate!) – Yahia Sep 29 '11 at 14:31
  • 5
    Related: http://stackoverflow.com/questions/3853993/how-much-time-spent-in-a-net-thread – H H Sep 29 '11 at 14:37
  • 3
    I don't know if this is a comment or the solution, but why did you not just profile the code and measure it like it should, I mean alone?. There will be always threading switch (even with the OS!).. – gbianchi Sep 29 '11 at 14:38
  • and you could create proxies (same interface) for the 3rd-party-Functions and measure the time at entry and exit from those functions... depending on the situation - for example if you don't have access to source of the calling application you could do this by hooking or similar – Yahia Sep 29 '11 at 14:40
  • 1
    I don't think there is a way to do this without a profiler. Its also worth mentioning that for cpu bound tasks creating 200 threads on a 16 core machine is counter productive. – Yaur Sep 29 '11 at 14:46
  • @Yahia he is wanting to only measure time when he is executing, i.e. not time when his thread is waiting to be scheduled. – Yaur Sep 29 '11 at 14:48
  • yes - and that is not really possible, what I describe are ways to approximate some timing... – Yahia Sep 29 '11 at 14:55
  • 1
    The thread will only spend time waiting to be scheduled if there are other threads that can be scheduled. Why not just downgrade the priority of every other thread in the system so low that it never runs? Then you are only measuring the time spent in your thread, because it is the only thread in the system actually running. – Eric Lippert Sep 29 '11 at 15:00
  • gbianchi: the application is multithreaded; there's no choice of doing the measurement only on one thread. Also, you didn't seem to read my question completely: I cannot use a profiler because these timestamps must be collected during run-time, all the time for later analysis. I'm not trying to do a one-time measurement. Yahia: I can't use proxies. If I only wanted to measure the time between the two timestamps, the current code already does that. As Yaur said it, I need to subtract the time when the thread is down. Eric: I can't do that; the system will have other threads that need to run. – xxbbcc Sep 29 '11 at 16:08

2 Answers2

12

This can be done with the Native API call GetThreadTimes. Here is a article on CodeProject that uses it.

A second option is use QueryThreadCycleTime. This will not give you the time, but it will give you the number of cycles the current thread has been executing.

Be aware you can't just directly convert cycles->seconds due to the fact that many processors (especially mobile processors) do not run at a fixed speed so there is no constant number you could multiply by to get the elapsed time in seconds. But if you are using a processor that does not vary its speed it then would be a simple math problem to get wall clock time from the cycles.

Scott Chamberlain
  • 124,994
  • 33
  • 282
  • 431
  • He should use that in conjunction with [Thread.BeginThreadAffinity](http://msdn.microsoft.com/en-us/library/system.threading.thread.beginthreadaffinity.aspx) to ensure the code stays on the same physical thread during execution. – user7116 Sep 29 '11 at 15:59
  • Scott and sixlettervariables - thank you, this looks promising. I'll have to read the entire CodeProject article but it seems this is a good lead. – xxbbcc Sep 29 '11 at 16:07
  • 1
    I'm sorry, I have to take away the "Correct answer" mark, even though your answer is the best and closest. Unfortunately, GetThreadTimes() is a useless API - it has a resolution of 15ms and it frequently reports 0 as kernel time and/or user time when threads yield or go into wait states. Looks like the kind of measurement I'm trying to do is impossible, at least on Windows. – xxbbcc Sep 29 '11 at 21:31
  • @xxbbcc I know this is a older question and you likely don't need this any more but take a look at the update I posted. – Scott Chamberlain Jun 14 '13 at 15:15
  • @ScottChamberlain Thank you for the update - I was aware of variable-speed CPUs (this is an issue in virtual machines as well). I was looking for a way to gather frequency information along with timings but even if the information is available (rarely) it's mostly unreliable. I gave up trying to do this as most hardware simply don't report useful performance numbers. – xxbbcc Jun 14 '13 at 15:38
2

You can use Stopwatch.Start() and Stopwatch.Stop() methods to pause/continue time measurement, it does not reset Elapsed/ElapsedMilliseconds value so perhaps you can leverage this.

Regarding thread context switches - I believe there are no ways to handle it in managed code so this is not possible to exclude time when thread was suspended

EDIT:

An interesting article with benchmarks: How long does it take to make a context switch?

sll
  • 61,540
  • 22
  • 104
  • 156
  • 1
    @Yaur : I understand the key point of the question, I believe we do not have technical instruments to handle context switches – sll Sep 29 '11 at 14:44
  • sll: unfortunately I can't do that with just Stopwatch, since I don't know when the OS stops the current thread to switch over to another. So my problem is exactly that the underlying CPU ticks still elapse even when my current thread is not running, but is waiting to be scheduled again. This is obviously not a problem for just the "current" thread of my application - it applies to all threads I create since all those threads try to do the same measurement. – xxbbcc Sep 29 '11 at 16:15