I am writing a stress test that will issue many calls to a remote server. I want to collect the following statistics after the test:
- Latency (in milliseconds) of the remote call.
- Number of operations per second that the remote server can handle.
I can successfully get (2), but I am having problems with (1). My current implementation is very similar to the one shown in this other SO question, and I have the same problem described there: the latency reported using System.currentTimeMillis() is longer than expected when the test is run with multiple threads. I analyzed the problem and I am positive it comes from thread interleaving (see my answer to the question linked above for details), and that System.currentTimeMillis() is not the way to solve it.
It seems that I should be able to do it using java.lang.management, which has some interesting methods like:
- ThreadMXBean.getCurrentThreadCpuTime()
- ThreadMXBean.getCurrentThreadUserTime()
- ThreadInfo.getWaitedTime()
- ThreadInfo.getBlockedTime()
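For reference, here is how I understand these values are obtained (a minimal sketch based on my reading of the java.lang.management API; note that getWaitedTime() and getBlockedTime() apparently return -1 unless contention monitoring is enabled):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class ThreadTimesProbe {
        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();

            // CPU time measurement may be disabled by default on some JVMs.
            if (bean.isThreadCpuTimeSupported() && !bean.isThreadCpuTimeEnabled()) {
                bean.setThreadCpuTimeEnabled(true);
            }
            // Waited/blocked times are only tracked while contention
            // monitoring is on; otherwise they are reported as -1.
            if (bean.isThreadContentionMonitoringSupported()) {
                bean.setThreadContentionMonitoringEnabled(true);
            }

            long cpuNanos = bean.getCurrentThreadCpuTime();   // total CPU time, ns
            long userNanos = bean.getCurrentThreadUserTime(); // user-mode CPU time, ns

            ThreadInfo info = bean.getThreadInfo(Thread.currentThread().getId());
            long waitedMs = info.getWaitedTime();   // time spent WAITING, ms
            long blockedMs = info.getBlockedTime(); // time spent BLOCKED on monitors, ms

            System.out.printf("cpu=%dns user=%dns waited=%dms blocked=%dms%n",
                    cpuNanos, userNanos, waitedMs, blockedMs);
        }
    }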
My problem is that even though I have read the API, it is still unclear to me which of these methods will give me what I want. In the context of the other SO question that I linked, this is what I need:
    long start_time = rightMethodToCall();
    result = restTemplate.getForObject("Some URL", String.class);
    long difference = (rightMethodToCall() - start_time);
So that the difference gives me a very good approximation of the time that the remote call took, even in a multi-threaded environment.
Restriction: I'd like to avoid protecting that block of code with a synchronized block, because my program has other threads that I would like to allow to continue executing.
EDIT: Providing more info:
The issue is this: I want to time the remote call, and just the remote call. If I use System.currentTimeMillis() or System.nanoTime(), AND if I have more threads than cores, then it is possible that I could have this thread interleaving:
1. Thread1: long start_time ...
2. Thread1: result = ...
3. Thread2: long start_time ...
4. Thread2: result = ...
5. Thread2: long difference ...
6. Thread1: long difference ...
If that happens, then the difference calculated by Thread2 is correct, but the one calculated by Thread1 is incorrect (it would be greater than it should be). In other words, for the measurement of the difference in Thread1, I would like to exclude the time of steps 4 and 5. Is this the time that the thread was WAITING?
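If those counters do capture that interleaving time, then what I have in mind per call is something like the sketch below (this is only my unverified idea, reusing the restTemplate call from above; I don't know whether getWaitedTime()/getBlockedTime() actually account for the time a thread spends off the CPU, which is exactly what I am asking):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;
    import org.springframework.web.client.RestTemplate;

    public class TimedCall {
        private static final ThreadMXBean BEAN = ManagementFactory.getThreadMXBean();
        static {
            // Required for getWaitedTime()/getBlockedTime() to return real values.
            if (BEAN.isThreadContentionMonitoringSupported()) {
                BEAN.setThreadContentionMonitoringEnabled(true);
            }
        }

        private final RestTemplate restTemplate = new RestTemplate();

        String timedGet(String url) {
            long tid = Thread.currentThread().getId();
            long wallStart = System.nanoTime();
            long waitedStart = BEAN.getThreadInfo(tid).getWaitedTime();   // ms
            long blockedStart = BEAN.getThreadInfo(tid).getBlockedTime(); // ms

            String result = restTemplate.getForObject(url, String.class);

            long wallMs = (System.nanoTime() - wallStart) / 1_000_000;
            long waitedMs = BEAN.getThreadInfo(tid).getWaitedTime() - waitedStart;
            long blockedMs = BEAN.getThreadInfo(tid).getBlockedTime() - blockedStart;

            // My (unverified) idea: wall time minus the time this thread
            // spent WAITING or BLOCKED during the call.
            long difference = wallMs - waitedMs - blockedMs;
            System.out.printf("latency=%dms (waited=%dms, blocked=%dms)%n",
                    difference, waitedMs, blockedMs);
            return result;
        }
    }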
Summarizing the question in a different way in case it helps other people understand it better (this is how @jason-c put it in his comment):
[I am] attempting to time the remote call, but running the test with multiple threads just to increase testing volume.