
Consider this code:

#include <iostream>
#include <vector>
#include <functional>
#include <map>
#include <atomic>
#include <memory>
#include <chrono>
#include <thread>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/asio/high_resolution_timer.hpp>

static const uint32_t FREQUENCY = 5000; // Hz
static const uint32_t MKSEC_IN_SEC = 1000000;

std::chrono::microseconds timeout(MKSEC_IN_SEC / FREQUENCY);
boost::asio::io_service ioservice;
boost::asio::high_resolution_timer timer(ioservice);

static std::chrono::high_resolution_clock::time_point lastCallTime = std::chrono::high_resolution_clock::now();
static uint64_t deviationSum = 0;
static uint64_t deviationMin = 100000000;
static uint64_t deviationMax = 0;
static uint32_t counter = 0;

void timerCallback(const boost::system::error_code &err) {
  auto actualTimeout = std::chrono::high_resolution_clock::now() - lastCallTime;
  std::chrono::microseconds actualTimeoutMkSec = std::chrono::duration_cast<std::chrono::microseconds>(actualTimeout);
  long timeoutDeviation = actualTimeoutMkSec.count() - timeout.count();
  deviationSum += abs(timeoutDeviation);
  if(abs(timeoutDeviation) > deviationMax) {
    deviationMax = abs(timeoutDeviation);
  }
  // Checked independently, so a sample that sets a new maximum can also set the minimum.
  if(abs(timeoutDeviation) < deviationMin) {
    deviationMin = abs(timeoutDeviation);
  }

  ++counter;
  //std::cout << "Actual timeout: " << actualTimeoutMkSec.count() << "\t\tDeviation: " << timeoutDeviation << "\t\tCounter: " << counter << std::endl;

  timer.expires_from_now(timeout);
  timer.async_wait(timerCallback);
  lastCallTime = std::chrono::high_resolution_clock::now();
}

using namespace std::chrono_literals;

int main() {
  std::cout << "Frequency: " << FREQUENCY << " Hz" << std::endl;
  std::cout << "Callback should be called each: " << timeout.count() << " mkSec" << std::endl;
  std::cout << std::endl;

  ioservice.reset();
  timer.expires_from_now(timeout);
  timer.async_wait(timerCallback);
  lastCallTime = std::chrono::high_resolution_clock::now();
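  std::thread thread([&] { ioservice.run(); });
  std::this_thread::sleep_for(1s);
  // Stop the timer loop and join before reading the statistics,
  // so the worker thread is not still updating them.
  ioservice.stop();
  thread.join();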

  std::cout << std::endl << "Messages posted: " << counter << std::endl;
  std::cout << "Frequency deviation: " << FREQUENCY - counter << std::endl;
  std::cout << "Min timeout deviation: " << deviationMin << std::endl;
  std::cout << "Max timeout deviation: " << deviationMax << std::endl;
  std::cout << "Avg timeout deviation: " << deviationSum / counter << std::endl;

  return 0;
}

It runs a timer that calls timerCallback(..) periodically at a specified frequency. In this example the callback must be called 5000 times per second. If you vary the frequency, you can see that the actual (measured) call frequency differs from the desired one; in fact, the higher the frequency, the larger the deviation. I took measurements at several frequencies; here is a summary: https://docs.google.com/spreadsheets/d/1SQtg2slNv-9VPdgS0RD4yKRnyDK1ijKrjVz7BBMSg24/edit?usp=sharing

When the desired frequency is 10000 Hz, the system misses 10% (~1000) of the calls. When the desired frequency is 100000 Hz, the system misses 40% (~40000) of the calls.

Question: Is it possible to achieve better accuracy in a Linux/C++ environment? How? I need it to work without significant deviation at a frequency of 500000 Hz.

P.S. My first idea was that the body of timerCallback(..) itself causes the delay, so I measured it. It consistently takes less than 1 microsecond to execute, so it does not affect the process.

mc.dev
  • Don't use `std::chrono::high_resolution_clock` or `std::chrono::system_clock`; use `std::chrono::steady_clock`. – Galik Oct 07 '17 at 20:51
  • Just tried it. The deviation is even higher than with high_resolution_clock. – mc.dev Oct 07 '17 at 21:01
  • On many implementations `high_resolution_clock` is just a typedef of `steady_clock` (on others, such as libstdc++, it is a typedef of `system_clock`). Check out [this video](https://www.youtube.com/watch?v=P32hvk8b13M&t=1s) for the reasons to prefer steady_clock over high_resolution_clock, as well as other good information on chrono. – Carl Oct 07 '17 at 21:08
  • If you want to be called a particular number of times per second, why would you do this: `timer.expires_from_now(timeout);`? That means that if one call is a microsecond late, you'll set the next call one microsecond late. That's the wrong thing to do if you have a target frequency. – David Schwartz Oct 07 '17 at 21:18
  • @DavidSchwartz, it's absolutely correct. As I mentioned in P.S. I've measured execution time of timerCallback(). It's less than microsecond. I believe it can't cause 40% deviation in frequency. – mc.dev Oct 07 '17 at 21:49
  • @mc.android.developer I suspect that the way you are measuring it is invalid. For example, does it take into account the fact that all the caches will be cold? That you may be waking the CPU from sleep? – David Schwartz Oct 07 '17 at 23:31
  • @DavidSchwartz I don't know much about caches. It would be great if you could tell me more about it or point me to some materials. The way I've been measuring the execution time of the method body is much the same: I save the current time at the very beginning of the method and calculate the difference at the very end. I even tried to use this value to correct the timeout in the next call to timer.expires_from_now(timeout), but it didn't improve the situation. – mc.dev Oct 10 '17 at 00:47
  • @mc.android.developer Most of the costs are incurred before the beginning and after the end, before your code starts running and after it finishes. The CPU registers have to be restored, the page tables set correctly, instructions have to be fetched from RAM, and so on. – David Schwartz Oct 10 '17 at 01:07
  • Hm. I can see cold caches influencing the first call to timerCallback(). In fact the first call timeout is usually noticeably longer. But it should not influence subsequent calls happening frequently, because all the caches are set by that time, aren't they? – mc.dev Oct 10 '17 at 18:36
  • What do you mean by _without significant deviation with frequency of 500000Hz_? Do you mean a deviation of 1 usec, for example, or can you deviate by a full second? What is significant for you? Do you need the callback calls equally spaced, or do they have to match the corresponding usec tick offset? Do you have an idea of what you are asking for? – Luis Colorado Oct 11 '17 at 13:57
  • By the way, is your clock properly synchronized with a source that can provide such good timing? – Luis Colorado Oct 11 '17 at 13:58
  • @LuisColorado 1. That means that timerCallback() needs to be called exactly 500000 times within a time period of 1 sec with equal timeouts between calls; 2. Significant deviation is more than 10%; 3. As equally spaced as possible; 4. I think I do; 5. I haven't done any specific synchronization. It is what it is in Ubuntu Linux on an x86 machine by default. – mc.dev Oct 17 '17 at 19:22
  • @mc.android.developer, do you mean 10% over the whole second (which means a call can be offset by 0.1 s in time) or 10% of the average time gap between calls (which means a 0.2 us offset)? Anyway, you are asking too much of an embedded device if those are your requirements. You would probably do better to have hardware provide such a service. – Luis Colorado Oct 18 '17 at 05:50
  • @mc.android.developer, your assertion that _it's absolutely correct_ is, at the least, a little presumptuous. Don't ask questions on SO if you are so sure of how well you do things. In a heavily loaded system (as yours will be, if you try to do some processing at a microsecond pace), measuring time delays from inside the system has a quantum-mechanics-style uncertainty problem: the measuring device is affected by the thing you want to measure and gives incorrect results. – Luis Colorado Oct 18 '17 at 06:13
  • @LuisColorado 1. 10% over the whole second; 2. I didn't mention embedded device. I'm trying to make it work on regular PC with Intel Core i7-3632QM, min/max: 1200/3200 MHz, 6144 KB cache; Do you think it isn't good? What hardware is sufficient to do that? – mc.dev Oct 19 '17 at 00:45
  • @mc.android.developer sorry, first, I deduced from your nickname that you're working on Android. Second, I think the trick will be in the extra free time you gain by spreading the load between threads; this means you need cores to run those threads. My recommendation is to use as many cores as you can. The application (I suppose) is not memory intensive, so you'll get the maximum throughput if you prepare as many threads as your CPU has cores (I can't tell from your reference, but could it be eight?); then you get the maximum time for processing between timed calls. – Luis Colorado Oct 20 '17 at 08:50
  • @mc.android.developer, also, if you are going to do disk intensive management (e.g. storage), perhaps you can start a new thread for each timing slot and let it do the storage intensive tasks without stopping the timing processes. You'll have to fine tune your application before it works well. – Luis Colorado Oct 20 '17 at 08:53

2 Answers


I have no experience with this problem myself, but I guess (as the reference below explains) that the OS scheduler interferes with your callback somehow. So you could try using the real-time scheduler and raising the priority of your task.

Hope this gives you a direction to find your answer.

Scheduler: http://gumstix.8.x6.nabble.com/High-resolution-periodic-task-on-overo-td4968642.html

Priority: https://linux.die.net/man/3/setpriority
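
As a rough sketch of the real-time scheduler suggestion, assuming Linux with POSIX threads: the helper names `requestRealtimeScheduling` and `describeResult` are made up for illustration, while `pthread_setschedparam`, `sched_get_priority_min`, `sched_get_priority_max` and `SCHED_FIFO` are the standard POSIX interfaces. Switching to a real-time policy usually requires root, CAP_SYS_NICE or a raised RLIMIT_RTPRIO, so the sketch reports the error code instead of aborting.

```cpp
#include <pthread.h>
#include <sched.h>

#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper (not from the answer): ask the kernel to schedule the
// calling thread under SCHED_FIFO at the given priority, clamped to the
// range the system allows. Returns 0 on success or an errno-style code.
int requestRealtimeScheduling(int priority) {
  const int lo = sched_get_priority_min(SCHED_FIFO);
  const int hi = sched_get_priority_max(SCHED_FIFO);
  assert(lo != -1 && hi != -1);  // -1 would mean the policy is unknown
  if (priority < lo) priority = lo;
  if (priority > hi) priority = hi;

  sched_param param{};
  param.sched_priority = priority;
  // pthread_setschedparam returns the error number directly (it does not set errno).
  return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
}

// Hypothetical helper: human-readable summary of the result, for logging.
std::string describeResult(int rc) {
  if (rc == 0) return "running under SCHED_FIFO";
  if (rc == EPERM) return "not permitted (needs root, CAP_SYS_NICE or RLIMIT_RTPRIO)";
  return std::string("failed: ") + std::strerror(rc);
}
```

It would presumably be called at the top of the thread that runs `ioservice.run()`, for example `int rc = requestRealtimeScheduling(50);`. A non-zero `rc` means the thread keeps its normal scheduling policy. Whether this is enough to reach the target frequency is not something the sketch can promise.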

Dangraf

If you need one call every two microseconds, you'd better tie the calls to absolute time positions and not let the time each call takes shift the schedule. You will still run into the problem that the processing required in each timeslot may need more CPU time than the timeslot provides.
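
To make the difference concrete, here is a small deterministic simulation rather than a real timer. It assumes, purely for illustration, that every wake-up is late by the same constant amount; the function name `callsInWindow` and the 11 usec lateness used in the note after the code are hypothetical, not measured values.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many callbacks fire within `windowUs` microseconds when every
// wake-up is `latenessUs` late (a hypothetical, constant scheduling latency).
//
// relative == true : next deadline = actual wake-up time + period,
//                    which is what expires_from_now(timeout) amounts to
// relative == false: next deadline = previous deadline + period (absolute schedule)
std::uint64_t callsInWindow(std::uint64_t periodUs, std::uint64_t latenessUs,
                            std::uint64_t windowUs, bool relative) {
  assert(periodUs > 0);
  std::uint64_t deadline = periodUs;  // first deadline, armed at t = 0
  std::uint64_t calls = 0;
  while (true) {
    const std::uint64_t wake = deadline + latenessUs;  // when the callback really runs
    if (wake > windowUs) break;
    ++calls;
    deadline = (relative ? wake : deadline) + periodUs;
  }
  return calls;
}
```

With a 100 usec period (10000 Hz) and an assumed 11 usec lateness, `callsInWindow(100, 11, 1000000, true)` gives 9009 calls in one second, roughly the 10% loss reported in the question, while `callsInWindow(100, 11, 1000000, false)` gives 9999. With the Boost.Asio timer from the question, the absolute variant would presumably mean rearming with `timer.expires_at(timer.expires_at() + timeout)` instead of `timer.expires_from_now(timeout)`. This only helps while the lateness plus the callback time stays below the period.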

If you have a multicore CPU, I'd divide the timeslots among the cores (in a multithreaded approach) so that the interval is longer for each core. Suppose you have a four-core CPU: then each thread only has to execute 1 call per 8 usec, which is probably more affordable. In this case you use absolute timers (an absolute timer is one that waits until the clock reaches a specific absolute time, rather than for a delay measured from the moment you called it) and offset them by the thread number times the 2 usec delay. With 4 cores you start thread #1 at time T, thread #2 at time T + 2 usec, thread #3 at time T + 4 usec, ... and thread #N at time T + 2*(N-1) usec. Each thread then restarts itself at time oldT + 2*N usec (oldT + 8 usec with four threads), instead of doing some kind of nsleep(3) call. This way the processing time does not accumulate into the delay, which is most probably what you are experiencing. The pthread library timers are all absolute-time timers, so you can use them. I think this is the only way you'll be able to meet such a hard spec (and be prepared to see the battery suffer, assuming you're in an Android environment).
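
The staggered schedule could be sketched as follows. This is only an illustration under assumptions: it uses `std::this_thread::sleep_until` on `std::chrono::steady_clock` instead of the pthread timers mentioned above, and the names `deadlineFor` and `runStaggered` are invented here. Sleeping is unlikely to be accurate at a 2 usec granularity on a stock kernel, so the sketch shows how the deadlines are laid out, not that they can be met.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

using Clock = std::chrono::steady_clock;

// Absolute deadline of the k-th call (k = 0, 1, ...) made by thread `index`
// (0-based) when `threads` threads share one stream of slots of length `slot`.
// Thread i owns slots i, i + threads, i + 2*threads, ...
Clock::time_point deadlineFor(Clock::time_point start, Clock::duration slot,
                              unsigned threads, unsigned index, unsigned k) {
  const auto slotNumber = static_cast<Clock::duration::rep>(k) * threads + index;
  return start + slot * slotNumber;
}

// Runs `callback` `callsPerThread` times on each of `threads` threads, every
// call waiting for its own absolute deadline. `callback` runs on several
// threads and must be thread-safe. Returns the total number of calls made.
unsigned runStaggered(unsigned threads, Clock::duration slot,
                      unsigned callsPerThread,
                      const std::function<void()>& callback) {
  assert(threads > 0);
  std::atomic<unsigned> calls{0};
  // Start slightly in the future so every thread is running before slot 0.
  const auto start = Clock::now() + std::chrono::milliseconds(5);
  std::vector<std::thread> pool;
  for (unsigned i = 0; i < threads; ++i) {
    pool.emplace_back([&, i] {
      for (unsigned k = 0; k < callsPerThread; ++k) {
        // A deadline already in the past returns immediately, so lateness in
        // one call is not carried over into the next deadline.
        std::this_thread::sleep_until(deadlineFor(start, slot, threads, i, k));
        callback();
        ++calls;
      }
    });
  }
  for (auto& t : pool) t.join();
  return calls.load();
}
```

With four threads and a 2 usec slot, thread 0 gets deadlines T, T + 8 usec, T + 16 usec and so on, and thread 1 gets T + 2 usec, T + 10 usec and so on. A call such as `runStaggered(4, std::chrono::microseconds(2), 125000, work)` would then attempt 500000 calls of a hypothetical `work` function per second across the four threads.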

NOTE

In this approach the external bus can be a bottleneck, so even if you get it working, it would probably be better to synchronize several machines with NTP (this can be done to the usec level at the speed of current GBit links) and use different processors running in parallel. As you don't describe anything about the process you have to repeat so densely, I cannot provide more help with the problem.

Luis Colorado