5

I have a robot running control code with real time priority on a PREEMPT-RT patched Linux OS on a Beaglebone Black. All code is written in C and is running at 500Hz.

I've noticed latency in the range of a few hundred milliseconds every so often when running the code and I've tracked it down to the data logging function I wrote. This latency causes my robot's control to fail as I have a lot depending on the real-time functionality.

The relevant portion of code is below. I've cut a lot of code for clarity, but I'll edit this post if anything is needed.

FILE *file;

int main(int argc, char** argv) {
    file = fopen(logname, "w");

    while (1) {
        /* Control code stuff*/

        logData();

        time_msec = time_msec + controlLoopTime;
    }
}

void logData() {
    if (time_msec - logTimer_msec >= LOG_TIMER) {
        logTimer_msec = time_msec;

        if (!bLogCreated) {
            fprintf(file,
                    "SensorData1 SensorData2 SensorDataN"
                    );
            bLogCreated = TRUE;
        }

        // log data to file
        fprintf(file,

                "%.2f %.2f\n",

                sensorData1, sensorData2, sensorDataN
        );
    }
}

I need to log data from multiple variables (probably 20-50) at a good rate, maybe 100-125Hz. Data doesn't need to be logged at the control rate (every 2ms), but I've decreased it to 12ms and I still see latency spikes every few minutes.

The latency may be an issue with the fprintf call. Is this a limitation of the BeagleBone Black, my code, or just the nature of data logging?

A similar question was asked here but didn't seem to address my issue: Finding latency issues (stalls) in embedded Linux systems

Community
  • 1
  • 1
jekso
  • 81
  • 8
  • 1
    A non-tty file handle is block buffered by default. So after `fprintf`ing 4096 bytes the buffer will be flushed to disk which may cause the latency. Can't you outsource the actual logging to a separate thread? Queue logmessages and dequeue them in your logthread. – Ctx Feb 01 '16 at 23:21
  • Is this code running as part of a normal process? Is it controlling any mutexes, semaphores, etc. which would block a device driver? – wallyk Feb 01 '16 at 23:35
  • @jekso: Create a logging (p)thread and use something like a double linked list as a queue (enqueue at head, dequeue at the end) and pthread_cond_wait/notify to signal new entries. – Ctx Feb 01 '16 at 23:44
  • @wallyk As far as I understand your question, this code isn't blocking any device drivers. It is the only process that I am running that would write to memory. I can post more of the relevant portions of the code if needed. – jekso Feb 01 '16 at 23:56
  • Does the code make any other system-related calls (like open, close, gettimeofday, etc.)? We don't need to see the logic, just its interaction with the system. The obvious question is how does the logger get the data it is logging? – wallyk Feb 02 '16 at 00:01
  • @wallyk It opens the data logging file before the control loop starts and closes it when the control loop ends. It calls `clock_gettime` a few times during the code. It reads from the ADC and GPIO on the Beaglebone, and a couple serial devices. – jekso Feb 02 '16 at 00:16
  • Are there any guarantees on the performance of file IO? I was under the impression that writing to a file could not be done in a hard realtime system under Linux. If there are no guarantees, then you will need to move the logging code to another thread and use some sort of realtime-safe communication channel to communicate between the control loop and the logging thread. – crosstalk Feb 02 '16 at 01:49

1 Answers1

2

Using fprintf is a huge time sink, particularly for R/T logging. Do the logging in binary and write a utility to print it out later.

Instead of:

fprintf(file,"%.2f %.2f %.2f",data1,data2,data3);

Do:

fwrite(&data1,sizeof(double),1,file);
fwrite(&data2,sizeof(double),1,file);
fwrite(&data3,sizeof(double),1,file);

Even better:

struct data {
    double data1;
    double data2;
    double data3;
    time_t event_time;
    ...
};

struct data data;

fwrite(&data,sizeof(struct data),1,file);

If it's still too slow, append the struct to a ring queue and have a separate thread write out the entries.

If the disk write can't keep up with the [now] binary data, maintain the ring queue and only dump out the queue post-mortem if you detect a fatal error


Also, consider using mmap to access the file when writing. See my answer [with benchmarks] here: read line by line in the most efficient way *platform specific*

Community
  • 1
  • 1
Craig Estey
  • 30,627
  • 4
  • 24
  • 48
  • This won't address the underlying issue but only decrease its occurrences perhaps by a factor of 4 or 10, but it will still occur. A simpler mechanism to do the same thing is calling [`setvbuf()`](http://man7.org/linux/man-pages/man3/setbuf.3.html) with a large buffer (like 16 megabytes) allocated. – wallyk Feb 01 '16 at 23:40
  • Thank you for the response. So `fwrite` would inherently be faster, even though it's still writing to a file in the middle of my real time process? Edit: I just saw wallyk's comment above and that is what I was concerned about. I've decreased the amount of data being written a few times and it only decreased the frequency of the latency spikes, but didn't eliminate them altogether. – jekso Feb 01 '16 at 23:49
  • Yes. `fprintf` is _very_ slow (e.g. 100x slower), particularly with floating point. I've shipped production R/T systems with similar logging requirements, so I'm speaking from experience here. If, after doing the binary write to file change, if it's still too slow, write the data to a memory ring buffer and dump out the last N entries when you detect an error – Craig Estey Feb 01 '16 at 23:54
  • @jekso: What hardware is this running on? Does it has floating point hardware? – wallyk Feb 02 '16 at 00:03
  • @wallyk It's not the disk write. It's the amount of time an `fprintf` takes. – Craig Estey Feb 02 '16 at 00:04
  • @wallyk It's running on a Beaglebone Black, which has this ARM processor http://www.ti.com/product/am3358 – jekso Feb 02 '16 at 00:05