I am working in an embedded Linux environment debugging a highly timing sensitive issue related to the pairing/binding of Zigbee devices.
Our architecture is such that data read from Zigbee Front End Module via SPI interface and then passed from Kernel space to user space for processing. The processed data and response is then passed back to kernel space and clocked out over the SPI interface again.
The Zigbee 802.15.4 timing requirements specifies that we need to respond within 19.5ms and we frequently have situations where we respond just outside of this window which results in a failure and packet loss on the network.
The Linux kernel is not running with pre-emption enabled and it may not be possible to enable preemption either.
My suspicion is that since the kernel is not preemptible there is another task/process which is using the ioctl() interface and this holds off the Zigbee application just long enough that the 19.5ms window is exceeded.
I have tried the following tools
- oprofile - not much help here since it profiles the entire system and the application is not actually very busy during this time since it moves such small amounts of data
- strace - too much overhead, I don't have much experience using it though so maybe the output can be refined. The overhead affects the performance so much that the application does not funciton at all
Are there any other lightweight methods of profiling a system like this?
Is there anyway to catch when an ioctl call is pended on another task/thread? (assuming this is the root cause of the issue)