Introduction:
We have an application in which Linux, running on an ARM, accepts data from an external processor that DMAs the data into the ARM's memory space. The ARM then needs to access that data from user-mode code.
The range of addresses must be physically contiguous, as the DMA engine in the external processor does not support scatter/gather. This memory range is initially allocated in the ARM kernel via a __get_free_pages(GFP_KERNEL | __GFP_DMA, order) call, which assures us that the allocated memory is physically contiguous. A virt_to_phys() call on the returned pointer then gives us the physical address, which is provided to the external processor at the beginning of the process.
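For reference, the relation between the buffer size and the "order" argument can be sketched as below. This is a user-space illustration only, assuming 4 KiB pages (the post does not state the page size). The names order_for_size and bytes_for_order are made up for this sketch; in the kernel, get_order() performs the equivalent calculation.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages (PAGE_SHIFT == 12); not stated in the post. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 * In the driver, the result would be the "order" argument of
 * __get_free_pages(GFP_KERNEL | __GFP_DMA, order).
 */
unsigned int order_for_size(size_t size)
{
    /* Round the size up to a whole number of pages. */
    size_t pages = (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
    unsigned int order = 0;

    /* Round the page count up to the next power of two. */
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* Bytes actually reserved for a given order. */
size_t bytes_for_order(unsigned int order)
{
    return SKETCH_PAGE_SIZE << order;
}
```

Under these assumptions, a 12 KiB buffer needs order 2, so 16 KiB of contiguous memory is reserved.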
This physical address is also known to the Linux user-mode code, which passes it to the mmap() API to get a user-mode pointer to this memory area. Our Linux kernel driver then sees a corresponding call to the mmap routine in its file_operations structure. The driver retains the vm_area_struct "vma" pointer passed to that mmap routine for later use.
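A minimal sketch of the user-mode side of this step, assuming the physical address is passed as the mmap() offset (the post does not say exactly how it is passed). The struct mapping type and the map_region/unmap_region helpers are hypothetical names for illustration. The helper rounds the address down to a page boundary, because mmap() requires a page-aligned offset.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one mapping. */
struct mapping {
    void  *base;    /* page-aligned address returned by mmap() */
    size_t length;  /* number of bytes actually mapped */
    void  *data;    /* pointer corresponding to the requested address */
};

/* Map len bytes starting at address phys through the open descriptor fd. */
int map_region(int fd, uint64_t phys, size_t len, struct mapping *m)
{
    uint64_t page = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t aligned = phys & ~(page - 1);     /* round down to a page */
    size_t slack = (size_t)(phys - aligned);   /* bytes skipped at the front */
    void *p;

    p = mmap(NULL, len + slack, PROT_READ, MAP_SHARED, fd, (off_t)aligned);
    if (p == MAP_FAILED)
        return -1;

    m->base = p;
    m->length = len + slack;
    m->data = (char *)p + slack;
    return 0;
}

/* Release a mapping created by map_region(). */
int unmap_region(struct mapping *m)
{
    return munmap(m->base, m->length);
}
```

With a buffer from __get_free_pages() the address is already page-aligned, so data and base coincide. The descriptor would come from open() on the driver's device node.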
When the user-mode code receives a signal that new data has been DMA'd to this memory address, it needs to access the data through the user-mode pointer obtained from the mmap() call mentioned above. Before it does so, of course, the cache corresponding to this memory range must be flushed. To accomplish this flush, the user-mode code calls the driver (via an ioctl), and in kernel mode a call to flush_cache_range() is made:
flush_cache_range(vma, start, end);
The arguments to the call above are "vma", which the driver captured when its mmap routine was called, and "start" and "end", the user-mode addresses passed into the driver from the user-mode code in a structure provided to the ioctl() call.
The Problem:
What we see is that the buffer does not seem to be getting flushed: accesses from user mode return what appears to be stale data. As a test, rather than getting the user-mode address from an mmap() call to our driver, we instead call the mmap() API on /dev/mem. In this case we get uncached access to the buffer (no flushing needed), and everything works perfectly.
Our kernel version is 3.8.3 and it's running on an ARM9. Is there a logical error in the approach we are attempting?
Thanks!