
I am trying to use cacheable mapped buffers in Linux user space. These buffers will be accessed by accelerators. On the ARMv7-A architecture, is there any way to flush/invalidate the data cache explicitly from Linux user space?

I tried __clear_cache(), but it didn't work. According to https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html, my understanding is that it flushes only the instruction cache.

User-space applications run in user mode. Do we need to set any privileged-mode permissions for cache operations?

More info will be helpful.

  • What are "the accelerators" - are you referring to DMA capable devices? If that is the case, how have you ended up with a cacheable mapping of a buffer for a non-coherent device anyway? – Notlikethat Dec 16 '15 at 09:32
  • If I use pgprot_noncached(), it is fine. To improve performance, I want to try using cacheable buffers. Not sure if it is possible or not – Jyothi Vemulapalli Dec 16 '15 at 10:23
  • I think it's a non-coherent device only, that's why I want to do explicit cache flush and invalidate operations. – Jyothi Vemulapalli Dec 16 '15 at 11:49
  • Possible duplicate of [How clear and invalidate ARM v7 processor cache from User Mode on Linux 2.6.35](http://stackoverflow.com/questions/6046716/how-clear-and-invalidate-arm-v7-processor-cache-from-user-mode-on-linux-2-6-35) – artless noise Dec 16 '15 at 14:06
  • You already asked [this question](http://stackoverflow.com/questions/34133612/cache-flush-and-invalidate-operations-from-linux-userspace-in-armv7-a) which was a duplicate. I updated [this question](http://stackoverflow.com/questions/22701352/flush-cpu-cache-for-a-region-of-address-space) with some info as I said in the comments in your first question (which was a duplicate). If you read the ARMv7-A TRM, you will see that the CP15 cache register can only be accessed from *super mode*. The only reliable way is a system call. If you found Linux doesn't support it, then there is no way. – artless noise Dec 16 '15 at 14:07
  • This [blog](https://community.arm.com/groups/processors/blog/2010/02/17/caches-and-self-modifying-code) specifically says that the linux syscall flushes the data cache; so either the blog writer is wrong, you have a bad version of Linux/glibc, or your code is wrong. Have you given anyone the information to figure out which one it is? http://stackoverflow.com/questions/22701352/flush-cpu-cache-for-a-region-of-address-space – artless noise Dec 16 '15 at 14:16
  • If you have devices doing their own DMA, that _has_ to be managed by drivers in the kernel, full stop. Userspace cache maintenance essentially covers making the data/instruction caches coherent, to handle stuff that userspace _can_ do like self-modifying code or JITs. It doesn't cover such fun as outer cache maintenance (which you probably need on Cortex-A9 if you have a PL310 L2 cache) and the other subtleties necessary to manage non-coherent DMA. You're then of course free to implement whatever method you like to communicate between your userspace component and your driver. – Notlikethat Dec 16 '15 at 23:01
  • Possible duplicate of [How to flush the CPU cache for a region of address space in Linux?](https://stackoverflow.com/questions/22701352/how-to-flush-the-cpu-cache-for-a-region-of-address-space-in-linux) – Ciro Santilli OurBigBook.com Aug 24 '17 at 06:31
  • Possible duplicate of [Cache flush and invalidate operations from linux userspace in ARMv7-A](https://stackoverflow.com/questions/34133612/cache-flush-and-invalidate-operations-from-linux-userspace-in-armv7-a) – Armali Nov 03 '17 at 10:43

2 Answers


On kernels up to 5.13.x, there is no way to flush an ARMv7-A/ARMv8-A processor cache from userspace without writing a kernel driver. A simple misc class driver is enough: it exposes an ioctl or sysfs action that causes the driver to call the kernel API arch_sync_dma_for_device for the area of RAM that you wish to flush.

See

#include <linux/dma-noncoherent.h>

for the function prototype for arch_sync_dma_for_device.

So unless the logistics of your project allow you to add a kernel module to the system, or to rebuild and replace the kernel, you can't flush the processor caches from a userspace application. For legacy projects with product in the field, or projects whose kernel version is locked by digital signing, the logistics usually do not support this type of invasive solution.

I have successfully demonstrated such a misc driver that flushes the processor caches on an IPQ ARMv8-A implementation for a new product design. The driver took me about two hours to write and test.
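For illustration only, the userspace half of such an arrangement might look roughly like the sketch below. Everything specific in it is an assumption and not taken from the driver described above: the request struct `cacheflush_range`, the ioctl number `CACHEFLUSH_IOC_CLEAN`, and the device path passed in are hypothetical and would have to match whatever the actual misc driver defines. The kernel side, which would translate the range and call arch_sync_dma_for_device, is not shown.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical request layout; it must match the driver's definition. */
struct cacheflush_range {
    uint64_t offset;   /* byte offset into the buffer the driver manages */
    uint64_t length;   /* number of bytes to clean */
};

/* Hypothetical ioctl number; a real driver would publish its own. */
#define CACHEFLUSH_IOC_MAGIC 'c'
#define CACHEFLUSH_IOC_CLEAN \
    _IOW(CACHEFLUSH_IOC_MAGIC, 1, struct cacheflush_range)

/*
 * Ask the (hypothetical) misc driver at dev_path to clean the given range.
 * Returns 0 on success, -1 on failure with errno set by open() or ioctl().
 */
int cacheflush_clean(const char *dev_path, uint64_t offset, uint64_t length)
{
    struct cacheflush_range req = { .offset = offset, .length = length };
    int fd = open(dev_path, O_RDWR);
    if (fd < 0)
        return -1;

    int rc = ioctl(fd, CACHEFLUSH_IOC_CLEAN, &req);
    int saved_errno = errno;   /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;

    return rc < 0 ? -1 : 0;
}
```

An application would call `cacheflush_clean()` with the misc device's path after the CPU has finished writing the buffer and before it starts the accelerator. Passing an offset and length, not a raw pointer, leaves the address translation to the driver, which is a design choice of this sketch and not a requirement.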

Jonathan Ben-Avraham

The __builtin___clear_cache function works in my case (Zynq MP, arm64 + Linux), but I think that is because I use mmapped memory from a custom Linux kernel driver module that allocates a DMA-coherent buffer (dma_alloc_coherent).

Edit: Back to this topic, the __builtin___clear_cache function works well in my case on a general /dev/mem mmapped DDR segment. I open /dev/mem without the O_SYNC flag.
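A minimal sketch of how the builtin is called over a buffer, assuming GCC or Clang. The helper name `fill_and_sync` is made up for this example, and the buffer here is any writable memory, not the /dev/mem mapping described above. What the builtin actually does is target-dependent (on some targets it compiles to nothing), and the GCC documentation describes it in terms of the instruction cache, so whether it is sufficient for a non-coherent DMA device is not guaranteed.

```c
#include <stddef.h>

/*
 * Fill buf[0..len) with value, then call the compiler builtin over the
 * same address range. The second argument is one past the last byte.
 */
void fill_and_sync(unsigned char *buf, size_t len, unsigned char value)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = value;

    /* Target-dependent cache maintenance over [buf, buf + len). */
    __builtin___clear_cache((char *)buf, (char *)buf + len);
}
```

In the setup described in this answer, `buf` would be the pointer returned by mmap() on the driver's device node or on /dev/mem, and the call would be made before handing the buffer to the device.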