My case involves a Xilinx Zynq SoC, but I think the question is more general than that. It's also a little ill-defined, so please bear with me as I learn enough to ask the right question.

I have data being generated by an FPGA which is DMA'd into system memory (into non-contiguous blocks using scatter-gather). Once the DMA is complete, the software augments the data (puts some networking header information around the blocks) and then causes those blocks to be written to a network socket (through a memory copy, though possibly using an mmap if I can wrangle a driver appropriately).

The DMA can be forced to be cache coherent using ARM's Accelerator Coherency Port, but that has the potential to compete with the CPU for resources. For the most part I don't need the data to be cache coherent, so I can use a non-cache-coherent bus, but for this to work the CPU needs to know that any cache lines referring to the memory region of interest are invalid.

My question then is broadly around whether I can get the compiler to do the "hard" work: If I label the memory that is being used as the DMA destination as volatile, does that always cause the backing memory to be read or written by the CPU with the cache bypassed? How is this generally achieved at the assembler level?

It's not obvious to me that such a strategy would interact cleanly with a subsequent memory copy. I think the issue is that I don't fully understand what labelling memory as volatile actually does behind the scenes, or equivalently, how one properly manages a cache from user-space programs.
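To make the question concrete, here is a minimal sketch of the kind of access I have in mind, using Rust's `std::ptr::read_volatile` to copy a DMA block into a packet buffer. The function name `copy_from_dma` and the plain arrays standing in for the DMA region are purely illustrative assumptions, not real driver code. As far as I understand it, the volatile load here only constrains the compiler; nothing in it tells the CPU to skip or invalidate its data cache.

```rust
use std::ptr;

/// Copy `src` into `dst` one word at a time using volatile loads.
///
/// `read_volatile` stops the compiler from eliding or merging the loads,
/// or reusing a value it already holds in a register. It emits an ordinary
/// load instruction, so whether the data comes from cache or from DRAM is
/// decided by the memory attributes of the page, not by this code.
fn copy_from_dma(src: &[u32], dst: &mut [u32]) {
    assert_eq!(src.len(), dst.len());
    for (i, out) in dst.iter_mut().enumerate() {
        // SAFETY: `i < src.len()`, and the pointer comes from a valid,
        // aligned slice, so the read is in bounds.
        *out = unsafe { ptr::read_volatile(src.as_ptr().add(i)) };
    }
}

fn main() {
    // Hypothetical stand-in for one scatter-gather block filled by the FPGA.
    let dma_block = [1u32, 2, 3, 4];
    let mut packet = [0u32; 4];
    copy_from_dma(&dma_block, &mut packet);
    assert_eq!(packet, dma_block);
}
```

If that understanding is right, a stale cache line covering `dma_block` would be read just as happily by the volatile load as by a normal one, which is why I suspect volatile alone is not the tool for this.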

I appreciate that bypassing the cache may not do what I expect, but I'd like to be able to perform the necessary investigations to understand where I can make useful optimisations.

For reference, the architecture is ARMv7-A and the implementation will be Rust, though I'm more than happy for things to be discussed in terms of the behaviour of C volatiles/semantics.

Henry Gomersall
  • Related questions: https://stackoverflow.com/questions/18695120/volatile-and-cache-behaviour https://stackoverflow.com/questions/18550784/does-volatile-qualifier-cancel-caching-for-this-memory https://stackoverflow.com/questions/7872175/c-volatile-variables-and-cache-memory – Sven Marnach May 11 '20 at 09:06
  • In short, `volatile` is not related to the hardware cache in any way. It's the job of the hardware to deal with the CPU cache. The compiler can generally optimise away some memory accesses if the value of a variable can be statically determined or is already in a register, and the compiler won't do this optimisation for volatile variables. This is unrelated to the CPU cache, though. – Sven Marnach May 11 '20 at 09:10
  • @SvenMarnach That [last link](https://stackoverflow.com/questions/7872175/c-volatile-variables-and-cache-memory) is the relevant nugget. Thanks! – Henry Gomersall May 11 '20 at 09:32
  • The ARM documentation has some useful info about [cache maintenance](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14802.html). – Henry Gomersall May 11 '20 at 09:52
  • @SvenMarnach yes it does. – Henry Gomersall May 11 '20 at 11:49