The documentation on the Streaming DMA API mentions that, in order to ensure consistency, the cache needs to be flushed before DMA-mapping a buffer to the device, and invalidated after unmapping it from the device.
However, I am confused about whether the flush and invalidate need to be performed explicitly, i.e., do the functions dma_map_single() and dma_sync_single_for_device() already take care of flushing the cache lines, or does the driver developer need to call some function to explicitly flush the cache lines of the DMA buffer? The same goes for dma_unmap_single() and dma_sync_single_for_cpu(): do these two functions automatically invalidate the DMA buffer's cache lines?
I skimmed through some existing drivers that use streaming DMA, and I can't see any explicit calls to flush or invalidate the cache lines.
I also went through the kernel source code, and it seems that the above-mentioned functions all 'invalidate' the cache lines in their architecture-specific implementations, which further adds to my confusion. For example, in arch/arm64/mm/cache.S:
SYM_FUNC_START_PI(__dma_map_area)
    add     x1, x0, x1
    cmp     w2, #DMA_FROM_DEVICE
    b.eq    __dma_inv_area
    b       __dma_clean_area
SYM_FUNC_END_PI(__dma_map_area)
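If I read the assembly correctly, it corresponds roughly to the C below. This is only a userspace paraphrase to show the branch logic, not kernel code: `model_dma_map_area`, `struct dma_area_op` and `enum cache_op` are names I made up for illustration, and only the `dma_data_direction` values are taken from the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Direction values as defined in include/linux/dma-direction.h. */
enum dma_data_direction {
    DMA_BIDIRECTIONAL = 0,
    DMA_TO_DEVICE = 1,
    DMA_FROM_DEVICE = 2,
    DMA_NONE = 3,
};

/* Hypothetical labels for the two branch targets in the snippet. */
enum cache_op {
    CACHE_OP_CLEAN,      /* stands in for __dma_clean_area */
    CACHE_OP_INVALIDATE, /* stands in for __dma_inv_area */
};

/* Hypothetical result type: the address range and the operation chosen. */
struct dma_area_op {
    uintptr_t start;
    uintptr_t end;
    enum cache_op op;
};

/* Models the control flow of __dma_map_area(start, size, dir). */
static struct dma_area_op model_dma_map_area(uintptr_t start, size_t size,
                                             enum dma_data_direction dir)
{
    struct dma_area_op r;

    r.start = start;
    r.end = start + size;            /* add x1, x0, x1 */

    if (dir == DMA_FROM_DEVICE)      /* cmp w2, #DMA_FROM_DEVICE */
        r.op = CACHE_OP_INVALIDATE;  /* b.eq __dma_inv_area */
    else
        r.op = CACHE_OP_CLEAN;       /* b __dma_clean_area */

    return r;
}
```

Read this way, the map path appears to invalidate only for DMA_FROM_DEVICE and to take the clean path for every other direction, which is part of what I would like to have confirmed.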
Can someone please clarify this? Thanks.