I'm implementing a Linux device driver for a data acquisition device that constantly streams data into a circular buffer I've allocated in the kernel (using __get_free_pages()). The circular buffer (which is written to by PCIe hardware) resides in RAM, and I want userspace to be able to mmap() that RAM region so that userspace may read its contents.
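For context, the allocation looks roughly like this. This is only a sketch with hypothetical names (RING_CHUNK_ORDER, ring_alloc); note that a single __get_free_pages() call is capped at MAX_ORDER, so a 16 MB ring is typically built from several smaller physically contiguous chunks:

```c
#include <linux/gfp.h>
#include <linux/mm.h>

/* Illustrative sketch: with 4 KB pages, order 8 gives a 1 MB chunk;
 * a 16 MB ring would be an array of 16 such chunks, since a single
 * __get_free_pages() allocation cannot exceed MAX_ORDER. */
#define RING_CHUNK_ORDER 8
#define RING_NCHUNKS     16

static unsigned long ring_chunks[RING_NCHUNKS];

static int ring_alloc(void)
{
	int i;

	for (i = 0; i < RING_NCHUNKS; i++) {
		ring_chunks[i] = __get_free_pages(GFP_KERNEL, RING_CHUNK_ORDER);
		if (!ring_chunks[i])
			goto fail;
	}
	return 0;
fail:
	while (i--)
		free_pages(ring_chunks[i], RING_CHUNK_ORDER);
	return -ENOMEM;
}
```

The hardware is then programmed with the physical address of each chunk (e.g. via a descriptor list), which is omitted here.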
According to LDD3:
An interesting limitation of remap_pfn_range is that it gives access only to reserved pages and physical addresses above the top of physical memory. ... Therefore, remap_pfn_range won't allow you to remap conventional addresses, which include the ones you obtain by calling get_free_page. ... The way to map real RAM to user space is to use vm_ops->nopage to deal with page faults one at a time.
In my case, I know exactly which physical addresses need to be mapped to which VMA locations for the entirety of the buffer at the moment mmap() is called, so why do I have to use the nopage() approach of faulting in the pages one at a time as they're accessed?
Why can't I just set up my VMA so that the entirety of my ring buffer is mapped into the user's address space immediately? Is there a way to do this?
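For reference, the fault-one-page-at-a-time approach I'm trying to avoid looks roughly like this in modern kernels, where the .fault callback replaced nopage. This is a sketch under my assumptions: ring_page_at() is a hypothetical helper returning the struct page backing a given buffer offset, and RING_SIZE is the total buffer size:

```c
#include <linux/mm.h>
#include <linux/fs.h>

/* Sketch of a fault handler: hand the faulting page to the VM core
 * one page at a time, as LDD3's nopage discussion describes. */
static vm_fault_t ring_vma_fault(struct vm_fault *vmf)
{
	unsigned long offset = vmf->pgoff << PAGE_SHIFT;
	struct page *page;

	if (offset >= RING_SIZE)
		return VM_FAULT_SIGBUS;

	page = ring_page_at(offset);	/* hypothetical: offset -> struct page */
	get_page(page);			/* take a reference for the mapping */
	vmf->page = page;
	return 0;
}

static const struct vm_operations_struct ring_vm_ops = {
	.fault = ring_vma_fault,
};

static int ring_mmap(struct file *filp, struct vm_area_struct *vma)
{
	vma->vm_ops = &ring_vm_ops;	/* no pages mapped up front */
	return 0;
}
```

With this scheme nothing is mapped at mmap() time; every first touch of a page traps into ring_vma_fault(), which is exactly the per-page overhead I'm asking about.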
I also expect that the userspace program will access my buffer sequentially, resulting in a performance hit when my nopage() function is invoked every time a page boundary is crossed. Does this cause a considerable performance hit in practice? (My buffer is large, say 16 MB.)
(Notably, I've used remap_pfn_range() on memory returned from __get_free_pages() in one of my previous device drivers and never had any issues, but I may have just gotten lucky on that system.)
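What I did in that earlier driver was roughly the following (a sketch; ring_buf is a hypothetical kernel virtual address from __get_free_pages(), and historically each page was also marked with SetPageReserved() so that remap_pfn_range() would accept it, per the LDD3 restriction quoted above):

```c
#include <linux/mm.h>
#include <linux/fs.h>
#include <asm/io.h>

/* Sketch: map the whole physically contiguous buffer into the
 * caller's VMA in one shot at mmap() time. */
static int ring_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;
	unsigned long pfn  = virt_to_phys((void *)ring_buf) >> PAGE_SHIFT;

	if (size > RING_SIZE)		/* don't map past the buffer */
		return -EINVAL;

	return remap_pfn_range(vma, vma->vm_start, pfn, size,
			       vma->vm_page_prot);
}
```

This maps the entire buffer eagerly, with no per-page faults afterward, which is the behavior I'd like to get legitimately.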