I'm implementing a Linux Device Driver for a data acquisition device which constantly streams data into a circular buffer I've allocated in the kernel (using __get_free_pages()). The circular buffer (which is written to by PCIe hardware) resides in RAM, and I want userspace to be able to mmap() that RAM region so that userspace may read its contents.
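For concreteness, here's a rough sketch of the kind of allocation I mean (names are illustrative; on typical configurations a single `__get_free_pages()` call can't return 16 MB, so the buffer is actually built from several smaller chunks):

```c
/* Rough sketch of the allocation (illustrative names only).  On
 * typical configurations a single __get_free_pages() call can't
 * return 16 MB, so the ring is built from smaller chunks. */
#include <linux/gfp.h>
#include <linux/mm.h>

#define CHUNK_SIZE	(1UL << 20)	/* 1 MB per chunk, for example */

static unsigned long ring_chunk;	/* kernel virtual address of one chunk */

static int ring_alloc_chunk(void)
{
	ring_chunk = __get_free_pages(GFP_KERNEL, get_order(CHUNK_SIZE));
	return ring_chunk ? 0 : -ENOMEM;
}
```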

According to LDD3:

An interesting limitation of remap_pfn_range is that it gives access only to reserved pages and physical addresses above the top of physical memory. ... Therefore, remap_pfn_range won’t allow you to remap conventional addresses, which include the ones you obtain by calling get_free_page. ... The way to map real RAM to user space is to use vm_ops->nopage to deal with page faults one at a time.

In my case, I know exactly what addresses will need to be mapped to the given VMA locations for the entirety of the buffer at the moment mmap() is called, so why do I have to use the nopage() approach of faulting in the pages one at a time as they're accessed?

Why can't I just set up my VMA so that the entirety of my ring buffer is mapped into the user's address space immediately? Is there a way to do this?

I also expect that the userspace program will access my buffer sequentially, resulting in a performance hit when my nopage() function is invoked every time a page boundary is crossed. Does this cause a considerable performance hit in practice? (My buffer is large, say 16 MB.)

(Notably, I've used remap_pfn_range() on memory returned from __get_free_pages() in one of my previous device drivers and never had any issues, but I may have just gotten lucky on that system.)

jeremytrimble
  • I'm confused. Are you asking how to use `mmap()` to *read* from your device, or how to use it to *write* the data which your driver collected in a ring buffer so user space code can access it? – Aaron Digulla Jan 14 '14 at 14:17
  • @AaronDigulla: Good point. I've edited the question to clarify. Thanks. – jeremytrimble Jan 14 '14 at 15:17
  • @BenVoigt: Although this question is related, it is not a duplicate of that question. – jeremytrimble Jan 14 '14 at 16:47
  • @jeremytrimble: It's the same question, even if not all the existing answers are useful to you. You should add your answer there. – Ben Voigt Jan 14 '14 at 17:07
  • It's not even close to a duplicate - the link doesn't even mention `get_free_pages`. This question is about the interaction of `get_free_pages` and `remap_pfn_range`. – EML Sep 20 '15 at 15:47

1 Answer


After a little more research, it looks like LDD3's statement is outdated, as per a (slightly more recent) LWN article:

TL;DR: In the past, drivers may have manually set PG_reserved on pages allocated by kmalloc()/__get_free_pages() and subsequently used remap_pfn_range(), but drivers should now use vm_insert_page() to do the equivalent thing.

vm_insert_page() apparently only works on order-0 (single-page) allocations, so if you want to allocate N pages you'll have to call vm_insert_page() N times.

An example of this usage can be seen in the Firewire driver: drivers/firewire/core-iso.c

Note how single pages are allocated by repeatedly calling alloc_page() in fw_iso_buffer_alloc(), and these pages are later mapped into a userspace VMA by repeatedly calling vm_insert_page() in fw_iso_buffer_map_vma(). (fw_iso_buffer_map_vma() is called by the mmap handler in drivers/firewire/core-cdev.c.)
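For illustration, here's a minimal sketch of the same pattern (names like `ring_mmap` and `RING_PAGES` are made up, and error unwinding is omitted), assuming the buffer is built from order-0 pages allocated with `alloc_page()`:

```c
/* Minimal sketch: allocate order-0 pages up front, then insert each
 * one into the user VMA from the driver's mmap handler.  Names are
 * hypothetical and error unwinding is trimmed. */
#include <linux/fs.h>
#include <linux/gfp.h>
#include <linux/mm.h>

#define RING_PAGES 4096				/* 16 MB with 4 KB pages */

static struct page *ring_pages[RING_PAGES];	/* filled at init time */

static int ring_alloc(void)
{
	int i;

	for (i = 0; i < RING_PAGES; i++) {
		ring_pages[i] = alloc_page(GFP_KERNEL);
		if (!ring_pages[i])
			return -ENOMEM;		/* real code would unwind */
	}
	return 0;
}

static int ring_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long uaddr = vma->vm_start;
	int i, ret;

	if (vma->vm_end - vma->vm_start > RING_PAGES * PAGE_SIZE)
		return -EINVAL;

	/* One vm_insert_page() call per order-0 page, as in the
	 * firewire driver's fw_iso_buffer_map_vma(). */
	for (i = 0; i < RING_PAGES && uaddr < vma->vm_end; i++) {
		ret = vm_insert_page(vma, uaddr, ring_pages[i]);
		if (ret)
			return ret;
		uaddr += PAGE_SIZE;
	}
	return 0;
}
```

Since every page is inserted up front in the mmap handler, no fault handler is needed and sequential reads from userspace never trap into the kernel on page-boundary crossings.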

jeremytrimble
  • Didn't know about `PG_reserved`. However, you can call `remap_pfn_range` and lock down the pages if you have a `vm_area_struct` and set `VM_RESERVED` in `vm_flags`. There are several ways to get the memory first - `pci_alloc_consistent` > `virt_to_phys` > set `VM_RESERVED` > `remap_pfn_range`, for example. – EML Sep 20 '15 at 15:53