
Typically, to let user-mode code perform DMA, the device driver calls dma_alloc_coherent() to pre-allocate a chunk of memory at load time (i.e., when the kernel module/driver is loaded); in other words, at boot time. Then, in the mmap() implementation, we can take the kernel logical address returned by dma_alloc_coherent(), compute the page frame number, and pass it to remap_pfn_range().
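
To make this concrete, here is a minimal sketch of that load-time pattern. The names (my_alloc, my_mmap, BUF_SIZE) are placeholders, and I am aware that dma_mmap_coherent() is often preferred over going through virt_to_phys()/remap_pfn_range() by hand:

```c
#include <linux/dma-mapping.h>
#include <linux/fs.h>
#include <linux/io.h>
#include <linux/mm.h>

#define BUF_SIZE (16 * PAGE_SIZE)   /* assumed buffer size */

static void *coherent_buf;          /* kernel logical address */
static dma_addr_t coherent_handle;  /* bus address for the device */

/* Module load time: pre-allocate the DMA buffer. */
static int my_alloc(struct device *my_dev)
{
	coherent_buf = dma_alloc_coherent(my_dev, BUF_SIZE,
					  &coherent_handle, GFP_KERNEL);
	return coherent_buf ? 0 : -ENOMEM;
}

/* mmap(): derive the PFN from the logical address and remap it. */
static int my_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;
	unsigned long pfn = virt_to_phys(coherent_buf) >> PAGE_SHIFT;

	if (size > BUF_SIZE)
		return -EINVAL;

	return remap_pfn_range(vma, vma->vm_start, pfn, size,
			       vma->vm_page_prot);
}
```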

The above can be considered allocating the DMA buffer at boot time.

What if I would like to allocate the DMA buffer at runtime?

In other words, when the user calls mmap() and passes in the size of the region, the driver code would call __get_free_pages() or alloc_pages() to obtain contiguous pages, and then create and add the mapping to the page table.

When creating the mapping, I found one API that handles mapping more than one page: vm_map_pages() (from the answer to this question).
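
For reference, here is a minimal sketch of the runtime approach, assuming the whole request can still be satisfied by one contiguous allocation. my_mmap() is a placeholder name, and a real driver would also free the pages from the vma's close/release path, which I omit here:

```c
#include <linux/fs.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/slab.h>

static int my_mmap(struct file *filp, struct vm_area_struct *vma)
{
	unsigned long npages = vma_pages(vma);  /* size chosen by the user */
	unsigned int order = get_order(npages << PAGE_SHIFT);
	struct page *first, **pages;
	unsigned long i;
	int ret;

	/*
	 * One physically contiguous chunk of 2^order pages. __GFP_COMP
	 * makes it a compound page, so the per-subpage refcounting done
	 * by the mapping code resolves to the head page (see the comment
	 * thread below).
	 */
	first = alloc_pages(GFP_KERNEL | __GFP_COMP, order);
	if (!first)
		return -ENOMEM;

	/* vm_map_pages() takes an array of page pointers ... */
	pages = kmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
	if (!pages) {
		__free_pages(first, order);
		return -ENOMEM;
	}
	for (i = 0; i < npages; i++)
		pages[i] = first + i;

	/* ... and maps them into the vma; assumes mmap offset 0. */
	ret = vm_map_pages(vma, pages, npages);
	kfree(pages);
	if (ret)
		__free_pages(first, order);
	return ret;
}
```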

My question is:

At runtime, a request for a large number of contiguous pages may fail, so instead of a single contiguous chunk of memory we may end up with a scattered list of memory regions. In that case, the user could access those regions using readv() or writev().
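
Here is a user-space sketch of what I mean by gathering into scattered regions with readv(). Two ordinary malloc()ed buffers stand in for the mapped DMA regions, since obtaining the regions' real addresses is exactly what I am asking about:

```c
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
	char *region0 = malloc(4096);	/* stand-in for the 1st chunk */
	char *region1 = malloc(8192);	/* stand-in for the 2nd chunk */
	struct iovec iov[2] = {
		{ .iov_base = region0, .iov_len = 4096 },
		{ .iov_base = region1, .iov_len = 8192 },
	};

	/* A single call fills both regions back to back. */
	ssize_t n = readv(STDIN_FILENO, iov, 2);

	free(region0);
	free(region1);
	return n < 0;
}
```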

However, for the user to use readv() and writev(), the user has to know the virtual address at which each of those regions starts. How could we obtain those virtual addresses in kernel space?

The vm_area_struct structure has a field vm_next that points to the next vm_area_struct. My current implementation is: for each run of pages obtained from alloc_pages(), create a mapping using vm_map_pages(). However, tracing through the code of vm_map_pages(), I did not see it create a new vm_area_struct and append it to the current one. That is why I am confused: if the linked list of vm_area_struct structures has only one element, how can we obtain the virtual addresses of those memory regions?
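
To illustrate how I picture the mapping inside the one vma that mmap() receives, here is a sketch using vm_insert_page() per page (as in Linus's post linked in the comments below). The struct my_chunk bookkeeping is hypothetical, and note that no second vm_area_struct ever appears:

```c
#include <linux/mm.h>

struct my_chunk {		/* hypothetical per-chunk record */
	struct page *first;	/* head of one alloc_pages() run */
	unsigned long npages;	/* pages in that run */
};

static int map_chunks(struct vm_area_struct *vma,
		      struct my_chunk *chunks, int nchunks)
{
	unsigned long uaddr = vma->vm_start;
	unsigned long i;
	int c, ret;

	for (c = 0; c < nchunks; c++) {
		for (i = 0; i < chunks[c].npages; i++) {
			/* Fails with -EFAULT if uaddr leaves the vma. */
			ret = vm_insert_page(vma, uaddr,
					     chunks[c].first + i);
			if (ret)
				return ret;
			uaddr += PAGE_SIZE;
		}
		/* Chunk c starts at vma->vm_start plus the combined
		 * size of all earlier chunks. */
	}
	return 0;
}
```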

How could I obtain the user space's virtual addresses for each scattered mapping?

Ethan L.
  • To avoid complicating the user space code, you could arrange for the buffer to appear contiguous in the process's virtual address space. – Ian Abbott Mar 07 '22 at 13:30
  • Memory allocated by `__get_free_pages` or `alloc_pages` (and using the `GFP_DMA` flag) is suitable for streaming DMA mappings but not consistent DMA mappings. – Ian Abbott Mar 07 '22 at 13:35
  • @IanAbbott: Thanks for your reply. According to this [reply by Linus](https://lkml.org/lkml/2006/3/16/170), at the bottom he gives an example using ```__get_free_pages``` and ```vm_insert_page``` in a loop. I tested it in the ```mmap``` implementation of my kernel module, but it failed even when I only tried to allocate 16 pages (I passed in order 4), regardless of whether the flag was ```GFP_DMA``` or ```GFP_KERNEL```. Since that reply is from 2006, may I know if you have any idea whether the usage has changed since then? – Ethan L. Mar 07 '22 at 16:38
  • Linus showed using `vm_insert_page` with a single page and with a compound page of order 4, but not with a non-compound allocation of order 4. Also, forget the `GFP_DMA` flag - that was my mistake! – Ian Abbott Mar 07 '22 at 17:03
  • @IanAbbott: Thanks! I called ```__get_free_pages``` with the same flags as in the example, ```GFP_USER | __GFP_COMP```. I also tried ```GFP_KERNEL | __GFP_COMP```, but neither worked. The return value of each ```vm_insert_page``` call was -14, i.e. ```-EFAULT```. In my opinion, this should not happen if my code is strictly the same as Linus's example. That is why I am really confused. – Ethan L. Mar 07 '22 at 17:25

0 Answers