1

I am re-implementing mmap in a device driver for DMA.

I saw this question: Linux Driver: mmap() kernel buffer to userspace without using nopage that has an answer using vm_insert_page() to map one page at a time; hence, for multiple pages, needed to execute in a loop. Is there another API that handles this?

Previously I used dma_alloc_coherent to allocate a chunk of memory for DMA and used remap_pfn_range to build a page table that associates process's virtual memory to physical memory.

Now I would like to allocate a much larger chunk of memory using __get_free_pages with order greater than 1. I am not sure how to build page table in that case. The reason is as follows:

I checked the book Linux Device Drivers and noticed the following:

Background:

When a user-space process calls mmap to map device memory into its address space, the system responds by creating a new VMA to represent that mapping. A driver that supports mmap (and, thus, that implements the mmap method) needs to help that process by completing the initialization of that VMA.

Problem with remap_pfn_range:

remap_pfn_range won’t allow you to remap conventional addresses, which include the ones you obtain by calling get_free_page. Instead, it maps in the zero page. Everything appears to work, with the exception that the process sees private, zero-filled pages rather than the remapped RAM that it was hoping for.

The corresponding implementation using get_free_pages with order 0, i.e. only 1 page in scullp device driver:

The mmap method is disabled for a scullp device if the allocation order is greater than zero, because nopage deals with single pages rather than clusters of pages. scullp simply does not know how to properly manage reference counts for pages that are part of higher-order allocations.

May I know if there is a way to create VMA for pages obtained using __get_free_pages with order greater than 1?

I checked Linux source code and noticed there are some drivers re-implementing struct dma_map_ops->alloc() and struct dma_map_ops->map_page(). May I know if this is the correct way to do it?

Ethan L.
  • 395
  • 2
  • 8

1 Answers1

0

I think I got the answer to my question. Feel free to correct me if I am wrong.

I happened to see this patch: mm: Introduce new vm_map_pages() and vm_map_pages_zero() API while I was googling for vm_insert_page.

Previouly drivers have their own way of mapping range of kernel pages/memory into user vma and this was done by invoking vm_insert_page() within a loop.

As this pattern is common across different drivers, it can be generalized by creating new functions and use it across the drivers.

vm_map_pages() is the API which could be used to mapped kernel memory/pages in drivers which has considered vm_pgoff.

After reading it, I knew I found what I want.

That function also could be found in Linux Kernel Core API Documentation.

As for the difference between remap_pfn_range() and vm_insert_page() which requires a loop for a list of contiguous pages, I found this answer to this question extremely helpful, in which it includes a link to explanation by Linus.

As a side note, this patch mm: Introduce new vm_insert_range and vm_insert_range_buggy API indicates that the earlier version of vm_map_pages() was vm_insert_range(), but we should stick to vm_map_pages(), since under the hood vm_map_pages() calls vm_insert_range().

Ethan L.
  • 395
  • 2
  • 8