11

Is there a suitable way to get the physical address for a given logical (virtual) address, other than walking the page directory entries by hand? I've looked for this functionality in the kernel sources and found the follow_page function, which does it well and has built-in support for huge and transparent huge pages. But it's not exported to kernel modules (why???)...

So, I don't want to reinvent the wheel, and I don't think it's a good idea to reimplement the follow_page functionality by hand.

Ciro Santilli OurBigBook.com
Ilya Matveychikov
  • Why don't you use mmap() and ioremap() to read and write into physical memory? If this is not what you want, can you elaborate your purpose? – Pavan Manjunath Jun 06 '11 at 13:33
  • I've hooked the `page_fault` handler and am trying to play with user pages as they are allocated. So, when the exception occurs I need to know the exact physical page address and its size... – Ilya Matveychikov Jun 06 '11 at 13:39
  • The simplest answer is that there is no simple answer. That's because the very existence / persistence of a physical address for a user virtual address mapping isn't a given; it could be paged out or relocated by e.g. a copy-on-write at any moment. To make it "inspectable", the mapping has to be locked in some fashion, as mentioned e.g. by `ioremap()` or the like, to make it permanent. Even if you figure out a point-in-time value by a pagedir walk, how would you make sure some other kernel activity isn't changing it right after? – FrankH. Jun 06 '11 at 14:36
  • Well, to introduce some clarity... Imagine that you can hook the `page_fault` handler so that one part of your code runs before `do_page_fault` and another part runs after it. As you know, it is not possible to get a #PF before `do_page_fault`, as interrupts are disabled. As for the probability of the just-allocated page being paged out while we are still in the exception handler, I think that's a very, very theoretical situation, and as you mentioned, locking matters. So, with these assumptions, is there a simple way to convert a virtual address to a physical one? – Ilya Matveychikov Jun 06 '11 at 19:13

3 Answers

6

Well, it might look something like this (following the page tables down to the PTE for a virtual address):

/*
 * Walk the page tables of mm for a user virtual address and store the raw
 * entry that maps it in *entry (0 if nothing is mapped). A huge mapping is
 * reported as the raw pud/pmd value. Assumes the pgd/pud/pmd/pte layout
 * and that the caller already holds whatever lock find_vma() requires.
 */
void follow_pte(struct mm_struct * mm, unsigned long address, pte_t * entry)
{
    pgd_t * pgd = pgd_offset(mm, address);

    printk("follow_pte() for %lx\n", address);

    entry->pte = 0;
    if (!pgd_none(*pgd) && !pgd_bad(*pgd)) {
        pud_t * pud = pud_offset(pgd, address);
        struct vm_area_struct * vma = find_vma(mm, address);
        /* find_vma() may return NULL or a VMA that starts above address */
        int hugetlb = vma && vma->vm_start <= address &&
                      (vma->vm_flags & VM_HUGETLB);

        printk(" pgd = %lx\n", pgd_val(*pgd));

        if (pud_none(*pud)) {
            printk("  pud = empty\n");
            return;
        }
        /* hugetlbfs page mapped at the pud level */
        if (pud_huge(*pud) && hugetlb) {
            entry->pte = pud_val(*pud);
            printk("  pud = huge\n");
            return;
        }

        if (!pud_bad(*pud)) {
            pmd_t * pmd = pmd_offset(pud, address);

            printk("  pud = %lx\n", pud_val(*pud));

            if (pmd_none(*pmd)) {
                printk("   pmd = empty\n");
                return;
            }
            /* hugetlbfs page mapped at the pmd level */
            if (pmd_huge(*pmd) && hugetlb) {
                entry->pte = pmd_val(*pmd);
                printk("   pmd = huge\n");
                return;
            }
            /* transparent huge page */
            if (pmd_trans_huge(*pmd)) {
                entry->pte = pmd_val(*pmd);
                printk("   pmd = trans_huge\n");
                return;
            }
            if (!pmd_bad(*pmd)) {
                pte_t * pte = pte_offset_map(pmd, address);

                printk("   pmd = %lx\n", pmd_val(*pmd));

                if (!pte_none(*pte)) {
                    entry->pte = pte_val(*pte);
                    printk("    pte = %lx\n", pte_val(*pte));
                } else {
                    printk("    pte = empty\n");
                }
                /* pairs with pte_offset_map() above */
                pte_unmap(pte);
            }
        }
    }
}
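
The walk above only yields a raw entry; the question also asks for the physical address and the size of the mapping. As a rough sketch of that last step, here is a plain user-space C model of the arithmetic, not kernel code. It assumes the x86-64 entry layout (frame address in bits 12..51, mapping sizes of 4 KiB, 2 MiB and 1 GiB), and the names `map_level`, `mapping_size` and `entry_to_phys` are made up for illustration. Inside a module one would presumably rely on the kernel's own helpers such as `pte_pfn()` and `PAGE_SHIFT` instead of hard-coded masks.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86-64, where bits 12..51 of an entry hold the frame address. */
#define X86_64_PHYS_MASK UINT64_C(0x000ffffffffff000)

/* Level at which the walk stopped; the value is log2 of the mapping size. */
enum map_level { MAP_PTE = 12, MAP_PMD = 21, MAP_PUD = 30 };

/* Size in bytes of the page mapped at the given level. */
uint64_t mapping_size(enum map_level level)
{
    assert(level == MAP_PTE || level == MAP_PMD || level == MAP_PUD);
    return UINT64_C(1) << level;
}

/*
 * Combine a raw page-table entry with the virtual address it maps.
 * The frame base is the entry's address bits aligned down to the mapping
 * size (for huge entries this also drops flag bits such as PAT, which sit
 * above bit 11); the low bits of vaddr supply the offset within the page.
 */
uint64_t entry_to_phys(uint64_t entry, uint64_t vaddr, enum map_level level)
{
    uint64_t size = mapping_size(level);
    uint64_t base = entry & X86_64_PHYS_MASK & ~(size - 1);

    return base | (vaddr & (size - 1));
}
```

The level has to be tracked by the caller, because the raw value alone does not say whether it came from the pud, pmd or pte branch of the walk.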
Ilya Matveychikov
  • Could you elaborate a bit on this code? How does it differ from just copying the follow_page code itself into the module? – Nikratio Jul 04 '12 at 21:27
  • Right you are, it's just a simplified version of the `follow_page` code. You can try to call `follow_page` directly or copy its code into the module. – Ilya Matveychikov Jul 05 '12 at 08:22
3

I think you can achieve virtual->physical translation indirectly by combining /proc/[pid]/maps (which gives the virtual memory mappings of a process) and /proc/[pid]/pagemap (which gives the virtual page to physical page mapping for every addressable page). First, find the virtual address ranges of your process in maps, so that you don't have to search all of pagemap. Then look up the physical mapping of the desired virtual address in pagemap (pagemap is not a text file; its binary format is described in the kernel's pagemap documentation). This should give you the exact virtual-->physical mapping.
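
A minimal user-space sketch of the pagemap lookup might look like the following. It relies on the documented pagemap layout: one 64-bit entry per virtual page, bit 63 set when the page is present in RAM, and bits 0-54 holding the page frame number. The function and macro names are made up for illustration. Note that recent kernels report the PFN as zero to callers without CAP_SYS_ADMIN, so the sketch treats a zero PFN as "not available".

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGEMAP_PRESENT  (UINT64_C(1) << 63)        /* page is present in RAM */
#define PAGEMAP_PFN_MASK ((UINT64_C(1) << 55) - 1)  /* bits 0-54: PFN if present */

/* Byte offset of the 8-byte pagemap entry that describes vaddr. */
uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    assert(page_size != 0);
    return (vaddr / page_size) * sizeof(uint64_t);
}

/*
 * Decode a pagemap entry. Returns 0 and stores the physical address in
 * *phys, or -1 if the page is not present or the PFN is hidden (zero).
 */
int pagemap_entry_to_phys(uint64_t entry, uint64_t vaddr,
                          uint64_t page_size, uint64_t *phys)
{
    uint64_t pfn = entry & PAGEMAP_PFN_MASK;

    if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
        return -1;
    *phys = pfn * page_size + vaddr % page_size;
    return 0;
}

/* Read the raw entry for vaddr from a pagemap file, e.g. /proc/self/pagemap. */
int pagemap_read_entry(const char *path, uint64_t vaddr, uint64_t *entry)
{
    long page_size = sysconf(_SC_PAGESIZE);
    ssize_t n;
    int fd;

    if (page_size <= 0)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = pread(fd, entry, sizeof(*entry),
              (off_t)pagemap_offset(vaddr, (uint64_t)page_size));
    close(fd);
    return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A caller would touch the page first (so that it is actually mapped), call `pagemap_read_entry("/proc/self/pagemap", ...)` with the address of interest, and pass the result to `pagemap_entry_to_phys`. Like any point-in-time lookup, the answer can go stale as soon as the page is swapped out or moved.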

Pavan Manjunath
  • Hmm.. It seems that the pagemap interface is not intended to be used inside the kernel. In addition, the kernel docs say: "..pagemap is a new (as of 2.6.25) set of interfaces in the kernel that allow userspace programs to examine the page tables and related information by reading files in /proc...". So it's not suitable for use in the kernel. – Ilya Matveychikov Jun 07 '11 at 06:28
  • @Ilya : Ok. Even if you get a method through which you are able to map virtual-->physical address inside the kernel, what will you do with it? For any read/write you anyways need to work with virtual addresses only, as you cannot bypass the MMU. – Pavan Manjunath Jun 07 '11 at 06:35
0

It sounds like you're looking for virt_to_phys.

Gabe
  • No. `virt_to_phys` is used for kernel-space addresses, not for user-space. – Ilya Matveychikov Jun 06 '11 at 12:51
  • @Ilya: From reading your question and reading the man page, I don't see why it wouldn't work. What does it return for you? – Gabe Jun 06 '11 at 12:57
  • In the kernel sources that describe the `virt_to_phys` function I see that "...The returned physical address is the physical (CPU) mapping for the memory address given. It is only valid to use this function on addresses directly mapped or allocated via kmalloc..." – Ilya Matveychikov Jun 06 '11 at 13:03
  • @Ilya: I guess I don't know enough about memory allocation. I know kmalloc can be used to allocate user memory, but maybe there are other ways that this function won't pick up. It might be worth a try, though. – Gabe Jun 06 '11 at 13:54