6

I'm developing some research-related programs and need to find the PTE of some specific addresses. My development environment is a Juno r1 board (Cortex-A53 and Cortex-A57 CPUs) running an arm64 Linux kernel.

I use a typical page table walk like this:

int find_physical_pte(void *addr)
{
    pgd_t *pgd;
    pud_t *pud;
    pmd_t *pmd;
    pte_t *ptep;
    unsigned long long address;

    address = (unsigned long long)addr;

    /* Entry values are printed through the *_val() accessors, since the
     * entry types may be structs that cannot be passed to printk directly. */
    pgd = pgd_offset(current->mm, address);
    printk(KERN_INFO "\npgd is: %p\n", (void *)pgd);
    printk(KERN_INFO "pgd value: %llx\n", (unsigned long long)pgd_val(*pgd));
    if (pgd_none(*pgd) || pgd_bad(*pgd))
        return -1;

    pud = pud_offset(pgd, address);
    printk(KERN_INFO "\npud is: %p\n", (void *)pud);
    printk(KERN_INFO "pud value: %llx\n", (unsigned long long)pud_val(*pud));
    if (pud_none(*pud) || pud_bad(*pud))
        return -2;

    pmd = pmd_offset(pud, address);
    printk(KERN_INFO "\npmd is: %p\n", (void *)pmd);
    printk(KERN_INFO "pmd value: %llx\n", (unsigned long long)pmd_val(*pmd));
    if (pmd_none(*pmd) || pmd_bad(*pmd))
        return -3;

    ptep = pte_offset_kernel(pmd, address);
    printk(KERN_INFO "\npte is: %p\n", (void *)ptep);
    printk(KERN_INFO "pte value: %llx\n", (unsigned long long)pte_val(*ptep));
    if (pte_none(*ptep))
        return -4;

    return 1;
}

However, when the program looks up the PTE for the address 0xffffffc0008b2000, it always reports an empty pmd.

My guess is that I got the wrong pgd in the first step. According to Tims Notes, current->mm only gives the pgd referenced by TTBR0 (the user-space pgd), whereas the address I checked is a kernel-space address, so I should try to get the pgd referenced by TTBR1 instead.

So my question is: If I want to get the pte of a kernel space address, can I use current->mm to get the pgd?

If I can't, is there anything else I could try instead?

Any suggestion is welcome! Thank you.

Simon

S.Wan
  • Write a routine that uses `TTBCR` and returns either `TTBR0` or `TTBR1` based on the target address. This is better than `current->mm`, but you are dealing with physical ARM PTE values as opposed to the Linux variants. TTBR1 is used for kernel space (in newer Linux versions ~3.xx+) as it never changes on a user space context switch. Note: Linux armv8 uses EL0 for TTBR0 and EL1 for TTBR1. There is also the CP15 query `unsigned int pa; asm("\t mcr p15, 0, %0, c7, c8, 2\n" "\t isb\n" "\t mrc p15, 0, %0, c7, c4, 0\n" : "=r" (pa) : "0" (0xffff0000));` for physical addresses. – artless noise Mar 23 '17 at 13:41
  • If the address is in kernel space, you can use pgd_offset_k(address) – Igor Stoppa Mar 10 '22 at 18:32

2 Answers

7

I finally solved the problem.

My code was actually correct. The only thing I had missed was checking the type of the page table entry at each level.

According to the ARMv8 page table design, ARM uses up to four levels of page tables with the 4 KB granule. The levels (level 0 to 3, as defined in the link) correspond to pgd, pud, pmd, and pte in the Linux code.
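To make the level split concrete, here is a minimal user-space sketch (not kernel code) of how a virtual address is cut into per-level table indices. It assumes the 4 KB granule with 9 index bits per level; the helper name `table_index` and the two constants are made up for illustration. A kernel configured with fewer than four levels folds the unused ones, so not every index is necessarily used.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SHIFT  12  /* 4 KB pages: 12 bits of page offset */
#define BITS_PER_LEVEL 9   /* 512 eight-byte descriptors per 4 KB table */

/* Index into the table at `level` (0..3) for virtual address `va`. */
static unsigned int table_index(uint64_t va, int level)
{
    assert(level >= 0 && level <= 3);
    /* Level 3 indexes with bits [20:12], level 2 with [29:21], and so on. */
    unsigned int shift = GRANULE_SHIFT + BITS_PER_LEVEL * (3 - level);
    return (unsigned int)((va >> shift) & ((1u << BITS_PER_LEVEL) - 1));
}
```

For the address from the question, 0xffffffc0008b2000, this gives the indices 511, 256, 4, and 178 for levels 0 to 3.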

In the ARM architecture, an entry at each level can be either a block entry or a table entry (see the AArch64 Descriptor Format section in the link).

If an address is mapped by a 4 KB page, the walk has to go all the way down to the level 3 entry (pte). For an address that belongs to a larger mapping, however, the translation may already end in a block entry at a higher level (pgd, pud, or pmd, depending on the configuration).

By checking the lowest two bits of the entry at each level, you can tell whether it is a block entry or a table entry. You only keep walking down when it is a table entry; a block entry ends the walk.
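The two-bit check can be sketched in plain user-space C as follows. This is only an illustration of the AArch64 descriptor encoding as I understand it (bit 0 is the valid bit, bit 1 distinguishes table from block at levels 0 to 2 and marks a page at level 3); the enum and the function name `classify_desc` are hypothetical, and the sketch does not check whether a block is architecturally allowed at the given level.

```c
#include <assert.h>
#include <stdint.h>

enum desc_type { DESC_INVALID, DESC_BLOCK, DESC_TABLE, DESC_PAGE };

/* Classify a raw 64-bit descriptor read at translation level 0..3. */
static enum desc_type classify_desc(uint64_t desc, int level)
{
    assert(level >= 0 && level <= 3);
    if (!(desc & 1))
        return DESC_INVALID;                 /* bit 0 clear: no mapping */
    if (level == 3)                          /* 0b11 is a page, 0b01 is reserved */
        return (desc & 2) ? DESC_PAGE : DESC_INVALID;
    /* Levels 0..2: 0b11 points to the next table, 0b01 is a block mapping. */
    return (desc & 2) ? DESC_TABLE : DESC_BLOCK;
}
```

A walk keeps descending while this returns `DESC_TABLE` and stops as soon as it returns `DESC_BLOCK` or `DESC_PAGE`.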

Here is how to improve my code above:

Retrieve the descriptor through the page table pointer (desc = *pgd) and then check its lowest two bits.

If the descriptor is a table entry (0b11), you need to descend to the next level, as my code above does. If you reach a block entry (0b01) at any level, you can stop there and translate the VA to a PA based on the descriptor desc you just read.
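The final VA-to-PA step can be sketched like this in user-space C. It assumes the 4 KB granule and 48-bit output addresses, so a terminal entry covers 1 GB at level 1, 2 MB at level 2, and 4 KB at level 3. The function name `desc_to_pa` and the descriptor values in the usage note are invented for illustration and are not read from real hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Physical address for `va`, given the terminal (block or page) descriptor
 * found at `level`. Assumes 4 KB granule and 48-bit output addresses.
 */
static uint64_t desc_to_pa(uint64_t desc, uint64_t va, int level)
{
    assert(level >= 1 && level <= 3);
    unsigned int shift = 12 + 9 * (3 - level);          /* 30, 21 or 12 */
    uint64_t offset_mask = (1ULL << shift) - 1;         /* bits kept from the VA */
    uint64_t oa_mask = ((1ULL << 48) - 1) & ~offset_mask; /* output address field */
    return (desc & oa_mask) | (va & offset_mask);
}
```

For example, a made-up level 2 block descriptor 0x0060000080800711 combined with the address 0xffffffc0008b2000 gives 0x808b2000: the upper bits come from the descriptor and the low 21 bits from the virtual address.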

int find_physical_pte(void *addr)
{
    pgd_t *pgd;
    pud_t *pud;
    pmd_t *pmd;
    pte_t *ptep;
    unsigned long long address;

    address = (unsigned long long)addr;

    // At each level, the block entry check has to come before the *_bad()
    // test, because on arm64 *_bad() rejects entries that are not table entries.

    pgd = pgd_offset(current->mm, address);
    printk(KERN_INFO "\npgd is: %p\n", (void *)pgd);
    printk(KERN_INFO "pgd value: %llx\n", (unsigned long long)pgd_val(*pgd));
    //check if (*pgd) is a block entry. Exit here if you get the block entry.
    if (pgd_none(*pgd) || pgd_bad(*pgd))
        return -1;

    pud = pud_offset(pgd, address);
    printk(KERN_INFO "\npud is: %p\n", (void *)pud);
    printk(KERN_INFO "pud value: %llx\n", (unsigned long long)pud_val(*pud));
    //check if (*pud) is a block entry. Exit here if you get the block entry.
    if (pud_none(*pud) || pud_bad(*pud))
        return -2;

    pmd = pmd_offset(pud, address);
    printk(KERN_INFO "\npmd is: %p\n", (void *)pmd);
    printk(KERN_INFO "pmd value: %llx\n", (unsigned long long)pmd_val(*pmd));
    //check if (*pmd) is a block entry. Exit here if you get the block entry.
    if (pmd_none(*pmd) || pmd_bad(*pmd))
        return -3;

    ptep = pte_offset_kernel(pmd, address);
    printk(KERN_INFO "\npte is: %p\n", (void *)ptep);
    printk(KERN_INFO "pte value: %llx\n", (unsigned long long)pte_val(*ptep));
    if (pte_none(*ptep))
        return -4;

    return 1;
}
Capybara
S.Wan
  • Would you mind going into further depth about the solution? What do you mean by belongs to pgd or pmd directly? Or would you mind posting the modified code? – Archmede Jun 27 '18 at 20:55
  • 1
    @Archmede Hi, I just updated my answer with a detailed explanation of the page table concept in the ARMv8-A architecture. Note that if you are using x86 or another architecture, the situation may be different. I'm sorry that I cannot share the code: it is combined with several other tasks, so the current version would not help you understand the ARM page table walk. Please read the link I put in the answer first and then read my answer. Hope this helps. – S.Wan Jun 28 '18 at 01:37
  • The code, as it is written, will not compile anymore. It should have one more of those page table layer inspection blocks: nowadays, between pgd and pud, there is p4d. This code is formally needed, even if the real page table does not have the p4d level. The kernel will do its magic to make it disappear, in practice. So the sequence should be: pgd->p4d->pud->pmd->pte – Igor Stoppa Mar 25 '22 at 08:51
1

I think the problem is that you are passing the struct mm_struct * pointer of the current process, while the address you are looking up is from the kernel virtual address space. You need to pass the kernel's initial mm (&init_mm) instead:

pgd = pgd_offset(&init_mm, address);

I think the rest should be fine, but I haven't tested it. You can also look at how it is done in the kernel in the file arch/arm64/mm/dump.c
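Which table applies depends on which half of the address space the address falls in. The rule can be sketched in user-space C as below, assuming the virtual address size is known and passed in as a hypothetical parameter `va_bits` (the function name `uses_ttbr1` is also made up): an address whose bits above `va_bits` are all ones is translated through TTBR1, which is where the kernel's tables live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if `va` lies in the upper (TTBR1, kernel) range for the given
 * virtual address size. Addresses whose upper bits are neither all ones
 * nor all zeros belong to no range; this sketch reports false for them.
 */
static bool uses_ttbr1(uint64_t va, unsigned int va_bits)
{
    assert(va_bits >= 16 && va_bits <= 48);
    uint64_t upper = ~0ULL << va_bits;   /* mask of bits [63:va_bits] */
    return (va & upper) == upper;
}
```

With `va_bits` set to 39, the address from the question (0xffffffc0008b2000) is reported as a TTBR1 address, which matches walking from &init_mm rather than from the current process.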

Jay Medina
  • Hi Jay, actually from kernel perspective current->mm is the same as init_mm I think. Anyway, thank you for your help. – S.Wan Jul 05 '17 at 21:11