I know that the kernel has a lowmem region where memory is mapped 1:1 (logical mapping) to physical memory. I am looking for a way to loop through all of the kernel's "struct page" structures and read the page that each one describes.

I haven't found anything other than a brute-force approach, which crashes once it walks past the lowest or highest valid page.

Code:

#define PDEBUG(format, args...) printk(KERN_DEBUG "MyKernelModule: " format, ##args)

for(void* ppointer = 0xffff88803e400840;ppointer > 0;ppointer-=64){
    PDEBUG("Page address of %px is %px (zone %px)\n",ppointer,page_address(ppointer),page_zone(ppointer));
}
/* crashes before reaching here */
for(void* ppointer = 0xffff88803e400840;ppointer < 0xffffffffffffffffff;ppointer+=64){
    PDEBUG("Page address of %px is %px (zone %px)\n",ppointer,page_address(ppointer),page_zone(ppointer));
}

Crash:

[30496.218741] BUG: unable to handle page fault for address: ffff887fffffffc0
[30496.221713] #PF: supervisor read access in kernel mode
[30496.223029] #PF: error_code(0x0000) - not-present page
[30496.224422] PGD 0 P4D 0 
[30496.225248] Oops: 0000 [#1] SMP NOPTI
[30496.226244] CPU: 0 PID: 800 Comm: a.out Not tainted 5.4.17+ #109
[30496.227753] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[30496.230141] RIP: 0010:playground_ioctl.cold+0xa3/0xe0
[30496.231434] Code: 45 31 e4 e9 b7 f4 ff ff e8 2a 97 58 ff 48 8b 74 24 10 45 31 e4 ba 0b 00 00 00 48 c1 e2 29 b8 11 ff ff 01 48 c7 c7 48 aa 8d 82 <48> 8b 0e 48 01 f2 48 c1 e0 27 48 c1 fa 0c
[30496.236269] RSP: 0018:ffffc90000633df8 EFLAGS: 00010216
[30496.237633] RAX: 0000000001ffff11 RBX: 0000000000000014 RCX: ffffffff81bf8772
[30496.239460] RDX: 0000160000000000 RSI: ffff887fffffffc0 RDI: ffffffff828daa48
[30496.241326] RBP: 0000000000000000 R08: ffff88803a8c0e00 R09: 0000000000000000
[30496.243150] R10: ffffffff813b5b40 R11: ffff88803e407d00 R12: 0000000000000000
[30496.245162] R13: 0000000000000014 R14: ffff88803cca3100 R15: 0000000000002000
[30496.247051] FS:  00007ffff7fef700(0000) GS:ffff88803ea00000(0000) knlGS:0000000000000000
[30496.249286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30496.250740] CR2: ffff887fffffffc0 CR3: 000000003db46000 CR4: 00000000000406f0
[30496.252690] Call Trace:
[30496.253418]  ? get_all_pages+0x60/0x60
[30496.254409]  do_vfs_ioctl+0x788/0x9e0
[30496.255377]  ksys_ioctl+0x9b/0xc0
[30496.256372]  __x64_sys_ioctl+0x1a/0x20
[30496.257324]  do_syscall_64+0x75/0x1f0
[30496.258249]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[30496.259512] RIP: 0033:0x7ffff7918017
[30496.260453] Code: 00 00 00 48 8b 05 81 7e 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 78
[30496.265208] RSP: 002b:00007fffffffebd8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
[30496.267082] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7918017
[30496.268973] RDX: 00007fffffffebf0 RSI: 0000000000000014 RDI: 0000000000000003
[30496.270904] RBP: 00007fffffffec10 R08: 0000555555554980 R09: 00007ffff7de8ba0
[30496.272794] R10: 0000000000000016 R11: 0000000000000202 R12: 0000555555554740
[30496.274636] R13: 00007fffffffed10 R14: 0000000000000000 R15: 0000000000000000
[30496.276542] Modules linked in:
[30496.277345] CR2: ffff887fffffffc0
[30496.278204] ---[ end trace 4e40cd27e9ea1f35 ]---
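
The faulting address in the log can be related to the start address of the first loop with a little arithmetic. The sketch below is a userspace illustration, not kernel code. It assumes the direct-map base is 0xffff888000000000, which is the non-KASLR value listed for 4-level paging in the x86_64 mm.txt memory layout document; the macro and function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Assumption: direct-map base (page_offset_base) for 4-level paging
 * without KASLR, as listed in Documentation/x86/x86_64/mm.txt. */
#define DIRECT_MAP_BASE 0xffff888000000000ULL

/* Stride used by both loops in the code above. */
#define STEP 64ULL

/* Number of STEP-sized decrements from `start` that still land at or
 * above the assumed direct-map base. */
uint64_t steps_until_base(uint64_t start)
{
    return (start - DIRECT_MAP_BASE) / STEP;
}

/* First address the downward loop dereferences that lies below the
 * assumed direct-map base. */
uint64_t first_address_below_base(uint64_t start)
{
    return start - (steps_until_base(start) + 1) * STEP;
}
```

Calling first_address_below_base(0xffff88803e400840ULL) yields 0xffff887fffffffc0, the same value as CR2 in the oops. Under the stated assumption, the first loop faults on the first 64-byte step below the start of the direct mapping.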

Is there a better way to simply enumerate all "struct page" without crashing?

anon
  • Are you doing this in a kernel module or what? Sharing the actual source code of what you're trying wouldn't hurt. – Marco Bonelli Sep 20 '21 at 16:22
  • Does the crash involve a kernel error message that you can share in the post? – wxz Sep 20 '21 at 16:23
  • @MarcoBonelli yes I am writing a kernel module to loop through all the kernel pages (be they free or allocated). Code shared in the latest edit. – anon Sep 20 '21 at 16:51
  • @wxz added the crash too – anon Sep 20 '21 at 16:51
  • "supervisor read access in kernel mode" - kernel can't just read all user space memory without explicitly turning off a safety flag. Read [this](https://en.wikipedia.org/wiki/Supervisor_Mode_Access_Prevention) and [this](https://lwn.net/Articles/517475/). You can try using STAC CLAC to turn off that safety bit just while you do the reading, but I expect you'll still run into new issues. – wxz Sep 20 '21 at 16:56
  • @wxz I don't think that is the major problem. The problem is that I don't know what the limit is. Where do the kernel pages end? How much can I read? Even if I can disable safety flags I will hit a wall at some point. – anon Sep 20 '21 at 17:02
  • @anon have you read [this post](https://stackoverflow.com/questions/68091247/confusion-about-different-meanings-of-highmem-in-linux-kernel) about High vs. Low Mem? The top answer includes this [memory layout document](https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt) for Linux. – wxz Sep 20 '21 at 17:10
  • You can't do arithmetic on a void pointer. – stark Sep 22 '21 at 11:24

0 Answers