I am monitoring my test program's memory accesses using Intel PEBS (Precise Event-Based Sampling). For various reasons, I don't want to use PEBS through the existing perf infrastructure, so I have written my own kernel module for Intel PEBS, and I handle all the NMI events inside that module. I can successfully log memory accesses when the test program runs natively on the host system; everything works fine there.

But when I run the test program inside a VM (running on QEMU/KVM), the guest kernel crashes with the following error. Only the guest kernel crashes; the host kernel remains stable:

#PF: supervisor read access in user mode.

I have tried disabling and enabling SMEP, SMAP, LAPIC, and other options in both the guest and host kernels. The crash still happens.

I am using an Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz. The host kernel is Linux 5.4.0 (the same version as the guest kernel).

Logs from Guest Kernel:

BUG: unable to handle page fault for address: ffff9c122e8a1038
[   76.011730] #PF: supervisor read access in user mode
[   76.012190] #PF: error_code(0x0000) - not-present page
[   76.012603] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000034000 (limit=0x7f)
[   76.013206] LDTR: NULL
[   76.013398] TR: 0x40 -- base=0xfffffe0000036000 limit=0x206f
[   76.013852] PGD 0 P4D 0
[   76.014041] Oops: 0000 [#1] SMP
[   76.014274] CPU: 1 PID: 953 Comm: one_page Not tainted 5.4.0-rc7 #4
[   76.014738] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[   76.015394] RIP: 0033:0x55bd4855cd56
[   76.015661] Code: 85 5c ff ff ff 00 00 00 00 eb 26 48 8b 05 e2 12 20 00 8b 95 5c ff ff ff 48 63 d2 48 c1 e2 02 48 01 d0 8b 00 88 85 5b ff ff ff <83> 85 5c ff ff ff 01 81 bd 5c ff ff ff ff 0f 00 00 7e ce 8b 05 a1
[   76.017081] RSP: 002b:00007fff48cf8810 EFLAGS: 00010202
[   76.017477] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   76.018038] RDX: 0000000000000b60 RSI: 00007fff48cf86d0 RDI: 0000000000000002
[   76.018556] RBP: 00007fff48cf88d0 R08: 0000000000000000 R09: 0000000000000000
[   76.019077] R10: 0000000000000008 R11: 0000000000000246 R12: 000055bd4855c8c0
[   76.019599] R13: 00007fff48cf89b0 R14: 0000000000000000 R15: 0000000000000000
[   76.020116] FS:  00007f2b93ce34c0 GS:  0000000000000000
[   76.020533] Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd dax_pmem_compat device_dax nd_pmem glue_helper dax_pmem_core nd_btt ppdev joydev parport_pc input_leds nfit intel_rapl_perf mac_hid qemu_fw_cfg serio_raw sch_fq_codel sunrpc lp parport ip_tables x_tables psmouse virtio_blk virtio_net net_failover failover i2c_piix4 pata_acpi floppy
[   76.023751] CR2: ffff9c122e8a1038
[   76.023997] ---[ end trace 10450321a820a090 ]---
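
For reference when reading the two `#PF:` lines in the log above, here is a minimal sketch of how the x86 page-fault error code breaks down into those messages. It assumes the bit layout documented in the Intel SDM and mirrored by Linux's `X86_PF_*` flags; the `PF_*` constants and `pf_*` helper names are hypothetical and written only for illustration, not taken from the kernel. Note that the trailing "in user mode" part of the message is not encoded in the error code: the kernel derives it from the saved code segment (`RIP: 0033:` in the log).

```c
#include <stdio.h>

/* x86 page-fault error code bits (assumed layout, per the Intel SDM). */
enum {
    PF_PROT  = 1u << 0, /* 0: page not present, 1: protection violation */
    PF_WRITE = 1u << 1, /* 0: read access,      1: write access         */
    PF_USER  = 1u << 2, /* 0: supervisor access, 1: user access         */
    PF_RSVD  = 1u << 3, /* reserved bit set in a paging structure       */
    PF_INSTR = 1u << 4, /* fault was an instruction fetch               */
    PF_PK    = 1u << 5, /* protection-key violation                     */
};

/* Hypothetical helper: which privilege level the access was made with. */
static const char *pf_privilege(unsigned long ec)
{
    return (ec & PF_USER) ? "user" : "supervisor";
}

/* Hypothetical helper: what kind of access faulted. */
static const char *pf_access(unsigned long ec)
{
    if (ec & PF_INSTR)
        return "instruction fetch";
    if (ec & PF_WRITE)
        return "write access";
    return "read access";
}

/* Hypothetical helper: why the access faulted. */
static const char *pf_reason(unsigned long ec)
{
    if (!(ec & PF_PROT))
        return "not-present page";
    if (ec & PF_RSVD)
        return "reserved bit violation";
    if (ec & PF_PK)
        return "protection keys violation";
    return "permissions violation";
}

/* Formats "<privilege> <access>, <reason>" into buf; returns snprintf's result. */
static int pf_describe(unsigned long ec, char *buf, size_t len)
{
    return snprintf(buf, len, "%s %s, %s",
                    pf_privilege(ec), pf_access(ec), pf_reason(ec));
}
```

Calling `pf_describe(0x0000, buf, sizeof buf)` yields the combination reported in the log: a supervisor read access to a not-present page.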
  • Perhaps better to use `perf` for that. – 0andriy Mar 09 '20 at 09:37
  • That was our first choice - to use perf. But the overhead we measured with the perf infrastructure is too high for our requirements. – Ganapathy Raman Mar 09 '20 at 12:24
  • Can you provide a sample of the kernel module you wrote, and add it to the question? – Arnabjyoti Kalita Mar 09 '20 at 18:18
  • There are some existing modules for using perf counters from user space, including the lightweight `libpfc`. Links are in [What will be the exact code to get count of last level cache misses on Intel Kaby Lake architecture](https://stackoverflow.com/a/45133491). I'd suggest giving those a try, or at least a look; they might do what you need without having to debug your own. – Peter Cordes Mar 10 '20 at 04:11

1 Answer

I get a similar message running an Ubuntu 19.10/x86_64 guest on QEMU/TCG, where the host is also Ubuntu 19.10/x86_64. The guest application is dotnet. I am unable to hit the bug with applications from the Linux Test Project or with my own directed testing. Please see https://bugs.launchpad.net/qemu/+bug/1866892 for details. The app runs fine without QEMU in the way.