I am monitoring my test program's memory accesses using Intel PEBS (Precise Event-Based Sampling). For various reasons I don't want to go through the existing perf infrastructure, so I have written my own kernel module for Intel PEBS, and I handle all the NMI events inside that module. When the test program runs natively on the host system, I can log its memory accesses successfully and everything works fine.
But when I run the test program inside a VM (running on QEMU/KVM), the guest kernel crashes with the following error. Only the guest kernel crashes; the host kernel remains stable:
#PF: supervisor read access in user mode.
I have tried enabling and disabling SMEP, SMAP, the LAPIC and other options in both the guest and the host kernel, but the crash still happens.
I am using an Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz. The host kernel is Linux 5.4.0, and the guest runs the same kernel version.
Logs from Guest Kernel:
BUG: unable to handle page fault for address: ffff9c122e8a1038
[ 76.011730] #PF: supervisor read access in user mode
[ 76.012190] #PF: error_code(0x0000) - not-present page
[ 76.012603] IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000034000 (limit=0x7f)
[ 76.013206] LDTR: NULL
[ 76.013398] TR: 0x40 -- base=0xfffffe0000036000 limit=0x206f
[ 76.013852] PGD 0 P4D 0
[ 76.014041] Oops: 0000 [#1] SMP
[ 76.014274] CPU: 1 PID: 953 Comm: one_page Not tainted 5.4.0-rc7 #4
[ 76.014738] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 76.015394] RIP: 0033:0x55bd4855cd56
[ 76.015661] Code: 85 5c ff ff ff 00 00 00 00 eb 26 48 8b 05 e2 12 20 00 8b 95 5c ff ff ff 48 63 d2 48 c1 e2 02 48 01 d0 8b 00 88 85 5b ff ff ff <83> 85 5c ff ff ff 01 81 bd 5c ff ff ff ff 0f 00 00 7e ce 8b 05 a1
[ 76.017081] RSP: 002b:00007fff48cf8810 EFLAGS: 00010202
[ 76.017477] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 76.018038] RDX: 0000000000000b60 RSI: 00007fff48cf86d0 RDI: 0000000000000002
[ 76.018556] RBP: 00007fff48cf88d0 R08: 0000000000000000 R09: 0000000000000000
[ 76.019077] R10: 0000000000000008 R11: 0000000000000246 R12: 000055bd4855c8c0
[ 76.019599] R13: 00007fff48cf89b0 R14: 0000000000000000 R15: 0000000000000000
[ 76.020116] FS: 00007f2b93ce34c0 GS: 0000000000000000
[ 76.020533] Modules linked in: nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd dax_pmem_compat device_dax nd_pmem glue_helper dax_pmem_core nd_btt ppdev joydev parport_pc input_leds nfit intel_rapl_perf mac_hid qemu_fw_cfg serio_raw sch_fq_codel sunrpc lp parport ip_tables x_tables psmouse virtio_blk virtio_net net_failover failover i2c_piix4 pata_acpi floppy
[ 76.023751] CR2: ffff9c122e8a1038
[ 76.023997] ---[ end trace 10450321a820a090 ]---
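For reference, this is how I read the two `#PF:` lines in the log above. The sketch below is a small user-space helper, not code from my module. It assumes the standard x86 page-fault error-code bit layout and that "user mode" in the message comes from the privilege level in the saved CS (0x33 in the `RIP:` line). The names `pf_describe` and `pf_is_supervisor_access_from_user_mode` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error-code bits (assumed standard layout). */
#define PF_PROT  (1u << 0)  /* 0: not-present page, 1: protection violation */
#define PF_WRITE (1u << 1)  /* 0: read access, 1: write access */
#define PF_USER  (1u << 2)  /* 0: supervisor-mode access, 1: user-mode access */
#define PF_RSVD  (1u << 3)  /* reserved bit set in a paging-structure entry */
#define PF_INSTR (1u << 4)  /* fault was an instruction fetch */

/* The interrupted context is user mode when the low two bits of CS are 3. */
static bool cs_is_user_mode(uint16_t cs)
{
    return (cs & 3) == 3;
}

/* Hypothetical helper: true when the faulting access was a supervisor-mode
 * access although the interrupted code was running in user mode. */
static bool pf_is_supervisor_access_from_user_mode(uint64_t error_code, uint16_t cs)
{
    return !(error_code & PF_USER) && cs_is_user_mode(cs);
}

/* Hypothetical helper: format the fault roughly like the two #PF lines. */
static int pf_describe(uint64_t error_code, uint16_t cs, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    const char *who  = (error_code & PF_USER) ? "user" : "supervisor";
    const char *what = (error_code & PF_INSTR) ? "instruction fetch"
                     : (error_code & PF_WRITE) ? "write access"
                     : "read access";
    const char *mode = cs_is_user_mode(cs) ? "user" : "kernel";
    const char *why  = !(error_code & PF_PROT) ? "not-present page"
                     : (error_code & PF_RSVD)  ? "reserved bit violation"
                     : "permissions violation";

    return snprintf(buf, len, "%s %s in %s mode, %s", who, what, mode, why);
}
```

With `error_code = 0x0000` and `cs = 0x33` from my log, this gives a supervisor read of a not-present page while the CPU was executing user code. The faulting address in CR2 (ffff9c122e8a1038) is a kernel-half address, and the instruction at RIP does not reference it. To me that looks like an access made by the CPU itself rather than by my test program, but whether it is related to the PEBS buffer is only my guess.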