
I would like to use gdb's process recording, but glibc's ld.so uses xsave instructions, so I get the error "Process record does not support instruction 0xfae64 at address 0x7ffff7fe883c."

I was able to fix a similar error with binary patching thanks to a Stack Overflow answer. Compiling glibc with debug symbols failed after running for half an hour, so I'd be glad if there's a quicker solution. I got a compiled version from here, but it looks like no earlier versions are offered (I'm currently using glibc 2.28.r502.g065957a3704-1 and gdb 8.2.1). How can I make gdb recording work?

Johannes Riecken

1 Answer


The problem is that the x86 emulator built into gdb doesn't understand many newer instructions. The only real fix is to wait for a gdb release that adds support for the relevant instructions. In the meantime, this thread suggests a number of workarounds:

  • load the binary with the environment variable LD_BIND_NOW set to 1, so that symbols are resolved at startup and xsave is not triggered in the dynamic linker while recording
  • alternatively, link the binary you want to debug statically
  • alternatively, link with -z now, e.g. by passing -Wl,-z,now to the C compiler
fuz
  • Static linking is not required; `-Wl,-z,now` for the main program and all libraries will suffice. – Florian Weimer Jan 12 '19 at 22:46
  • Could you explain why that works? Is it that the default function call resolution when symbols are first referenced uses xsave when doing a context switch between the binary and the dynamic linker, but if symbols are resolved at startup, then context will never be switched between binary and dynamic linker? – Johannes Riecken Jan 13 '19 at 07:57
  • @rubystallion Sorry, no idea. – fuz Jan 13 '19 at 13:15
  • @rubystallion: Florian's deleted answer mentions that the lazy dynamic linking trampolines use `xsave`. I think this might be to avoid SSE/AVX transition stalls, either in the dynamic-linker code or, worse, when returning to the main program. It's possible to create a situation where SSE instructions have false dependencies forever: [Why is this SSE code 6 times slower without VZEROUPPER on Skylake?](https://stackoverflow.com/q/41303780). I'm not sure why they can't just use `vzeroupper`, though, because the SysV ABI says a function call clobbers the vector regs. – Peter Cordes Jan 13 '19 at 16:12