Context: I am running ThreadSanitizer on my program, and it reports a data race. I'm not 100% sure why (maybe there are too many memory accesses), but ThreadSanitizer doesn't give the exact stack trace for the offending access. All it shows is:

Previous read of size 4 at 0x7b1800004140 by thread T36:
    [failed to restore the stack]

I have tried setting `history_size=7` when running ThreadSanitizer, as the docs suggest, but I still cannot get the stack trace.

I know that addresses can be randomized between runs for security reasons, so I compile my program with -fno-stack-protector, which makes the reported address the same across multiple runs.

I want to know how I can find out which variable(s) reside at this address, so that I can see directly where I am accessing it illegally.

I use Bazel as my build system, which indirectly relies on clang v13.0.0.

Tinyden
  • Well, that would need having a linker map of your symbols maintained and analyzed. – πάντα ῥεῖ Apr 24 '22 at 06:06
  • Did you compile with `-g`, and link without `-s`? – Nate Eldredge Apr 24 '22 at 06:09
  • @NateEldredge I do compile with `-g` and do not link with `-s`, so the symbol table should be kept. I confirmed this with the `file` command, which shows `with debug_info, not stripped`. – Tinyden Apr 24 '22 at 06:14
  • What about optimizations? Perhaps the variable does not exist anymore? Try `-g3 -O0` – Quimby Apr 24 '22 at 06:20
  • @Quimby I originally tried `-g -O1`; I just tried `-g3 -O0` and still cannot get the stack trace :( – Tinyden Apr 24 '22 at 06:36
  • Perhaps the stack is corrupted? If you make the code single-threaded, do you get the same problem? Two threads? Three? How many threads are needed to reliably replicate the problem? And have you tried the address and UB sanitizers? Or other tools such as Valgrind? – Some programmer dude Apr 24 '22 at 06:45
  • @Someprogrammerdude About the stack smashing: I suspected that, but I'm not sure how to verify it. ASan and UBSan don't report any errors. – Tinyden Apr 24 '22 at 06:49
  • Dumb idea, but if you know the address does not change, can you set a GDB breakpoint on every write, start the program, and see where the breakpoint hits? It will hit before the race condition, but at least GDB might show the variable. – Quimby Apr 24 '22 at 06:59
  • Depending on the size of the code, another possibility would be to `printf` the address of all variables manually from within the code, and hope that the problematic one shown by the sanitizer is among them. – Sedenion Apr 24 '22 at 07:03
  • @Sedenion It's production code at my company, so it's hard to print the address of every variable :( I tried printing several of them, but unfortunately none matched the target address. – Tinyden Apr 24 '22 at 07:05
  • @Quimby I just tried `awatch *(void*)0x7b1800004140`, but the program still reports the data race, showing `Previous read of size 4 at 0x7b1800004140 by thread T36`. – Tinyden Apr 24 '22 at 07:44
  • @Dentiny Hmm, that's unfortunate. If you compile the code without sanitizers, the addresses surely won't match. Maybe you can [suppress](https://github.com/google/sanitizers/wiki/ThreadSanitizerSuppressions) all races with `race::*` and run with GDB. Hopefully this time the sanitizer will ignore the race condition, let the access actually execute, and break into GDB. – Quimby Apr 24 '22 at 11:11
  • @Quimby I do compile my program with thread-sanitizer enabled. – Tinyden Apr 24 '22 at 18:39
  • Some additional context: I printed `/proc/self/maps` at the end of my program, and it shows that the accessed address is not mapped to any segment (stack, heap, or shared object). – Tinyden May 05 '22 at 05:11
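
The `/proc/self/maps` check from the last comment can be sketched roughly as below. This is only an illustration, not code from the program in question: `find_mapping` and `find_mapping_in_self` are hypothetical helper names, and the sketch assumes Linux and the usual `start-end perms offset dev inode path` line layout. A mapping that has already been released by the time the check runs will not show up, so it may be worth calling it close to where the race is reported rather than at exit.

```cpp
#include <cstdint>
#include <cstdlib>
#include <fstream>
#include <istream>
#include <optional>
#include <sstream>
#include <string>

// Returns the first line of a /proc/<pid>/maps-style stream whose
// [start, end) address range contains addr, or std::nullopt if none does.
// (Hypothetical helper, written only for this illustration.)
std::optional<std::string> find_mapping(std::istream& maps, std::uintptr_t addr) {
    std::string line;
    while (std::getline(maps, line)) {
        std::istringstream fields(line);
        std::string range;
        if (!(fields >> range)) continue;         // skip blank lines
        const auto dash = range.find('-');
        if (dash == std::string::npos) continue;  // first field is not "start-end"
        // Both bounds are hexadecimal without a 0x prefix.
        const auto start = static_cast<std::uintptr_t>(
            std::strtoull(range.substr(0, dash).c_str(), nullptr, 16));
        const auto end = static_cast<std::uintptr_t>(
            std::strtoull(range.substr(dash + 1).c_str(), nullptr, 16));
        if (addr >= start && addr < end) return line;
    }
    return std::nullopt;
}

// Convenience wrapper that inspects the current process (Linux only).
std::optional<std::string> find_mapping_in_self(std::uintptr_t addr) {
    std::ifstream maps("/proc/self/maps");
    return find_mapping(maps, addr);
}
```

Calling `find_mapping_in_self(0x7b1800004140)` and printing the returned line (if any) would show which region, if any, covers the reported address at that moment.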

0 Answers