Given an object file that exposes a symbol, how can I determine if the symbol is also used internally?

My objective is dead function detection. I already have the ability (via readelf) to find if it's used from another object file but this fails when it is only used internally.

If it matters, I'm working with C++.

BCS
  • Inlining and templates might make this difficult... – Bill Lynch Mar 26 '15 at 16:37
  • Templates mostly aren't an issue b/c they are generally only instantiated on use (the exception being explicit specializations) so you only see them in the cases I'm *not* looking for. Inlining is also solvable given that some debuggers are able to report the inlined function despite there being no actual stack frame. – BCS Mar 26 '15 at 16:41
  • " dead function detection" -- for the purpose of warning? Or elision? Doesn't LTO effectively do the latter? – Brian Cain Mar 26 '15 at 17:36
  • For detection, which then feeds into deleting the source code. – BCS Mar 26 '15 at 17:41

1 Answer

How intra-compilation-unit calls are treated depends on your compiler, but there are a few recommendations.

First, if you're optimizing, the compiler may inline your functions even if they are not marked inline, and it will certainly try to do so if a function is marked with the always_inline attribute. As a result, there will be no evidence that a function f calls a function g even if it actually does. Note that if g is itself an externally reachable function, the compiler may generate its code twice (or more): once under its own name for calls from the outside, and again inlined into the object code of f (and of other calling functions).

So, avoid optimization and somehow suppress always_inline. You can even explicitly specify -fno-inline to prevent inlining.

Second, your target architecture may have relative call and branch instructions. The compiler may take advantage of that if your f and g are placed into a code section common to them. It is the default for non-inline functions. In such a case, the compiler knows the offset between a place of call and the beginning of the callee at compilation time and can generate a relative call or jump instruction; no further relocation is required. Some compilers may emit a 'no-op' relocation, but some won't. No relocation means that the symbol is not referenced.

So, use -ffunction-sections (and -fdata-sections for data). Every function is then placed into its own section, and the compiler has no choice but to generate a relocation for the linker to fix up (thus making the callee's symbol referenced).

Note that if you use -ffunction-sections and then specify --gc-sections when calling ld, the linker will discard all unreferenced sections. If you then also pass -M to ld, you will get the resulting link map. Discarded functions will not show up in the map.
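The recipe above can be tried on a small translation unit. The following is only a sketch: the file name demo.cpp and the functions helper, entry, and orphan are hypothetical, and the commands in the comments assume GCC and binutils on an ELF target. The exact relocation types and section names depend on the platform.

```cpp
// demo.cpp: hypothetical file and function names, for illustration only.
//
// Assumed GCC/binutils workflow on an ELF target:
//   g++ -O0 -fno-inline -ffunction-sections -c demo.cpp -o demo.o
//   readelf -rW demo.o
//
// With every function in its own section, the call from entry() to helper()
// should show up as a relocation against _Z6helperi in the relocation
// section that belongs to entry(). If no code relocation names _Z6orphani,
// orphan() is a dead-code candidate as far as this translation unit goes.

// Exported and also used internally by entry().
int helper(int x) { return x * 2; }

// Exported; its call to helper() is the internal use we want to detect.
int entry(int x) { return helper(x) + 1; }

// Exported but never called within this translation unit.
int orphan(int x) { return x - 1; }
```

Combining this per-object check with the existing cross-object readelf check should cover both internal and external uses, subject to the caveats about inlining discussed earlier.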

As a side note, remember that there are also cases where static analysis is incapable of detecting that a function can never be called. For example, in a well-written C++ program __cxa_pure_virtual will never be called, but there will nevertheless be references to it in the virtual function tables of all abstract classes. Moreover, overrides of a plain virtual function will be referenced through a virtual function table and linked even if there is not a single call of that virtual function in the whole program. Symbol-based analysis is unable to detect these cases.
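To illustrate the virtual function caveat, here is a minimal sketch with hypothetical class names (Shape, Triangle) and a hypothetical factory function. Nothing in this translation unit calls sides(), yet on typical Itanium-ABI toolchains the vtable of Triangle still carries a reference to Triangle::sides(), so a relocation-based check will report it as used.

```cpp
#include <memory>

// Hypothetical class names, for illustration only.
struct Shape {
    virtual ~Shape() = default;
    // Pure virtual: the matching slot in Shape's vtable typically refers
    // to __cxa_pure_virtual.
    virtual int sides() const = 0;
};

struct Triangle : Shape {
    // Referenced from Triangle's vtable whether or not any code calls it.
    int sides() const override { return 3; }
};

// Never invokes sides() itself, but constructing a Triangle requires the
// vtable, and the vtable in turn references Triangle::sides().
std::unique_ptr<Shape> make_triangle() {
    return std::unique_ptr<Shape>(new Triangle);
}
```

Functions flagged as "used" only through a vtable entry would need a separate, manual review.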

ach
  • I'm willing and able to do a build with whatever flags are needed. I'm also willing to have some false positive if removing them fails the build. – BCS Mar 26 '15 at 17:44
  • If by 'false positive' you mean a function that is used but not identified as such, then in C++ there is no reliable method to remove it from the source code so that the build would then fail. Suppose you have a `void f(int)` and a `void f(long)`. If you falsely identify `void f(long)` as unused and remove it, then it is quite probable that the compiler won't complain because in all places where it was invoked, overload resolution will now silently choose the other overload. – ach Mar 27 '15 at 16:54
  • On the other hand, because C has no function overloading or templates, it is quite safe to do so. – ach Mar 27 '15 at 16:56
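
The overload hazard raised in the comments can be reproduced with a small sketch. The macro REMOVE_LONG_OVERLOAD and the function call_site are hypothetical, and the functions return a string (instead of void, as in the comment) only so that the selected overload is observable.

```cpp
#include <string>

// Hypothetical overload set, for illustration only.
std::string f(int) { return "int"; }

// Hypothetical macro that simulates deleting the supposedly unused overload.
#ifndef REMOVE_LONG_OVERLOAD
std::string f(long) { return "long"; }
#endif

// With both overloads present, this selects f(long). With f(long) removed,
// the argument is converted to int and f(int) is selected instead, so the
// build still succeeds.
std::string call_site() { return f(42L); }
```

Building once normally and once with -DREMOVE_LONG_OVERLOAD should compile cleanly in both cases while changing what call_site() returns, which is why a successful build does not prove the removal was safe.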