There are lots of questions on SO about name mangling and demangling under the ABI used by gcc and clang. Many of the demangling questions try to recover the semantics embedded in the mangling grammar, as in "Extracting class from demangled symbol". The solutions proposed there rely on heuristics that generalize poorly.

What I'm interested in is a robust approach to demangling that:

  1. Precisely tokenizes the mangled name;
  2. Correctly associates C++ semantics with each token;
  3. Does not depend on knowing anything about the library from which the mangled name is drawn; and
  4. Renders the resulting AST for easy consumption by other utilities (think YAML, JSON).

We know #1-#3 can be accomplished because LLVM implements them. Anybody got a line on a complete toolchain that also covers #4?
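To make the ask concrete, here is a minimal Python sketch of what #1, #2 and #4 could look like for a tiny subset of the Itanium ABI grammar: plain and nested source names followed by a handful of builtin parameter types. It is a toy, not LLVM's demangler. The function names, the node layout and the `kind` labels are all made up for illustration, and substitutions, templates, qualifiers and everything else in the real grammar are deliberately left out.

```python
import json

# A small subset of the Itanium ABI <builtin-type> codes (assumed sufficient for the demo).
BUILTIN_TYPES = {
    "v": "void", "b": "bool", "c": "char", "i": "int", "j": "unsigned int",
    "l": "long", "m": "unsigned long", "f": "float", "d": "double",
}

def parse_source_name(s, pos):
    """Parse <source-name> ::= <length> <identifier>; return (node, next_pos)."""
    start = pos
    while pos < len(s) and s[pos] in "0123456789":
        pos += 1
    if pos == start:
        raise ValueError(f"expected a length at offset {start}")
    length = int(s[start:pos])
    ident = s[pos:pos + length]
    if len(ident) != length:
        raise ValueError("identifier is truncated")
    return {"kind": "source-name", "text": ident}, pos + length

def parse_mangled(s):
    """Build a token tree for the toy subset; the node shapes are hypothetical."""
    if not s.startswith("_Z"):
        raise ValueError("not an Itanium-mangled name")
    pos = 2
    if s[pos:pos + 1] == "N":
        # <nested-name> ::= N <source-name>+ E  (qualifiers and templates omitted)
        pos += 1
        components = []
        while pos < len(s) and s[pos] != "E":
            node, pos = parse_source_name(s, pos)
            components.append(node)
        if pos >= len(s):
            raise ValueError("unterminated nested name")
        pos += 1  # consume the closing 'E'
        name = {"kind": "nested-name", "components": components}
    else:
        name, pos = parse_source_name(s, pos)
    # Whatever remains is read as a list of builtin parameter types.
    params = []
    while pos < len(s):
        code = s[pos]
        if code not in BUILTIN_TYPES:
            raise ValueError(f"unsupported type code {code!r} at offset {pos}")
        params.append({"kind": "builtin-type", "code": code,
                       "text": BUILTIN_TYPES[code]})
        pos += 1
    return {"kind": "encoding", "name": name, "parameters": params}

if __name__ == "__main__":
    # namespace foo { void bar(); } mangles to _ZN3foo3barEv under the Itanium ABI.
    print(json.dumps(parse_mangled("_ZN3foo3barEv"), indent=2))
```

The point of the sketch is only the shape of the output: every node carries its grammar role alongside its text, so a downstream tool can ask "which part is the enclosing scope?" without string heuristics. A real solution would need the full grammar, which is why I'm looking for something that reuses an existing parser.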

BrianTheLion
  • Demangling will only give you names, not semantics. You're light-years away from an AST. – Mat Jan 17 '19 at 20:13
  • 1-3 might make sense. But an AST is a logical representation of what the _code_ does. I don't see how it would have anything to do with the name mangling. – Jeffrey Jan 17 '19 at 20:14
  • @Mat see the link to the LLVM demangling code. Unless I'm reading it wrong, it absolutely implements an AST -- a simple token tree with associated semantics. – BrianTheLion Jan 17 '19 at 20:21
  • @Jeffrey also see the link to the LLVM demangling code. – BrianTheLion Jan 17 '19 at 20:23
  • So you want to know which parts of the demangled symbol refer to namespaces, identifiers, types/template names etc.? That's much more than demangling, and I've never heard of that referred to as an AST (but I'm not an expert in these things). What's wrong with the LLVM implementation you've found then? – Mat Jan 17 '19 at 20:30
  • @BrianTheLion I don't claim to understand the whole page context, but I can understand "a fixed AST for parsing mangled names". What I don't think makes sense is "generating an AST for a given mangled name". Do you have anything that supports the second option? – Jeffrey Jan 17 '19 at 20:32
  • @Mat correct. The LLVM implementation above builds an AST as an intermediate data structure in getting to the demangled name, but it doesn't expose that data structure for consumption by other code AFAICT. – BrianTheLion Jan 17 '19 at 20:38

1 Answer

Here's a list of partial solutions in case anyone stumbles on this thread:

BrianTheLion