"Blind" demangling, with precise semantics and a cherry on top

Question

There are lots of questions on SO about name mangling and demangling vis-a-vis the ABI used by gcc and clang. Many of the demangling questions involve trying to get at the semantics embedded in the mangling grammar, as in "Extracting class from demangled symbol". Throughout, solutions posed rely on heuristics that generalize poorly.

What I'm interested in is a robust approach to demangling that:

1. Precisely tokenizes the mangled name;
2. Correctly associates C++ semantics with each token;
3. Does not depend on knowing anything about the library from which the mangled name is drawn; and
4. Renders the resulting AST for easy consumption by other utilities (think YAML, JSON).

We know #1-#3 can be accomplished because LLVM implements them. Does anybody have a line on a complete toolchain that also covers #4?
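To make criteria #1 and #4 concrete, here is a minimal sketch that tokenizes only the simplest Itanium form, a `_ZN ... E` nested name built from length-prefixed identifiers, and renders the result as JSON. It is not LLVM's implementation and covers a tiny subset of the grammar: templates, substitutions, CV-qualifiers, and parameter types are all rejected or ignored. The names `splitNestedName` and `toJson` and the JSON layout are hypothetical, chosen for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: splits "_ZN<len><id>...E" into its identifiers,
// e.g. "_ZN3foo3barEv" -> {"foo", "bar"}. Anything outside this tiny
// subset of the Itanium grammar yields std::nullopt.
std::optional<std::vector<std::string>> splitNestedName(const std::string& mangled) {
    if (mangled.rfind("_ZN", 0) != 0) return std::nullopt;  // must start with _ZN
    std::vector<std::string> parts;
    std::size_t i = 3;
    while (i < mangled.size() && mangled[i] != 'E') {
        if (!std::isdigit(static_cast<unsigned char>(mangled[i]))) return std::nullopt;
        // Read the decimal length prefix of the next identifier.
        std::size_t len = 0;
        while (i < mangled.size() && std::isdigit(static_cast<unsigned char>(mangled[i]))) {
            len = len * 10 + static_cast<std::size_t>(mangled[i] - '0');
            if (len > mangled.size()) return std::nullopt;  // also guards overflow
            ++i;
        }
        if (len == 0 || len > mangled.size() - i) return std::nullopt;
        std::string part = mangled.substr(i, len);
        // Assumption: identifiers are [A-Za-z0-9_] only, so no JSON escaping is needed.
        for (char c : part) {
            if (!std::isalnum(static_cast<unsigned char>(c)) && c != '_') return std::nullopt;
        }
        parts.push_back(part);
        i += len;
    }
    if (i >= mangled.size() || parts.empty()) return std::nullopt;  // no closing 'E'
    return parts;
}

// Hypothetical JSON layout: enclosing scopes as an array, last identifier as the name.
std::string toJson(const std::vector<std::string>& parts) {
    assert(!parts.empty());
    std::string out = "{\"scope\":[";
    for (std::size_t k = 0; k + 1 < parts.size(); ++k) {
        if (k != 0) out += ",";
        out += "\"" + parts[k] + "\"";
    }
    out += "],\"name\":\"" + parts.back() + "\"}";
    return out;
}
```

For `_ZN3foo3barEv`, `toJson(*splitNestedName(...))` produces `{"scope":["foo"],"name":"bar"}`. A real solution would need the full grammar, which is where a tree-building demangler comes in.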

Demangling will only give you names, not semantics. You're light-years away from an AST. — Mat, Jan 17 '19 at 20:13
1-3 might make sense. But an AST is a logical representation of what the _code_ does. I don't see how it would have anything to do with the name mangling. — Jeffrey, Jan 17 '19 at 20:14
@Mat see the link to the LLVM demangling code. Unless I'm reading it wrong, it absolutely implements an AST -- a simple token tree with associated semantics. — BrianTheLion, Jan 17 '19 at 20:21
So you want to know what parts of the demangled symbol refer to namespaces, identifiers, types/template names etc.? That's much more than demangling, and never heard of that referred to as an AST (but not an expert in these things). What's wrong with the LLVM implementation you've found then? — Mat, Jan 17 '19 at 20:30
@BrianTheLion I don't claim to understand the whole page context, but I can understand "a fixed AST for parsing mangled names". What I don't think makes sense is "generating an AST for a given mangled name". Do you have anything that supports the second option? — Jeffrey, Jan 17 '19 at 20:32
@Mat correct. The LLVM implementation above builds an AST as an intermediate data structure in getting to the demangled name, but it doesn't expose that data structure for consumption by other code AFAICT. — BrianTheLion, Jan 17 '19 at 20:38

score 0 · Answer 1 · answered Jan 18 '19 at 21:48

Here's a list of partial solutions in case anyone stumbles on this thread:

Criterion (as numbered in the question):   1  2  3  4
llvm::ItaniumPartialDemangler              +  +  +  -
c++filt demangle API                       +  -  +  -
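To illustrate why a string-returning demangle API misses #2 and #4, here is a minimal sketch using `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang on Itanium-ABI platforms). It is used here as an assumed stand-in for the c++filt-style API, not as the exact function c++filt calls. The wrapper name `demangle` is hypothetical.

```cpp
#include <cxxabi.h>

#include <cassert>
#include <cstdlib>
#include <memory>
#include <optional>
#include <string>

// Hypothetical wrapper: returns the demangled text, or std::nullopt if the
// input is not a valid mangled name. The result is one flat string; the
// token structure and its semantics are not exposed to the caller.
std::optional<std::string> demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer that the caller must free.
    std::unique_ptr<char, decltype(&std::free)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
    if (status != 0 || !text) return std::nullopt;
    return std::string(text.get());
}
```

Calling `demangle("_ZN3foo3barEv")` gives `foo::bar()`. Recovering which part is the namespace and which is the function name then means re-parsing that string, which is the heuristic approach the question is trying to avoid.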

BrianTheLion


Also, why do SO questions about name mangling all seem to be downvoted? – BrianTheLion Jan 19 '19 at 01:01