
Today a colleague came to me with the question in the title.
He's currently trying to reduce the binary footprint of a codebase that is also used on small targets (Cortex-M3 and the like). Apparently they decided to compile with RTTI switched on (with GCC), to support proper exception handling.

His main complaint was that std::type_info::name() shouldn't be needed at all to support RTTI, and he asked whether I knew a way to suppress generation of the string literals needed for it, or at least to shorten them.

std::type_info::name

const char* name() const; Returns an implementation defined null-terminated character string containing the name of the type. No guarantees are given, in particular, the returned string can be identical for several types and change between invocations of the same program.

An implementation of e.g. the dynamic_cast<> operator (compiler-specific as it may be) would not use this information, but rather something like a hash value for type determination (similarly for catch() blocks in exception handling).
I think the latter is clearly expressed by the current standard definitions of

  1. std::type_info::hash_code
  2. std::type_index
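
As a minimal sketch of type discrimination that never calls name(): the snippet below keys a lookup table on std::type_index. The `Sensor` and `Thermometer` types and the `describe` function are hypothetical names used only for illustration, and the code needs RTTI enabled because it uses typeid.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical polymorphic types, for illustration only.
struct Sensor { virtual ~Sensor() = default; };
struct Thermometer : Sensor {};

// Identify the dynamic type through std::type_index; name() is never called.
std::string describe(const Sensor& s) {
    static const std::unordered_map<std::type_index, std::string> labels = {
        {std::type_index(typeid(Sensor)), "sensor"},
        {std::type_index(typeid(Thermometer)), "thermometer"},
    };
    // typeid on a reference to a polymorphic type yields the dynamic type.
    const auto it = labels.find(std::type_index(typeid(s)));
    return it != labels.end() ? it->second : "unknown";
}
```

Whether the implementation consults the name strings internally when comparing or hashing type_info objects is a separate matter, and it is implementation-specific.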

I had to agree that I also don't see much point in using std::type_info::name() other than for debugging (logging) purposes. I wasn't 100% sure that exception handling would work without RTTI with current versions of GCC (I think they're using 4.9.1), so I hesitated to recommend simply switching off RTTI.
dynamic_cast<> is also used in their code base, but there I just recommended replacing it with static_cast (they don't really have anything like plugins, or a need for runtime type detection other than assertions).
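
A rough sketch of that recommendation, assuming the code base can identify its types by some tag of its own: `Message`, `DataMessage` and `payload_of` are made-up names, not taken from the colleague's code. A static_cast downcast has undefined behavior if the object is not actually of the target type, so the assertion stands in for the check that dynamic_cast would otherwise perform.

```cpp
#include <cassert>

// Hypothetical message hierarchy carrying its own type tag.
struct Message {
    enum class Kind { Ping, Data };
    explicit Message(Kind k) : kind(k) {}
    virtual ~Message() = default;
    Kind kind;
};

struct DataMessage : Message {
    explicit DataMessage(int v) : Message(Kind::Data), value(v) {}
    int value;
};

int payload_of(const Message& m) {
    // The precondition is established by the program's own tag, not by RTTI.
    assert(m.kind == Message::Kind::Data);
    // No runtime type lookup here, so this also compiles with -fno-rtti.
    return static_cast<const DataMessage&>(m).value;
}
```

For example, `payload_of(DataMessage{42})` returns 42.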


Question:

  • Are there real life, production code level use cases for std::type_info::name() other than logging?

Sub-Questions (more concrete):

  • Does anyone have an idea how to work around the generation of these useless string literals (assuming they'll never be used)?

  • Is RTTI really (still) needed to support exception handling with GCC?
    (This part is now well solved by @Sehe's answer, which I have accepted. The other sub-question remains open for the std::type_info instances still generated for any exceptions used in the code. We're pretty sure that these literals are never used anywhere.)


Somewhat related: Strip unused runtime functions which bloat executable (GCC)

πάντα ῥεῖ
  • 3
    Not really useful for logging either, since the name can be a mangled name which doesn't really say much directly. – Some programmer dude Mar 04 '15 at 18:13
  • 5
    I doubt that anything explicitly marked as "implementation defined" without the most basic guarantee of equality should be used in production, even if it is 100% safe. – Sergey Kalinichenko Mar 04 '15 at 18:16
  • 3
    @JoachimPileborg Even the reference documentation mentions `c++filt`: _"With compilers such as gcc and clang, the returned string can be piped through c++filt -t to be converted to human-readable form."_ – πάντα ῥεῖ Mar 04 '15 at 18:16
  • Is `type_index` guaranteed to be constant across runs, or different programs using some of the same libraries? I doubt it. If not, then `name()` may be convenient as an implementation-defined way (which does not mean universally useless by the way - just non-portable and subject to pretty arbitrary but usually rare change / consult implementation docs) to differentiate types across programs/runs, e.g. to provide type discrimination text for serialisation / deserialisation / factory code. If it happens to be "legible", that could add to the appeal. – Tony Delroy Mar 04 '15 at 18:24
  • 3
    @JoachimPileborg Because it may be useless on *some* implementations, you wouldn't use it on *any* implementation? – user253751 Mar 04 '15 at 18:30
  • 1
    @TonyD: You seem to be assuming that `name()` returns different things for each type, but that may not be the case. – Mooing Duck Mar 04 '15 at 18:31
  • I had always assumed that compilers didn't actually embed the `type_info` data in your binary except for the types it's actually called for. Is there evidence that this data is embedded for all types? I don't think the _existence_ of `type_info` affects binary size so much as the _usage_. – Mooing Duck Mar 04 '15 at 18:32
  • 1
    @MooingDuck: no I am not assuming that in general, but obviously the kind of usage I mentioned would only make sense on a particular implementation or set of implementations where that was true. More generally - not addressed to MooingDuck specifically - I find it really annoying that so many C++ enthusiasts are deliberately obtuse about this distinction (then turn around and use error numbers, IPC libraries, boost facilities etc. that are either non-C++-Standard or far more likely to change). – Tony Delroy Mar 04 '15 at 18:32
  • @MooingDuck _"I had always assumed that compilers didn't actually embed the name() strings in your code except for the types it's actually called for"_ Apparently that's not the case. My co-worker complained about ~12KB used for these, and there's not a single call to `std::type_info::name()` anywhere. – πάντα ῥεῖ Mar 04 '15 at 18:37
  • @πάνταῥεῖ and that 12KB was what percentage of the executable size? If the text involved is contiguous in virtual address space, there's a fair chance it won't even be faulted from disk if unused.... – Tony Delroy Mar 04 '15 at 18:38
  • @TonyD I think they have 256KB with keeping a fallback image in the flash, so their maximum executable size effectively is up to 127KB (could be 512/256 also), it's certainly significant. – πάντα ῥεῖ Mar 04 '15 at 18:41
  • @πάνταῥεῖ Was it using `std::type_info` for anything else? – Mooing Duck Mar 04 '15 at 18:45
  • @πάνταῥεῖ: If you're running on something that small, it seems like you would consider turning RTTI off. – Dietrich Epp Mar 04 '15 at 18:48
  • @MooingDuck As far I understood the code doesn't use `std::type_info` directly anywhere. So as long it's not used from any other standard library functions or classes, the reasons for `std::type_info` instantiation is merely `dynamic_cast<>` usage and exception handling. – πάντα ῥεῖ Mar 04 '15 at 18:50
  • @DietrichEpp Of course, if you do that just at front. I mentioned, what were the reasons, I didn't just chase him away with this argument. – πάντα ῥεῖ Mar 04 '15 at 18:51
  • 5
    Regarding sub-question #1: for small embedded targets, this is often the problem domain where a linker script can help you. Essentially, if the literals get placed in a particular section in the binary (I'm not sure where gcc puts them), you could exclude that section. Failing that, you could explicitly exclude symbol names that match a pattern; you may be able to devise a scheme to suppress the literals in the resulting binary (which would obviously fail at runtime if they were ever used). – Jason R Mar 04 '15 at 19:02
  • @JasonR That's the first really useful hint on this question. Yes, we discussed whether these strings are likely to sit in a designated linker section, and whether they could simply be dropped via the linker script. – πάντα ῥεῖ Mar 04 '15 at 19:04
  • Related: http://stackoverflow.com/questions/11403136/is-it-possible-to-strip-type-names-from-executable-while-keeping-rtti-enabled/11423631#11423631 and http://stackoverflow.com/questions/14869639/remove-c-class-names-from-binary-dll-file – sehe Mar 04 '15 at 19:18
  • @sehe THX for your efforts and contributions, unfortunately these links both refer to MSVC's implementation, while my question is clearly about GCC. – πάντα ῥεῖ Mar 04 '15 at 19:32
  • I'm aware of that. Although your claim is not fully accurate (first one: "Does any compiler provide such an option?"). Comments are well suited for sharing potentially related sources. Also, all the answers converge to: look at obfuscation to hide the strings (!!!?! :() – sehe Mar 04 '15 at 19:33
  • If you don't need dynamic_cast, don't use it. Static_cast is generally equivalent to C-casting so you can use either, and the referred to type doesn't change, just the reference type. Really, I'd like to hear from you if, when your colleague replaces all dynamic_casts with static or c-style casts and then explicitly removes RTTI will it compile and if it does, is the 12KB still there? If it is, you know the exceptions are forcing it to generate. If not, problem solved with no runtime segmentation faults possible, which you risk with linker stripping, very bad in embedded systems! – TimeHorse Mar 04 '15 at 21:02
  • @TimeHorse Yeah, that's blatantly obvious, and I mentioned it in my question already. The point is, if they switch off RTTI, will exception handling still work properly in GCC? – πάντα ῥεῖ Mar 04 '15 at 21:04
  • @πάνταῥεῖ I'd suggest you try to compile it as I suggested and see if it's any bigger. If it still overruns by 12KB, then yes, exception handling is forcing RTTI; if not, then you've solved the problem. BTW, is the program throwing types or immediates? If all you want from exception handling is to get out quickly and don't care why, you could consider throwing integers which might also eliminate RTTI if exception handling is preserving it. – TimeHorse Mar 04 '15 at 21:12
  • @TimeHorse _"I'd suggest you try to compile it as I suggested and see if it's any bigger."_ Well, that was what I've suggested too, to get rid of the `dynamic_casts<>`, switch off RTTI, and see how enabling and using exceptions affects the footprint. _" you could consider throwing integers"_, no I think they want to use at least classes from the `std::exception` hierarchy. Anyway I'd expect that generating RTTI info should be forced for exception types appearing in catch blocks solely (not for any type). – πάντα ῥεῖ Mar 04 '15 at 21:33
  • @πάνταῥεῖ Please let us know when you have a result of the static_cast / no RTTI test. I think the crux of this question hinges on if your particular gcc will respect your request to turn off RTTI despite having exception handling. I agree it's likely typed exceptions may be forcing RTTI but maybe it's for the best your colleague bite the bullet and just convert the whole thing from throw-catch to return codes. It's a nightmare to deal with but it will make your embedded product leaner and meaner. – TimeHorse Mar 04 '15 at 21:39
  • @TimeHorse _"but maybe it's for the best your colleague bite the bullet and just convert the whole thing from throw-catch to return codes"_ LOL, that's what we (different department, different code base), are actually doing. I never liked it as being a general part of the architecture, and still mumbling against it like Cicero. But that's the more interoperable and flexible approach, I have to admit in the end :-P ... – πάντα ῥεῖ Mar 04 '15 at 21:46
  • 1
    @TimeHorse _"Please let us know ..."_ Of course. It looks like my comrade and me have to provide an answer, with what we came up finally then. – πάντα ῥεῖ Mar 04 '15 at 21:49
  • @dasblinkenlight The point is, that my co-worker claims, **it isn't used** (directly) anywhere. – πάντα ῥεῖ Mar 04 '15 at 21:51
  • @JasonR _"Essentially, if the literals get placed in a particular section in the binary (I'm not sure where gcc puts them), you could exclude that section."_ To keep you informed: we found out that these string literals aren't placed in a special section; they're scattered all over the ordinary text section. It'll be hard to rip them out via linker script. `-fno-rtti` and eliminating `dynamic_cast<>` work well so far, but there's still that (minor) open point of avoiding the literals being generated for the exception types in use. So if you have any additional ideas? – πάντα ῥεῖ Mar 05 '15 at 19:05
  • @πάνταῥεῖ: Is there any pattern to the symbol names that it associates with the generated literals? You might be able to strip them from the binary based on that structure. – Jason R Mar 05 '15 at 19:59
  • @JasonR _"is there any pattern ..."_ Well, I didn't spot any reasonably usable one (but didn't look that deep to be honest) . We'll check that tomorrow. – πάντα ῥεῖ Mar 05 '15 at 20:02
  • See the comment at the top of GCC's ``, which discusses using `strcmp` on the name to determine if two types are the same. I don't remember when string equality is used rather than pointer equality. – Jonathan Wakely Mar 05 '15 at 20:58
  • @JonathanWakely Sorry, I don't really get in which way this is relevant? We aren't using it. Did you mean an actual implementation for `dynamic_cast<>`? – πάντα ῥεῖ Mar 05 '15 at 21:01
  • The run-time can use it implicitly, even if you aren't using it. – Jonathan Wakely Mar 05 '15 at 21:01
  • @JonathanWakely What do you mean _implicitly_? There's no evidence from the linker maps, that stuff is actually used. Have a look at my footnote link, how to bail out the GCC standard implementation for uncaught exception unwinds used with `atexit()` (the _run-time_??). – πάντα ῥεῖ Mar 05 '15 at 21:15

2 Answers


Isolating this bit:

The answer is yes:

-fno-rtti

Disable generation of information about every class with virtual functions for use by the C++ runtime type identification features (dynamic_cast and typeid). If you don't use those parts of the language, you can save some space by using this flag. Note that exception handling uses the same information, but it will generate it as needed. The dynamic_cast operator can still be used for casts that do not require runtime type information, i.e. casts to void * or to unambiguous base classes.
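
To illustrate the quoted documentation, here is a small sketch of exception handling that uses neither typeid nor dynamic_cast, so it should build with something like `g++ -fno-rtti` (the exact command line is an assumption, as are the names `read_register` and `try_read`). According to the quoted text, the type information needed to match the catch clauses is generated as needed.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical driver function that reports failure by throwing.
void read_register(int address) {
    if (address < 0) {
        throw std::out_of_range("negative register address");
    }
}

// Handlers are matched against the thrown type, including base classes:
// std::out_of_range derives from std::logic_error, so the first handler wins.
std::string try_read(int address) {
    try {
        read_register(address);
        return "ok";
    } catch (const std::logic_error& e) {
        return std::string("logic_error: ") + e.what();
    } catch (const std::exception& e) {
        return std::string("exception: ") + e.what();
    }
}
```

Comparing the size of the resulting binary with and without `-fno-rtti` is the experiment suggested in the comments on the question.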

sehe
  • 1
    Yeah, that settles the doubts I had. I'll accept this answer (let's see what my co-worker comes up with tomorrow...). Also, is there a chance that an overridden `__verbose_terminate_handler` that no longer uses any exception-related stuff will even drop `std::type_info::name()` support for types used in exceptions? I'm sure my co-worker would jump in triangles for pleasure, and show some somersaults in between :-). – πάντα ῥεῖ Mar 04 '15 at 22:55
  • 3
    If you're targeting small devices then build GCC with `--disable-libstdcxx-verbose` to automatically get rid of the verbose terminate handler (and all its I/O dependencies) – Jonathan Wakely Mar 05 '15 at 20:51

Are there real life, production code level use cases for std::type_info::name() other than logging?

The Itanium ABI describes how operator== for std::type_info objects can be easily implemented in terms of testing strings returned from std::type_info::name() for pointer equality.

In a non-flat address space, where it might be possible to have multiple type_info objects for the same type (e.g. because a dynamic library has been loaded with RTLD_LOCAL) the implementation of operator== might need to use strcmp to determine if two types are the same.
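
The two comparison strategies can be modeled roughly as follows. This is a simplified sketch, not libstdc++'s actual source: the function name is made up, and the leading `'*'` convention for names that must be compared by address is taken from the description in the comments under this answer.

```cpp
#include <cstring>

// Simplified model of deciding whether two type-name strings denote the
// same type. Names starting with '*' stand for types with internal linkage.
bool names_denote_same_type(const char* lhs, const char* rhs) {
    if (lhs == rhs) {
        return true;   // same string object, hence the same type
    }
    if (lhs[0] == '*') {
        return false;  // internal linkage: only address equality counts
    }
    // Possibly duplicated type_info objects (e.g. across libraries loaded
    // with RTLD_LOCAL): fall back to comparing the text.
    return std::strcmp(lhs, rhs) == 0;
}
```

In this model the name strings take part in equality tests even when the program never calls name() itself.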

So the name() function is used to determine if two type_info objects refer to the same type. For examples of real use cases, that's typically used in at least two places in the standard library, in std::function<F>::target<T>() and std::get_deleter<D>(const std::shared_ptr<T>&).
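
A short sketch of those two library facilities, assuming RTTI is enabled. `twice`, `CountingDeleter` and the two checking functions are hypothetical names for illustration. Both calls return a null pointer when the requested type does not match the stored one, which requires comparing type_info objects at run time.

```cpp
#include <functional>
#include <memory>

int twice(int x) { return 2 * x; }

// Hypothetical custom deleter that counts how often it runs.
struct CountingDeleter {
    int* counter;
    void operator()(int* p) const {
        ++*counter;
        delete p;
    }
};

bool target_matches() {
    std::function<int(int)> f = twice;  // stores a plain function pointer
    // target<T>() succeeds only if T is exactly the stored callable's type.
    return f.target<int (*)(int)>() != nullptr
        && f.target<long (*)(long)>() == nullptr;
}

bool deleter_matches() {
    int calls = 0;
    std::shared_ptr<int> p(new int(7), CountingDeleter{&calls});
    // get_deleter<D>() succeeds only if D is the type of the stored deleter.
    return std::get_deleter<CountingDeleter>(p) != nullptr
        && std::get_deleter<std::default_delete<int>>(p) == nullptr;
}
```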

If you're not using RTTI then all that's irrelevant, as you won't have any type_info objects anyway (and consequently in libstdc++ the function::target and get_deleter functions can't be used).

I think GCC's exception-handling code uses the addresses of type_info objects themselves, not the addresses of the strings returned by name(), so if you use exceptions but no RTTI the name() strings aren't needed.

Jonathan Wakely
  • If the `type_info` are not the same, how can you say that the types are the same? – curiousguy Sep 14 '15 at 17:51
  • Because the two `type_info` objects compare equal. – Jonathan Wakely Sep 14 '15 at 18:17
  • Then why don't the `name()` compare equal? – curiousguy Sep 14 '15 at 21:44
  • 1
    I'm not sure what you're asking, could you clarify? As I said above (_"So the `name()` function is used to determine if two type_info objects refer to the same type."_), if they compare equal then the names do compare equal (either by pointer equality or by `strcmp`), so asking why the names don't compare equal makes no sense. What do you mean by "not the same"? – Jonathan Wakely Sep 14 '15 at 21:55
  • I don't understand how that could guarantee correct values. According to the spec, "the returned string can be identical for several types and change between invocations of the same program." Such a comparison may return true for disparate types, while also returning false for two objects of the same type. Whether it works at all depends on the implementation. – bindsniper001 May 08 '23 at 22:07
  • 1
    @bindsniper001 N.B. that's cppreference.com which is not actually the spec (just a very good reference describing the real spec). The only way for two different types to have the same `type_info::name()` is if at least one of the types has internal linkage. e.g. defined in an anonymous namespace. In that case the runtime ensures that `type_info::operator==` does **not** just use `strcmp` on the names. Instead it typically uses a pointer equality test, and the strings "foo" from one source file and "foo" from another source file will have different addresses, so `type_info::operator==` is false – Jonathan Wakely May 09 '23 at 11:06
  • 1
    In practice it's a little more complicated, e.g. with GCC the `type_info::__name` member will point to a string `"*foo"` for the type with internal linkage, and the initial `"*"` character says to use pointer comparison not `strcmp`. The `type_info::name()` function returns `__name+1` so that the initial `"*"` is not part of the public "name", but is still visible to the implementation and still affects equality comparisons. In other words, when the spec talks about "the returned string", that's not necessarily the same as the real type name the implementation uses. – Jonathan Wakely May 09 '23 at 11:09