
Just curious, do the GCC or Clang toolsets implement the equivalent of MSVC's identical COMDAT folding (ICF) currently? If not, are there any plans to? I can't seem to find any recent authoritative links on the subject other than old GCC mailing list messages.

If not, does this imply that template instantiations over distinct types are always distinct functions in the resulting binary (in situations where they are not completely inlined), even when they are binary-compatible, or are there other mechanisms in place to handle this at some other level?

Also, has anyone found ICF making a big difference in minimizing the size of a resulting executable in practice? I don't have any large MSVC projects handy to test it out. (I'm guessing it only really helps if you happened to instantiate templates over many different vtable-layout compatible types.)

Finally, is it C++11 standards-compliant for two function pointers to different functions to compare equal at runtime? This link seems to imply that it isn't, but it's for C99. EDIT: found previous question on this topic

JMB
Stephen Lin
  • Found a [quote from MSFT's Larry Osterman](http://blogs.msdn.com/b/oldnewthing/archive/2005/03/22/400373.aspx): "And this feature [ICF] is what makes C++ templates a viable solution for applications... Otherwise templates would cause sufficient explosion in code size that they'd be almost unusable for production software."...curious how GCC/Clang get by if they don't do this – Stephen Lin Mar 02 '13 at 00:09
  • Well they do get by and plenty of software uses templates in the standard library, so clearly it's not true that templates are "almost unusable" – Jonathan Wakely Mar 02 '13 at 00:12
  • @JonathanWakely Hah, I like GCC, that's just his quote :) – Stephen Lin Mar 02 '13 at 00:16
  • @StephenLin -- GNU ld on ELF and PE platforms will COMDAT fold by-name (i.e. fold instantiations in different translation units with identical type signatures), just not by-section-contents (which is what the MSVC /OPT:ICF does -- it merges COMDATs with different mangled names/signatures, but identical contents). So, in this case, Larry made the outer ring of the dartboard, but not the bullseye. – LThode Nov 25 '14 at 14:56

1 Answer


Neither GCC nor Clang is a linker, and ICF has to be done by the linker, or at least in cooperation with it. Edit: The compilers themselves don't do ICF, so yes, distinct instantiations produce distinct code unless the linker folds them. The GNU gold linker supports ICF via its --icf option, which requires the code to be compiled with the GCC option -ffunction-sections.
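
As a rough illustration of what ICF would have to work with, here is a minimal sketch. The `sum` template and `check_sums` helper are hypothetical names invented for this example, and the build line in the comment is only one plausible setup, assuming a GCC installation where gold is available.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example template; not taken from the original discussion.
template <typename T>
T sum(const T* data, std::size_t n) {
    T total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Explicit instantiations force an out-of-line copy of each specialization.
// On an LP64 target such as x86-64 Linux, long and long long are both 64 bits
// wide, so the two bodies will usually compile to the same machine code while
// still having different mangled names.
template long sum<long>(const long*, std::size_t);
template long long sum<long long>(const long long*, std::size_t);

// Small self-check that exercises both instantiations.
void check_sums() {
    const long a[] = {1, 2, 3};
    const long long b[] = {4, 5, 6};
    assert(sum(a, 3) == 6);
    assert(sum(b, 3) == 15);
}

// One possible build (an assumption, adjust for your toolchain):
//   g++ -O2 -ffunction-sections -fuse-ld=gold \
//       -Wl,--icf=safe,--print-icf-sections example.cpp
```

Call `check_sums()` from `main` to run it. Whether the two instantiations actually get folded depends on the target, the optimization level and the ICF mode, so treat the result as something to inspect rather than assume.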

Distinct functions must have distinct addresses ... I can't remember if ICF is disabled for any function that has its address taken, but if not it should be possible to put a load of no-op instructions before the combined function and make each distinct instantiation start on a different instruction, so they have different addresses. Edit: gold's --icf=safe option only enables ICF for functions that can be proven not to have their address taken, so code that relies on distinct addresses will still work.
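
To make the address issue concrete, here is a small sketch of code that depends on distinct functions having distinct addresses. All the names (`on_open`, `on_close`, `describe`, `check_handlers`) are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Two hypothetical callbacks with identical bodies but distinct identities.
int on_open(int x)  { return x + 1; }
int on_close(int x) { return x + 1; }

using Handler = int (*)(int);

enum class Kind { Open, Close, Unknown };

// Classifies a handler purely by comparing function addresses.
Kind describe(Handler h) {
    if (h == &on_open)  return Kind::Open;
    if (h == &on_close) return Kind::Close;
    return Kind::Unknown;
}

// In a conforming build the two functions have different addresses,
// so both checks below hold.
void check_handlers() {
    assert(describe(&on_open) == Kind::Open);
    assert(describe(&on_close) == Kind::Close);
}
```

If a linker folded `on_open` and `on_close` into one address, `describe(&on_close)` could report `Kind::Open`. That is the kind of breakage an aggressive ICF mode risks, and the reason gold's `--icf=safe` leaves address-taken functions alone.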

ICF is a neat optimization, but not essential. With a bit of effort you can hoist out non-dependent code to a non-template, or a template with fewer parameters, to reduce the quantity of duplicated code in the executable. There's more information on this in the slides for a Diet Templates talk I did a couple of years ago.
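
A minimal sketch of that hoisting idea, assuming a search over integral element types. The functions `find_bytes`, `find_trivial` and `check_find` are hypothetical names made up for this illustration and are not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Non-template worker: operates on raw bytes, so only one copy of the loop
// ends up in the executable no matter how many element types use it.
std::size_t find_bytes(const void* data, std::size_t count,
                       std::size_t elem_size, const void* key) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < count; ++i) {
        if (std::memcmp(p + i * elem_size, key, elem_size) == 0) return i;
    }
    return count;  // not found
}

// Thin template wrapper: only the type-dependent part (sizeof(T)) stays here,
// and it is small enough to be inlined away.
template <typename T>
std::size_t find_trivial(const T* data, std::size_t count, const T& key) {
    static_assert(std::is_integral<T>::value,
                  "byte-wise comparison is only used for integral types here");
    return find_bytes(data, count, sizeof(T), &key);
}

// Small self-check using two different element types.
void check_find() {
    const int xs[] = {10, 20, 30};
    const long ys[] = {7, 8};
    assert(find_trivial(xs, 3, 20) == 1);   // found at index 1
    assert(find_trivial(ys, 2, 9L) == 2);   // not found, returns count
}
```

The restriction to integral types is a deliberate simplification: byte-wise comparison is not a valid equality test for types with padding or a custom `operator==`, so a real design would hoist only the parts that genuinely don't depend on `T`.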

Jonathan Wakely
  • Sorry, I realize I'm being a bit loose with terms...what am I supposed to call the linker-that-is-usually-packaged-with-but-distinct-from-GCC-on-linux-x86/x64? (It's platform-specific, right?) – Stephen Lin Mar 02 '13 at 00:20
  • That linker is usually just called "GNU ld" or the GNU linker, with the whole set of tools usually called the GNU toolchain (meaning GCC, GNU as, GNU ld, and sometimes including GNU libc). Linkers are often part of the OS, e.g. Solaris has its own linker, but the GNU linker is cross-platform, supporting lots of executable formats and running on lots of different operating systems, so for example GCC on Solaris can be configured to use the native Solaris linker or the GNU linker. – Jonathan Wakely Mar 02 '13 at 00:24
  • thanks for the clarification. clang still uses GNU ld by default, right? actually, I guess that depends on your configuration, seems like they have their own too. – Stephen Lin Mar 02 '13 at 00:28
  • On GNU/Linux, yes ... I'm not sure whether Mac OS X has its own linker or uses GNU ld. Also note that the GNU binutils project contains _two_ linkers, GNU ld and the newer (much faster) gold linker, which is ELF only. – Jonathan Wakely Mar 02 '13 at 00:30
  • thanks, definitely helps to understand toolchains better...anyway, do you know if there's any planned work on something like ICF in gcc/ld? just curious, I realize GCC does "get by" fine without it :D (sorry for the wording) – Stephen Lin Mar 02 '13 at 00:33
  • I don't know of any plans, no – Jonathan Wakely Mar 02 '13 at 00:43
  • It seems like no-one else on the internet has mentioned your no-op idea for safe ICF, by the way. all the sources I've found [(example)](http://research.google.com/pubs/pub36912.html) insist on avoiding folding functions that have their address taken. maybe you're the first one to think of it, or there's some technical limitation precluding it from working? – Stephen Lin Mar 02 '13 at 00:46
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/25396/discussion-between-jonathan-wakely-and-stephen-lin) – Jonathan Wakely Mar 02 '13 at 00:50
  • "with a bit of effort you can hoist out non-dependent code" Do you mean effort for the compiler writer or the coder? – pmr Mar 02 '13 at 01:26
  • Effort for the coder. It's just following good design, hoist common code out into its own function. – Jonathan Wakely Mar 02 '13 at 01:39