
Today, I discovered a rather interesting thing about either g++ or nm...constructor definitions appear to have two entries in libraries.

I have a header thing.hpp:

class Thing
{
    Thing();

    Thing(int x);

    void foo();
};

And thing.cpp:

#include "thing.hpp"

Thing::Thing()
{ }

Thing::Thing(int x)
{ }

void Thing::foo()
{ }

I compile this with:

g++ thing.cpp -c -o libthing.a

Then, I run nm on it:

%> nm -gC libthing.a
0000000000000030 T Thing::foo()
0000000000000022 T Thing::Thing(int)
000000000000000a T Thing::Thing()
0000000000000014 T Thing::Thing(int)
0000000000000000 T Thing::Thing()
                 U __gxx_personality_v0

As you can see, both of the constructors for Thing are listed with two entries in the generated file (strictly an object file rather than a static library, since -c only compiles, despite the .a name). My g++ is 4.4.3, but the same behavior happens in clang, so it isn't just a gcc issue.

This doesn't cause any apparent problems, but I was wondering:

  • Why are defined constructors listed twice?
  • Why doesn't this cause "multiple definition of symbol __" problems?

EDIT: For Carl, the output without the C argument:

%> nm -g libthing.a
0000000000000030 T _ZN5Thing3fooEv
0000000000000022 T _ZN5ThingC1Ei
000000000000000a T _ZN5ThingC1Ev
0000000000000014 T _ZN5ThingC2Ei
0000000000000000 T _ZN5ThingC2Ev
                 U __gxx_personality_v0

As you can see...the same function is generating multiple symbols, which is still quite curious.

And while we're at it, here is a section of generated assembly:

.globl _ZN5ThingC2Ev
        .type   _ZN5ThingC2Ev, @function
_ZN5ThingC2Ev:
.LFB1:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        pushq   %rbp
        .cfi_def_cfa_offset 16
        movq    %rsp, %rbp
        .cfi_offset 6, -16
        .cfi_def_cfa_register 6
        movq    %rdi, -8(%rbp)
        leave
        ret
        .cfi_endproc
.LFE1:
        .size   _ZN5ThingC2Ev, .-_ZN5ThingC2Ev
        .align 2
.globl _ZN5ThingC1Ev
        .type   _ZN5ThingC1Ev, @function
_ZN5ThingC1Ev:
.LFB2:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        pushq   %rbp
        .cfi_def_cfa_offset 16
        movq    %rsp, %rbp
        .cfi_offset 6, -16
        .cfi_def_cfa_register 6
        movq    %rdi, -8(%rbp)
        leave
        ret
        .cfi_endproc

So the generated code is...well...the same.


EDIT: To see what constructor actually gets called, I changed Thing::foo() to this:

void Thing::foo()
{
    Thing t;
}

The generated assembly is:

.globl _ZN5Thing3fooEv
        .type   _ZN5Thing3fooEv, @function
_ZN5Thing3fooEv:
.LFB550:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        pushq   %rbp
        .cfi_def_cfa_offset 16
        movq    %rsp, %rbp
        .cfi_offset 6, -16
        .cfi_def_cfa_register 6
        subq    $48, %rsp
        movq    %rdi, -40(%rbp)
        leaq    -32(%rbp), %rax
        movq    %rax, %rdi
        call    _ZN5ThingC1Ev
        leaq    -32(%rbp), %rax
        movq    %rax, %rdi
        call    _ZN5ThingD1Ev
        leave
        ret
        .cfi_endproc

So it is invoking the complete object constructor.

osgx
Travis Gockel
  • You're obfuscating your problem with the `-C` flag to `nm`. If you leave it off, you'll see that the constructors that are emitted in fact have different symbols (which is the answer to your second question). I have no idea why two identical constructors are emitted with different symbol names, but I'm trying to read up on that now... more if I figure it out. – Carl Norum Aug 03 '11 at 03:32
  • Your output looks roughly the same as what I get here - so the question, really, is "what's the difference between the mangled name with a `C1` in it versus that with a `C2` in it?", and I have no answer to that question. I'm surprised the documentation doesn't have more about it.... hrm. – Carl Norum Aug 03 '11 at 03:48
  • It's interesting that the exact same behavior happens in two different compilers. – Travis Gockel Aug 03 '11 at 03:52
  • I'd be interested to see which one a subclass calls and which one `new` calls... – jswolf19 Aug 03 '11 at 03:56
  • @Ben - the OP says clang does it too. That's not really that surprising though; clang's behaviour often mimics gcc's. – Carl Norum Aug 03 '11 at 03:57
  • @Carl: Ah yes, I skimmed over that sentence. oops. – Ben Voigt Aug 03 '11 at 03:58
  • OK - so I looked in http://cocotron-tools-gpl3.googlecode.com/svn-history/r60/trunk/binutils/libiberty/cp-demangle.c (implementation of `d_ctor_dtor_name`), and found that the `C1` means it's a `gnu_v3_complete_object_ctor`, and a `C2` means that it's a `gnu_v3_base_object_ctor`. Does that mean anything to the C++ experts? I'm guessing it's like @jswolf19 says, they're for subclasses vs new object creation. – Carl Norum Aug 03 '11 at 04:00
  • Possibly relevant: http://stackoverflow.com/questions/6613870/gnu-gcc-g-why-does-it-generate-multiple-dtors – bdonlan Aug 03 '11 at 04:17

1 Answer


We'll start by declaring that GCC follows the Itanium C++ ABI.


According to the ABI, the mangled name for your Thing::foo() is easily parsed:

_Z     | N      | 5Thing  | 3foo | E          | v
prefix | nested | `Thing` | `foo`| end nested | parameters: `void`

You can read the constructor names similarly, as below. Notice how the constructor "name" isn't given, but instead a C clause:

_Z     | N      | 5Thing  | C1          | E          | i
prefix | nested | `Thing` | Constructor | end nested | parameters: `int`
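
As a quick check of this reading, here is a minimal sketch that feeds the mangled names from the question through abi::__cxa_demangle, the demangling entry point declared in <cxxabi.h> by the GCC and Clang runtimes. The helper names demangle and check_demangled_names are made up for illustration. The point it shows is that the demangled text carries no trace of the C1/C2 distinction, which is why nm -C prints what look like duplicates.

```cpp
#include <cassert>    // assert, used in check_demangled_names()
#include <cstdlib>    // std::free
#include <string>
#include <cxxabi.h>   // abi::__cxa_demangle, provided by the GCC and Clang C++ runtimes

// Hypothetical helper: demangle one symbol, or return "" if it cannot be demangled.
std::string demangle(const char* mangled)
{
    int status = 0;
    // With null buffer arguments the result is malloc()-allocated; the caller frees it.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr)
        return std::string();
    std::string result(text);
    std::free(text);
    return result;
}

// Symbols taken from the nm output in the question.
void check_demangled_names()
{
    assert(demangle("_ZN5Thing3fooEv") == "Thing::foo()");
    // Two distinct mangled names, one demangled spelling.
    assert(demangle("_ZN5ThingC1Ei") == "Thing::Thing(int)");
    assert(demangle("_ZN5ThingC2Ei") == "Thing::Thing(int)");
}
```

Calling check_demangled_names() from main should pass silently on a GCC or Clang toolchain; other ABIs (MSVC, for instance) do not provide <cxxabi.h>.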

But what's this C1? Your duplicate has C2. What does this mean?

Well, this is quite simple too:

  <ctor-dtor-name> ::= C1   # complete object constructor
                   ::= C2   # base object constructor
                   ::= C3   # complete object allocating constructor
                   ::= D0   # deleting destructor
                   ::= D1   # complete object destructor
                   ::= D2   # base object destructor
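
To make the table concrete, here is a rough sketch that picks the two-character code out of a mangled name and maps it to the descriptions above. It is not a real demangler: the function name ctor_dtor_kind is invented for this example, and it only understands the simple shape _ZN<length><class name><code>E... seen in the question (no namespaces, templates or substitutions).

```cpp
#include <cassert>   // assert, used in check_ctor_dtor_kinds()
#include <cctype>    // std::isdigit
#include <cstddef>   // std::size_t
#include <string>

// Hypothetical helper: classify the <ctor-dtor-name> of a simple mangled name.
std::string ctor_dtor_kind(const std::string& mangled)
{
    const std::string other = "not a constructor or destructor";
    if (mangled.compare(0, 3, "_ZN") != 0)
        return "not a nested name";

    // Read the decimal length that precedes the class name.
    std::size_t pos = 3;
    std::size_t len = 0;
    while (pos < mangled.size() &&
           std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
        len = len * 10 + static_cast<std::size_t>(mangled[pos] - '0');
        if (len > mangled.size())
            return other;  // length cannot possibly fit; give up
        ++pos;
    }
    // Need the class name plus a two-character code after it.
    if (len == 0 || mangled.size() - pos < len + 2)
        return other;

    const std::string code = mangled.substr(pos + len, 2);
    if (code == "C1") return "complete object constructor";
    if (code == "C2") return "base object constructor";
    if (code == "C3") return "complete object allocating constructor";
    if (code == "D0") return "deleting destructor";
    if (code == "D1") return "complete object destructor";
    if (code == "D2") return "base object destructor";
    return other;
}

// Symbols taken from the nm output and assembly in the question.
void check_ctor_dtor_kinds()
{
    assert(ctor_dtor_kind("_ZN5ThingC1Ei") == "complete object constructor");
    assert(ctor_dtor_kind("_ZN5ThingC2Ei") == "base object constructor");
    assert(ctor_dtor_kind("_ZN5ThingD1Ev") == "complete object destructor");
    assert(ctor_dtor_kind("_ZN5Thing3fooEv") == "not a constructor or destructor");
}
```

Applied to the question's output, this labels the two entries at 0x22 and 0x0a as complete object constructors and the two at 0x14 and 0x00 as base object constructors.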

Wait, why is this simple? This class has no base. Why does it have a "complete object constructor" and a "base object constructor" for each?

  • This Q&A implies to me that this is simply a by-product of polymorphism support, even though it's not actually required in this case.

  • Note that c++filt used to include this information in its demangled output, but doesn't any more.

  • This forum post asks the same question, and the only response doesn't do any better at answering it, except for the implication that GCC could avoid emitting two constructors when polymorphism is not involved, and that this behaviour ought to be improved in the future.

  • This newsgroup posting describes a problem with setting breakpoints in constructors due to this dual-emission. It's stated again that the root of the issue is support for polymorphism.

In fact, this is listed as a GCC "known issue":

G++ emits two copies of constructors and destructors.

In general there are three types of constructors (and destructors).

  • The complete object constructor/destructor.
  • The base object constructor/destructor.
  • The allocating constructor/deallocating destructor.

The first two are different, when virtual base classes are involved.


The meaning of these different constructors seems to be as follows:

  • The "complete object constructor". It does everything the base object constructor does and additionally constructs virtual base classes.

  • The "base object constructor". It creates the object itself, as well as data members and non-virtual base classes.

  • The "allocating object constructor". It does everything the complete object constructor does, plus it calls operator new to actually allocate the memory... but apparently this is not usually seen.

If you have no virtual base classes, [the first two] are identical; GCC will, on sufficient optimization levels, actually alias the symbols to the same code for both.
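
For a case where the two really do differ, here is a small sketch with a virtual base. The classes Root, Left, Right and Both are hypothetical and not part of the question. The C++ rule it relies on is that a virtual base is constructed exactly once, by the most derived class; under the Itanium ABI that is the job of the complete object constructor, while the base object constructors of Left and Right, used when they are subobjects of Both, skip it.

```cpp
#include <cassert>   // assert, used in check_virtual_base_counts()

// Hypothetical hierarchy, not from the question.
static int root_constructions = 0;

struct Root  { Root() { ++root_constructions; } };
struct Left  : virtual Root { Left() {} };
struct Right : virtual Root { Right() {} };
struct Both  : Left, Right  { Both() {} };

// Returns how many times Root's constructor runs while building one T.
template <typename T>
int count_root_constructions()
{
    root_constructions = 0;
    T object;      // T is the most derived type here
    (void)object;  // silence unused-variable warnings
    return root_constructions;
}

void check_virtual_base_counts()
{
    // Left as a complete object: its constructor must build Root itself.
    assert(count_root_constructions<Left>() == 1);
    // Left and Right as bases of Both: Both builds Root once, and the
    // constructors run for the Left and Right subobjects must not build it again.
    assert(count_root_constructions<Both>() == 1);
}
```

So Left needs two behaviours from one source-level constructor, which is what the separate symbols provide. Compiling this with g++ and running nm on the result should show distinct code for the two Left constructors, though the exact output depends on the compiler version and optimization level.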

Community
Lightness Races in Orbit
  • Hooray for an answer - I think I was closing in on this, but it's good to see the right information. – Carl Norum Aug 03 '11 at 04:05
  • @Tomalak Geret'kal: +1, for a very detailed research for answering the Q. – Alok Save Aug 03 '11 at 04:19
  • This is an awesome answer, but is there documentation for the difference between these constructor types? Mostly: What is an "allocating constructor" and a "deleting destructor"? Are they for overloading `operator new` and `operator delete`? – Travis Gockel Aug 03 '11 at 04:22
  • @Travis: I'm not entirely sure yet. bdonlan [argh, SO, quit limiting my notifications in comments FFS] pointed out [this highly-related question](http://stackoverflow.com/questions/6613870/gnu-gcc-g-why-does-it-generate-multiple-dtors), and there appears to be lots of pertinent information there. – Lightness Races in Orbit Aug 03 '11 at 04:36
  • @Travis: Yes, I think that they are. I don't want this answer to turn into general documentation for the entire construction/destruction process, but I briefly cover that in my latest edit. – Lightness Races in Orbit Aug 03 '11 at 15:02
  • On the *apparently not usually seen* comment, the Itanium ABI does not require generation or use of *allocating object constructor* for classes without a virtual destructor. Furthermore, the gcc implementation *never* generates or uses it. [It does generate and use the deallocating destructor, but not the allocating constructor, and I don't quite understand where that constructor would be useful at all...] – David Rodríguez - dribeas Jun 16 '15 at 08:44