0

I have the following code to provide a factory to create an encoder/decoder for a given data type and encoding scheme (BITPACK, PLAIN, etc), and the code works.

class Encoding {
 public:
  virtual std::unique_ptr<Encoder> encoder() = 0;

  virtual std::unique_ptr<Decoder> decoder() = 0;
};

template <class E, class D>
class EncodingTemplate : public Encoding {
 public:
  std::unique_ptr<Encoder> encoder() override {
    return std::unique_ptr<Encoder>(new E());
  }

  std::unique_ptr<Decoder> decoder() override {
    return std::unique_ptr<Decoder>(new D());
  }
};

namespace u32 {

class PlainEncoder {...};
class PlainDecoder {...};
class BitpackEncoder {...};
class BitpackDecoder {...};

EncodingTemplate<PlainEncoder, PlainDecoder> plainEncoding;
EncodingTemplate<BitpackEncoder, BitpackDecoder> bitpackEncoding;

Encoding& EncodingFactory::Get(EncodingType encoding) {
  switch (encoding) {
    case PLAIN:
      return plainEncoding;
    case BITPACK:
      return bitpackEncoding;
    default:
      return plainEncoding;
  }
}
}  // namespace u32

// This is where we invoke the encoder
class BlockBuilder {

void Build() {
....
std::unique_ptr<Encoder> key_encoder = u32::EncodingFactory::Get(BITPACK).encoder();
};

Now I want to modify the factory to create a shared encoder/decoder and reuse it. So I add a constructor to EncodingTemplate (This is the first step, and it's just an empty constructor.) Then the code stopped working and AddressSanitizer complains about seg fault.

SEGV on unknown address 0x000000000000 (pc 0x55ca3c817eec bp 0x7fffe7bdd190 sp 0x7fffe7bdd090 T0)
    BlockBuilder::Build(): block_builder.cc:16
    BlockBuilder::Add(leveldb::Slice const&, leveldb::Slice const&)
    BlockReadBenchmark::BlockReadBenchmark()
BlockReadBenchmark_Normal_Benchmark::BlockReadBenchmark_Normal_Benchmark()
    __static_initialization_and_destruction_0
    _GLOBAL__sub_I__Z13binary_sorterii
    __libc_csu_init
    __libc_start_main
    _start

Here's my new EncodingTemplate, it only adds an empty constructor.

class EncodingTemplate : public Encoding {
 public:
    EncodingTemplate() {} // Only add an empty constructor here

  std::unique_ptr<Encoder> encoder() override {
    return std::unique_ptr<Encoder>(new E());
  }

  std::unique_ptr<Decoder> decoder() override {
    return std::unique_ptr<Decoder>(new D());
  }
};

A weird thing is, I can reproduce the error in my project. I comment out the constructor and the code works. I uncomment it, and the error comes back. But when I try to create a minimized working example, I failed to reproduce it.

Can anyone help me figure out what is happening? I really have no idea why a reference to a global variable could cause SEG FAULT. And what does a constructor have to do with it.

Thanks!

Harper
  • 1,794
  • 14
  • 31
  • What does the rest of the call stack look like? I'm guessing that `BlockBuilder::Build` is called as part of a static initialization but the `bitpackEncoding` variable has not been initialized yet.. – Botje Jul 16 '21 at 14:41
  • What does the base `Encoding` class look like? What is/are its constructor(s)? – Adrian Mole Jul 16 '21 at 14:43
  • @Botje @adrian-mole Thanks for the hint. I have added the remaining call stacks to the question. The error happens in my benchmark code with Google Benchmark. I have also added the code of `Encoding`, which is an interface with two pure virtual methods. – Harper Jul 16 '21 at 14:47
  • @Botje Thanks for your insightful hint! I think you are right. So the cause is the random initialization sequence of global variables and static function calls. When I add a constructor, gcc change the init sequence for some reason and my factory is inited after it was called, which cause the problem. Is that correct? – Harper Jul 16 '21 at 14:55
  • And it would be very appreciated if you can give me some suggestions on a better way to implement such factory. – Harper Jul 16 '21 at 14:57
  • Correct. Adding a constructor turns your object from plain old data (which simply maps to a zero-initialized chunk of memory at startup) into something that needs to be explicitly initialized before the program startup. The technical term is "C++ static order initialization fiasco", which has [a few](https://stackoverflow.com/questions/29822181/prevent-static-initialization-order-fiasco-c) [popular questions](https://stackoverflow.com/questions/33913670/the-classical-c-static-initialization-order-fiasco-revisited) – Botje Jul 16 '21 at 14:59

1 Answers1

2

The typical solution to this is the Singleton pattern, since that defers initialization to the first time the object is needed. Since you have a factory already, you can use a function-local static variable:

Encoding& EncodingFactory::Get(EncodingType encoding) {
  static EncodingTemplate<PlainEncoder, PlainDecoder> plainEncoding;
  static EncodingTemplate<BitpackEncoder, BitpackDecoder> bitpackEncoding;

  switch (encoding) {
    case PLAIN:
      return plainEncoding;
    case BITPACK:
      return bitpackEncoding;
    default:
      return plainEncoding;
  }
}
Botje
  • 26,269
  • 3
  • 31
  • 41
  • 2
    Tactical note: you still won't have much control over when the object is destroyed, other than after `main` exits, so be careful about using it in the destructors of other objects with static storage duration. – user4581301 Jul 16 '21 at 15:29