I am porting from CentOS to Cygwin and find that my application exits mid-execution, with no error message and exit status zero, during the constructor for Botan::InitializationVector.

If I try to attach with gdb proactively in main(), where it is waiting on a spin variable, I don't get a normal stack trace:

(gdb) where
#0  0x7c90120f in ntdll!DbgUiConnectToDbg ()
   from /cygdrive/c/WINDOWS/system32/ntdll.dll
#1  0x7c952119 in ntdll!KiIntSystemCall ()
   from /cygdrive/c/WINDOWS/system32/ntdll.dll
#2  0x00000005 in ?? ()
#3  0x00000000 in ?? ()

So without a usable gdb session, it is hard to figure out what is going wrong.
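
For context, the spin-variable technique mentioned above can be sketched roughly as follows. This is a minimal illustration rather than the application's actual code: the names g_wait_for_debugger and wait_for_debugger are hypothetical, and the poll limit is only there so that the wait cannot hang forever.

```cpp
#include <chrono>
#include <thread>

// Hypothetical flag: the debugger clears it after attaching, e.g. with
// `set var g_wait_for_debugger = 0` in gdb. `volatile` stops the compiler
// from caching the value across loop iterations.
volatile bool g_wait_for_debugger = true;

// Polls the flag every 100 ms for at most max_polls iterations.
// Returns true if the flag was cleared, false if the wait timed out.
bool wait_for_debugger(unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls && g_wait_for_debugger; ++i) {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    return !g_wait_for_debugger;
}
```

Calling wait_for_debugger(600) at the top of main() would give roughly a minute to attach and clear the flag from the thread that is actually spinning.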

Why would the application exit mid-execution on Cygwin without any error message?

I deduce that the failure is inside the constructor because the clog message before the constructor is printed and the one after it is not:

clog << "  About to create iv for Botan.\n";
Botan::InitializationVector iv(_rng, size);
clog << "  About to copy iv for Botan.\n";
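
One caveat with this kind of print tracing is that a buffered message can be lost if the process dies before the stream is flushed, which can make the failure look earlier than it is. The sketch below shows one way to rule that out; is_unit_buffered and trace are hypothetical helper names, not part of the application or of Botan.

```cpp
#include <iostream>
#include <ostream>
#include <string>

// True if the stream flushes after every output operation.
// std::cerr is set up this way by the standard library.
bool is_unit_buffered(const std::ostream& os)
{
    return (os.flags() & std::ios_base::unitbuf) != 0;
}

// Hypothetical tracing helper: writes one line and flushes immediately,
// so the message is not lost if the process dies right afterwards.
void trace(std::ostream& os, const std::string& msg)
{
    os << msg << '\n' << std::flush;
}
```

With this, trace(std::clog, "About to create iv for Botan.") reaches the output before the next statement runs, whatever buffering the stream uses.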

Botan is open source (http://botan.randombit.net/). Here are some code snippets from src/sym_algo/symkey.{h,cpp}:

typedef OctetString InitializationVector;

class BOTAN_DLL OctetString
{
public:
  u32bit length() const { return bits.size(); }
  SecureVector<byte> bits_of() const { return bits; }

  const byte* begin() const { return bits.begin(); }
  const byte* end() const   { return bits.end(); }

  std::string as_string() const;

  OctetString& operator^=(const OctetString&);

  void set_odd_parity();

  void change(const std::string&);
  void change(const byte[], u32bit);
  void change(const MemoryRegion<byte>& in) { bits = in; }

  OctetString(class RandomNumberGenerator&, u32bit len);
  OctetString(const std::string& str = "") { change(str); }
  OctetString(const byte in[], u32bit len) { change(in, len); }
  OctetString(const MemoryRegion<byte>& in) { change(in); }
private:
  SecureVector<byte> bits;
};

OctetString::OctetString(RandomNumberGenerator& rng,
                     u32bit length)
{
   bits.create(length);
   rng.randomize(bits, length);
}

I moved the failing code into main() and it works fine. I also put a try/catch (...) around the code, and no exceptions are being thrown. Something goes wrong between main() and the point of failure later in the application. I can do a divide and conquer to narrow down the exact point where it no longer works.

One of the Botan developers gave me this stripped-down code to use instead, which also fails:

Botan::AutoSeeded_RNG _rng;
unsigned int size = 1; // or 16, or 1452 all fail.
Botan::SecureVector<Botan::byte> iv_val(size);
cerr << "We get to here." << endl;
_rng.randomize(&iv_val[0], size);
cerr << "But not here." << endl;

Now that I have the debugger working, I see a SIGSEGV:

(gdb) s
Botan::AutoSeeded_RNG::randomize (this=0x1270380, out=0x5841420 "", len=1)
    at ../../src/Botan-1.8.11/build/include/botan/auto_rng.h:23
(gdb) s

Program received signal SIGSEGV, Segmentation fault.
0x005d79ee in Botan::AutoSeeded_RNG::randomize (this=0x1270380, 
    out=0x5841420 "", len=1)
    at ../../src/Botan-1.8.11/build/include/botan/auto_rng.h:23
(gdb) p rng
$7 = (class Botan::RandomNumberGenerator *) 0x5841324
(gdb) p *this
$8 = {<Botan::RandomNumberGenerator> = {
    _vptr$RandomNumberGenerator = 0x11efc14}, rng = 0x5841324}
(gdb) p *rng
$9 = {_vptr$RandomNumberGenerator = 0x656e6f4e}
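
One detail in this dump may be worth noting: the vptr value 0x656e6f4e, read as little-endian bytes as they sit in memory on x86, is the ASCII text "None". A vtable pointer that decodes to readable text usually suggests that the object's memory was overwritten or that the pointer is stale, although the dump alone does not say which. A small helper for checking suspicious values this way might look like the following; bytes_as_ascii is a hypothetical name and not part of Botan.

```cpp
#include <cstdint>
#include <string>

// Interprets a 32-bit value as four bytes in little-endian order (the
// order they occupy in memory on x86) and renders them as text.
// Non-printable bytes are shown as '.'.
std::string bytes_as_ascii(std::uint32_t value)
{
    std::string text;
    for (int i = 0; i < 4; ++i) {
        const unsigned char byte =
            static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
        text += (byte >= 0x20 && byte < 0x7F) ? static_cast<char>(byte) : '.';
    }
    return text;
}
```

For the value above, bytes_as_ascii(0x656e6f4e) yields "None".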

Here is the auto_rng.h code:

class BOTAN_DLL AutoSeeded_RNG : public RandomNumberGenerator
  {
  public:
    void randomize(byte out[], u32bit len)
    { rng->randomize(out, len); }           // SEGV on this line.
    bool is_seeded() const
    { return rng->is_seeded(); }
    void clear() throw() { rng->clear(); }
    std::string name() const
    { return "AutoSeeded(" + rng->name() + ")"; }

    void reseed(u32bit poll_bits = 256) { rng->reseed(poll_bits); }
    void add_entropy_source(EntropySource* es)
    { rng->add_entropy_source(es); }
    void add_entropy(const byte in[], u32bit len)
    { rng->add_entropy(in, len); }

    AutoSeeded_RNG(u32bit poll_bits = 256);
    ~AutoSeeded_RNG() { delete rng; }
  private:
    RandomNumberGenerator* rng;
  };
WilliamKF
  • I know it's going to be hard for you to provide a test case, but if you're absolutely *sure* it exits inside Botan::InitializationVector's ctor, the code for that ctor might help. – Thomas Edleson Feb 21 '11 at 03:11
  • @Thomas Edleson I deduce from the debug print messages that it fails inside the constructor. The code is all open source, I've included some of it above. – WilliamKF Feb 21 '11 at 04:48
  • @WilliamKF: clog isn't unitbuf as cerr is, so there might be buffered output you don't see. Try cerr. However, if stderr is on a terminal, it's possibly line-buffered which would make this moot. – Thomas Edleson Feb 21 '11 at 04:57
  • ...or use `endl` instead of "\n", since `endl` will flush the stream. – Ben Voigt Feb 21 '11 at 05:00
  • @Ben Voigt I added endl; no different result. – WilliamKF Feb 21 '11 at 17:38
  • @Thomas Edleson I changed to cerr; no different result. – WilliamKF Feb 21 '11 at 17:39
  • @WilliamKF: I'm not surprised, but at least it was a simple change that ruled out an edge case (which could've otherwise indicated you were looking in the completely wrong place). – Thomas Edleson Feb 21 '11 at 17:42
  • Why does `AutoSeeded_RNG` need TWO `RandomNumberGenerator` subobjects? (the base subobject and *rng)??? – Ben Voigt Feb 21 '11 at 18:41

3 Answers

Cygwin apps are multithreaded (e.g., one thread is the signal listener thread). Use info threads in gdb to find the thread that really faulted.

zvrba
  • That was very helpful! I am not used to gdb defaulting to a thread other than the main one. – WilliamKF Feb 21 '11 at 17:42
  • The issue was somehow related to another class needing to be packed and aligned via the compiler attribute: __attribute__ ((aligned(1), packed)) – WilliamKF Feb 21 '11 at 22:09
  • Was that related to something in Botan? If so, they'd better have a *damn good* explanation for using other than default aligning and packing. – zvrba Feb 22 '11 at 07:16
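
The thread does not say which class needed the attribute or why that fixed the crash, so the following is only a sketch of what the attribute changes. It assumes GCC or a compatible compiler, and the struct names are hypothetical. If two pieces of code disagree on which of these layouts a type has, they read and write members at different offsets, which is one plausible way for a neighboring object to be corrupted.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record with the compiler's default layout: padding is
// inserted after `tag` so that `value` is suitably aligned.
struct NaturalRecord {
    char          tag;
    std::uint32_t value;
};

// Same members, but with the GCC attribute quoted in the comment above:
// no padding, so `value` sits directly after `tag`.
struct PackedRecord {
    char          tag;
    std::uint32_t value;
} __attribute__((aligned(1), packed));

// Byte offset of `value` in each layout.
std::size_t natural_offset() { return offsetof(NaturalRecord, value); }
std::size_t packed_offset()  { return offsetof(PackedRecord, value); }
```

Comparing sizeof and the two offsets on both platforms is a quick way to check whether a port has changed a type's layout.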

You could attach gdb proactively and put a breakpoint right before the constructor call that is failing, then single step through.

Ben Voigt
  • I was not clear: I am proactively attaching at main(), but I see that bogus stack trace and am not able to set the spin variable to false to continue execution in gdb. – WilliamKF Feb 21 '11 at 17:30
  • @William: Why are you using a spin variable to halt execution during debug? That's what breakpoints are for. – Ben Voigt Feb 21 '11 at 17:50
  • @Ben Voigt: So that I can attach to a running process. I don't run from gdb but from the command line and then attach, so I want the code to wait until the attach occurs and I clear the variable. – WilliamKF Feb 21 '11 at 17:54

Based on the new code, you're violating the Rule of Three and this may be causing your problem.

By defining a class that owns a raw pointer without providing a correct copy constructor (or making it inaccessible), you open yourself up to a double free.

Add the copy constructor. Or open an issue on the project's bug tracker. You are using the latest version, right?
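
As a minimal sketch of the second option, making copies impossible, assuming a C++11 compiler: Resource and OwningWrapper are hypothetical names used for illustration and are not Botan classes. In pre-C++11 code the same effect is usually obtained by declaring the copy constructor and operator= private and leaving them undefined.

```cpp
#include <type_traits>

// Hypothetical stand-in for a resource managed through a raw pointer.
struct Resource {
    int value;
};

// Owns a heap-allocated Resource. Copying is disabled, so two wrappers
// can never end up deleting the same pointer.
class OwningWrapper {
public:
    OwningWrapper() : res_(new Resource()) {}   // value-initialized: value == 0
    ~OwningWrapper() { delete res_; }

    OwningWrapper(const OwningWrapper&) = delete;
    OwningWrapper& operator=(const OwningWrapper&) = delete;

    int value() const { return res_->value; }

private:
    Resource* res_;
};

// Compile-time checks that accidental copies are rejected.
static_assert(!std::is_copy_constructible<OwningWrapper>::value,
              "OwningWrapper must not be copy-constructible");
static_assert(!std::is_copy_assignable<OwningWrapper>::value,
              "OwningWrapper must not be copy-assignable");
```

With the copy operations deleted, passing an OwningWrapper by value becomes a compile error instead of a latent double free.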

Ben Voigt
  • While true in general, in this case, the base class has a hidden copy constructor and operator=, so we are not hitting this issue here. – WilliamKF Feb 21 '11 at 20:05