4

Some Context
I'm writing a transparent/intercepting, HTTPS capable proxy in C++ using openSSL. I'm redirecting traffic through my proxy using WinDivert. For my SSL initialization, my HTTPSAcceptor generates a temporary EC_KEY for the entire server context for the handshake operation. I keep an in-memory "store" (Not an actual X509_STORE object) where I spoof and store certificates using host/SAN DNS entries as the lookup keys. As a side note, this is the first time I've ever worked with openSSL so please correct and pardon any ignorance in my approach. :) Also pardon the excessive abuse of cout for puking debugging/errors, these will later be wrapped into a logger.

The Meat
Anyway so with that, when I get an incoming HTTPS connection, I either retrieve or spoof then retrieve the real upstream certificate. When I generate these certs, I'm using EC Keys. Les code:

EC_KEY *ecdh = NULL;

if ((ecdh = EC_KEY_new_by_curve_name(NID_X9_62_prime256v1)) == NULL || EC_KEY_generate_key(ecdh) != 1)
{
    std::cout << "In CertStore::GenerateEcKey() - Failed to generate EC_KEY" << std::endl;
}
else
{
    EC_KEY_set_asn1_flag(ecdh, OPENSSL_EC_NAMED_CURVE);

    EVP_PKEY* pkey = NULL;

    pkey = EVP_PKEY_new();

    if (pkey == nullptr)
    {
        std::cout << "In CertStore::GenerateEcKey() - Failed to generate EVP_PKEY" << std::endl;
    }
    else
    {
        if (1 != EVP_PKEY_set1_EC_KEY(pkey, ecdh))
        {
            std::cout << "In CertStore::GenerateEcKey() - Failed EVP_PKEY_set1_EC_KEY" << std::endl;
            EVP_PKEY_free(pkey);
            return nullptr;
        }else{

            EC_KEY_up_ref(ecdh);
            return pkey;
        }
    }
}

Upon successfully fetching a spoofed cert & associated key, I then tell my SSL* object to use these for the handshake, obviously.

if (SSL_use_PrivateKey(m_secureDownstreamSocket->native_handle(), upKey) <= 0)
{
    std::cout << "set private key failed" << std::endl;
    Kill();
    return;
}

if (SSL_use_certificate(m_secureDownstreamSocket->native_handle(), upCert) <= 0)
{
    std::cout << "set use cert failed" << std::endl;
    Kill();
    return;
}

m_secureDownstreamSocket->async_handshake(SslSocket::server, m_strand.wrap(boost::bind(&HttpsBridge::OnDownstreamHandshake, shared_from_this(), boost::asio::placeholders::error)));

However, this causes seems to be the origin of my application dying a terrible death. I used to generate a new CTX on each HTTPS connection (both client and server), but after doing some reading in the docs and a few SO posts, I was led to believe that the correct way is to use a global context for creating SSL objects. Anyway the point of mentioning that was that the error I'm getting, which I'll present shortly, used to occur very rarely when I was being dumb and creating a ton of CTX's. Since changing to two global CTX's (client, server), this error now occurs very quickly, but still at random points.

The error, is that somehow, EC_GROUP's from the keys are being double freed. The issue is that I don't even know why they are being freed in the first place. I can find no mention in the docs of any of the SSL_* or SSL_CTX* methods that I use freeing anything. Below is the trace of events from App Verifier, since Eclipse is being useless at debugging this and visual studio debugger simply somehow refuses to work while I'm intercepting and processing local network traffic. Please interwebs, help. :(

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<avrf:logfile xmlns:avrf="Application Verifier">
    <avrf:logSession TimeStarted="2015-04-05 : 23:51:30" PID="812" Version="2">
        <avrf:logEntry Time="2015-04-05 : 23:51:57" LayerName="Heaps" StopCode="0x7" Severity="Error">
            <avrf:message>Heap block already freed.</avrf:message>
            <avrf:parameter1>8411000 - Heap handle for the heap owning the block.</avrf:parameter1>
            <avrf:parameter2>aac49270 - Heap block being freed again.</avrf:parameter2>
            <avrf:parameter3>20 - Size of the heap block.</avrf:parameter3>
            <avrf:parameter4>0 - Not used</avrf:parameter4>
            <avrf:stackTrace>
                <avrf:trace>vrfcore!VerifierDisableVerifier+948 ( @ 0)</avrf:trace>
                <avrf:trace>verifier!VerifierStopMessage+a0 ( @ 0)</avrf:trace>
                <avrf:trace>verifier!VerifierDisableFaultInjectionExclusionRange+318b ( @ 0)</avrf:trace>
                <avrf:trace>verifier!VerifierDisableFaultInjectionExclusionRange+8a6 ( @ 0)</avrf:trace>
                <avrf:trace>verifier!VerifierDisableFaultInjectionExclusionRange+94b ( @ 0)</avrf:trace>
                <avrf:trace>verifier!VerifierCheckPageHeapAllocation+40 ( @ 0)</avrf:trace>
                <avrf:trace>vfbasics!+7ff99e7f3773 ( @ 0)</avrf:trace>
                <avrf:trace>msvcrt!setjmp+123 ( @ 0)</avrf:trace>
                <avrf:trace>vfbasics!+7ff99e7f4606 ( @ 0)</avrf:trace>
                <avrf:trace>LIBEAY32!CRYPTO_free+2b ( @ 0)</avrf:trace>
                <avrf:trace>LIBEAY32!BN_free+29 ( @ 0)</avrf:trace>
                <avrf:trace>LIBEAY32!EC_GROUP_cmp+307 ( @ 0)</avrf:trace>
                <avrf:trace>LIBEAY32!EC_GROUP_free+2c ( @ 0)</avrf:trace>
                <avrf:trace>LIBEAY32!EC_KEY_set_group+2b ( @ 0)</avrf:trace>
                <avrf:trace>LIBEAY32!EC_GF2m_simple_method+180f ( @ 0)</avrf:trace>
                <avrf:trace>SSLEAY32!SSL_use_PrivateKey_ASN1+1a5 ( @ 0)</avrf:trace>
                <avrf:trace>SSLEAY32!SSL_use_certificate+9a ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+407ef6 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+4a2dbf ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+4a2e0a ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+45b6b1 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+45c50e ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+488870 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+461241 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+451908 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+47d3a0 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+451938 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+472739 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+45e9c4 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+474001 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+4a4098 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+465a7d ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+488af1 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+47774c ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+461001 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+451488 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+47ce40 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+4514b8 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+478de7 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+470f8b ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+45e2c7 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+47d3f4 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+451e18 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+464f44 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+451e48 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+4819b1 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+47cc68 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+47f2d2 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+47ecb8 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+45db6c ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+4a2c75 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+45b32c ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+45ce36 ( @ 0)</avrf:trace>
                <avrf:trace>SSITM!+48ce4e ( @ 0)</avrf:trace>
                <avrf:trace>libboost_thread!ZN5boost6detail23get_current_thread_dataEv+729 ( @ 0)</avrf:trace>
                <avrf:trace>msvcrt!strupr+c3 ( @ 0)</avrf:trace>
                <avrf:trace>msvcrt!endthreadex+9d ( @ 0)</avrf:trace>
                <avrf:trace>vfbasics!+7ff99e7fc729 ( @ 0)</avrf:trace>
                <avrf:trace>KERNEL32!BaseThreadInitThunk+22 ( @ 0)</avrf:trace>
                <avrf:trace>ntdll!RtlUserThreadStart+34 ( @ 0)</avrf:trace>
            </avrf:stackTrace>
        </avrf:logEntry>
    </avrf:logSession>
</avrf:logfile>

P.S. - One thing that's strange to me is that the stacktrace shows a call to SSL_use_PrivateKey_ASN1. Not sure why, because I'm only every calling SSL_use_cert and SSL_use_prvkey.Could it be that use_cert is trying to extract the private key from the cert? Just had that thought, I'm going to investigate Update - Nope. It's not possible to add private keys to X509 structures and for good security reasons. Was a desperate, not-thought-out idea.

  • "_the correct way is to use a global context for creating SSL objects_" - where did you gather this from? – Michael Foukarakis Apr 06 '15 at 06:29
  • @MichaelFoukarakis - I'll try and go back and pull my sources. Was a SO answer that posted a link to documentation with a pretty clear quote. However, I questioned if this logic applied to serving as the middle man for multiple dynamic hosts. Like I said please correct me if I'm wrong, I want to be corrected. What I was hoping to accomplish with this route was avoid reloading ca-bundle for upstream verification on every single upstream connection. –  Apr 06 '15 at 06:43
  • @MichaelFoukarakis after you made me question this (thanks), I went back and did some more reading. It now seems that I was completely wrong and I need to create a new context per host that I'm serving on behalf of and store/retrieve these contexts. Am I correct? If so, what about the situation where I have two clients simultaneously connecting to the same host and hence requiring the same context? Is it safe to use the same context from multiple threads? Of course I'll be searching for this information myself, I just thought I'd ask here as well. Thanks again. –  Apr 06 '15 at 06:59
  • 1
    In the context of a proxy, I have used one "default" `SSL_CTX` (may not be essential in your case), and one per server. It makes sense because a context is conceptually grouping connections under the same subset of parameters (e.g. ciphers etc). See [this related question](http://stackoverflow.com/questions/27649641/does-openssl-allow-multiple-ssl-ctx-per-process-one-ssl-ctx-used-for-server-ses). – Michael Foukarakis Apr 06 '15 at 07:03
  • @MichaelFoukarakis thanks for your comments. I believe after you've enlightened me here that the issue is in fact how I'm generating server contexts (or rather failing to) combined with not implementing the locking callbacks. I'm going to do a pretty extensive refactor and report back. –  Apr 06 '15 at 08:01
  • @MichaelFoukarakis Not creating a context for each host was definitely the problem. I seem to now be encountering some issues still related to locking in my shared client context but that's another issue. Your advice was correct. If you'd like to post an answer I can accept, please do. If not, I'll post one when I can. Thanks again. –  Apr 06 '15 at 14:01
  • I've managed to get away from my heap corruption, but now on both client and server contexts I'm randomly getting "decryption failed or bad record mac". Yay. –  Apr 08 '15 at 12:01
  • 1
    That's a specific enough problem for a new question. ;) – Michael Foukarakis Apr 08 '15 at 12:29
  • @TechnikEmpire I you realize what fixed the problem you might want to provide the answer in a few lines of text for the benefit of googlers/future users (the double free + context is quite search engine friendly) – sehe Apr 08 '15 at 20:19

2 Answers2

2

I was plagued by this, and I thank you for your follow up answer.

My solution, was instead of deleting the ssl::context instance in the destructor of what created it, I posted the delete to the main io_service:

Something like:

MyThing::~MyThing() {
    boost::asio::ssl::context *c = ssl_context_;
    socket_.get_io_service().post([c](){ delete c; });
}

that seemed to cure it just fine for me. (I use new/delete because I only create it as needed).

I think that there must be a more deterministic way of doing this, maybe shared_ptr but I haven't worked it out yet.

Sam Liddicott
  • 1,265
  • 12
  • 24
  • All the more interesting because it is wrong :-( The idea works, but this expression was faulty. Debug output showed that "nil" was being freed - I got my lambda bindings wrong. I've edited the answer with one that worked, but I'm sure there must be a neater expression. – Sam Liddicott Oct 14 '15 at 13:29
  • Makes me wonder if it's an issue with a lack of synchronization. I solved this issue by maintaining a refcount above 0 until the end of the program lifecycle, but this answer makes me believe I've just avoided the issue, not solved it. –  Oct 14 '15 at 16:30
  • I wonder if this is totally safe, as it might still get deleted concurrently but just in a different worker thread. – Sam Liddicott Oct 21 '15 at 16:07
0

So, Michael Foukarakis helped me question some of my assumptions in the comments which ultimately led to me solving this issue, so full credits to him. Here is where I went wrong in my approach and how I resolved it.

As mentioned in the question, I was originally creating a new SSL_CTX (boost::asio::ssl::context) object for every single proxy connection: one for the downstream, one up. The upstream, being the proxy acting as a client, was having boost::asio::ssl::context::load_verify_file("ca-bundle.crt") on it's context during initialization, which was leading to massive bloat in the application's ram consumption.

The second part of the original problem was that I was creating a new SSL_CTX for each downstream connection, the connection to our client where we're serving a spoofed certificate and handshaking pretending to be the origin server. Each new connection "bridge" is handed a reference to a "CertStore" object on construction that is designed to spoof, store and retrieve certificates and their keypairs, indexed by the host name.

So what we had going on was a central place holding certs and keys, but each individual SSL_CTX object being assigned these certs and keys and then destroyed when the connection closed, which eventually, "randomly" would lead to a double-free somewhere since openSSL uses internal reference counting on (most?) objects. These references would be increased and decreased by SSL_CTX's being created, having SSL_CTX_use_certificate and SSL_CTX_use_PrivateKey being called on them, then destroyed. Eventually two SSL_CTX's dying would be holding a reference to the same cert or keypair with a reference count of 1, leading to a double free of the memory when both contexts finally died.

The solution is to use a single shared context that you keep safe and alive somewhere until application shutdown for your upstream (client) connections. Call boost::asio::ssl::context::load_verify_file("ca-bundle.crt") on that single context once, then have all client SSL-Object spawn from it. As for server contexts, create one server context per-host that you're acting as, set the cert and private key of that context, and then share that context between all downstream SSL* sockets that act on behalf of that host.

  • I'm debugging an apparent double-free in SSL context under boost.asio, but I have but one question regarding this passage `two SSL_CTX's dying would be holding a reference to the same cert or keypair with a reference count of 1` how is it at all possible that TWO shared objects could have a reference count of 1? – Berkus Jan 17 '18 at 18:07