Using the following boost::asio code, I run a loop of 1M sequential HTTP calls against a simple node.js HTTP service (running in Docker) that generates random numbers. After a few thousand calls I start getting async_connect errors. The node.js side produces no errors, and I believe it works correctly.

To avoid resolving the host on every call and to speed things up, I cache the endpoint. I have tested both with and without caching, and it makes no difference.

Can anyone see what is wrong with my code below? Are there any best practices for a stress-test tool using asio that I am missing?

//------------------------------------------------------------------------------
// https://www.boost.org/doc/libs/1_70_0/libs/beast/doc/html/beast/using_io/timeouts.html

HttpResponse HttpClientAsyncBase::_http(HttpRequest&& req)
{
    using namespace boost::beast;
    namespace net = boost::asio;
    using tcp = net::ip::tcp;

    HttpResponse res;
    req.prepare_payload();
    boost::beast::error_code ec = {};

    const HOST_INFO host = resolve(req.host(), req.port, req.resolve);

    net::io_context m_io;

    boost::asio::spawn(m_io, [&](boost::asio::yield_context yield)
    {
        size_t retries = 0;

        tcp_stream stream(m_io);
        
        if (req.timeout_seconds == 0) get_lowest_layer(stream).expires_never();
        else get_lowest_layer(stream).expires_after(std::chrono::seconds(req.timeout_seconds));
        
        get_lowest_layer(stream).async_connect(host, yield[ec]);
        if (ec) return;

        http::async_write(stream, req, yield[ec]);
        if (ec)
        {
            stream.close();
            return;
        }

        flat_buffer buffer;
        http::async_read(stream, buffer, res, yield[ec]);

        stream.close();
    });

    m_io.run();

    if (ec)
        throw boost::system::system_error(ec);

    return std::move(res);
}

I have tried both sync/async implementations of a boost http client and I get the exact same problem.

The error I get is "You were not connected because a duplicate name exists on the network. If joining a domain, go to System in Control Panel to change the computer name and try again. If joining a workgroup, choose another workgroup name [system:52]"

Alan Birtles
Elias
  • Why do you believe the node server isn't the problem? Have you tried monitoring the traffic to see where the problem lies? Have you checked the error code when connect fails? – Alan Birtles Oct 26 '22 at 20:46
  • Thank you, the error is "Exception: You were not connected because a duplicate name exists on the network. If joining a domain, go to System in Control Panel to change the computer name and try again. If joining a workgroup, choose another workgroup name [system:52]," – Elias Oct 26 '22 at 21:00
  • I'd guess you've run out of ports; there are a fixed number available for use. – Alan Birtles Oct 26 '22 at 21:24
  • Alan, thank you for taking the time to look at my problem. Do you mean that each time the HTTP client connects to the server it uses a local port, which is not freed when I close the socket? Do you have a suggestion or some pointers on what to look into next? – Elias Oct 26 '22 at 21:29
  • See https://stackoverflow.com/questions/26019164/too-many-time-wait-connections-getting-cannot-assign-requested-address to see if this is what is happening – Tony Lee Oct 26 '22 at 21:35
  • Please don't edit the solution into your question. If you've solved your own problem then write that in an answer – Alan Birtles Oct 27 '22 at 07:12

3 Answers

So, I decided to... just try. I made your code into a self-contained example:

#include <boost/asio/spawn.hpp>
#include <boost/beast.hpp>
#include <fmt/ranges.h>
#include <cassert> // for assert() in resolve()
#include <iostream>
namespace http = boost::beast::http;
//------------------------------------------------------------------------------
// https://www.boost.org/doc/libs/1_70_0/libs/beast/doc/html/beast/using_io/timeouts.html
struct HttpRequest : http::request<http::string_body> { // SEHE: don't do this
    using base_type = http::request<http::string_body>;
    using base_type::base_type;

    std::string host() const { return "127.0.0.1"; }
    uint16_t    port    = 80;
    bool        resolve = true;

    int timeout_seconds = 0;
};
using HttpResponse = http::response<http::vector_body<uint8_t> >; // Do this or aggregation instead

struct HttpClientAsyncBase {
    HttpResponse _http(HttpRequest&& req);

    using HOST_INFO = boost::asio::ip::tcp::endpoint;
    static HOST_INFO resolve(std::string const& host, uint16_t port, bool resolve) {
        namespace net = boost::asio;
        using net::ip::tcp;

        net::io_context ioc;
        tcp::resolver   r(ioc);
        using flags = tcp::resolver::query::flags;

        // When not resolving, treat both host and port as numeric literals
        auto f = resolve ? flags::address_configured
                         : static_cast<flags>(flags::numeric_host | flags::numeric_service);

        tcp::resolver::query q(tcp::v4(), host, std::to_string(port), f);

        auto it = r.resolve(q);
        assert(it.size());
        return HOST_INFO{it->endpoint()};
    }
};

HttpResponse HttpClientAsyncBase::_http(HttpRequest&& req) {
    using namespace boost::beast;
    namespace net = boost::asio;
    using net::ip::tcp;

    HttpResponse res;
    req.prepare_payload();
    boost::beast::error_code ec = {};

    const HOST_INFO host = resolve(req.host(), req.port, req.resolve);

    net::io_context m_io;

    spawn(m_io, [&](net::yield_context yield) {
        // size_t retries = 0;

        tcp_stream stream(m_io);

        if (req.timeout_seconds == 0)
            get_lowest_layer(stream).expires_never();
        else
            get_lowest_layer(stream).expires_after(std::chrono::seconds(req.timeout_seconds));

        get_lowest_layer(stream).async_connect(host, yield[ec]);
        if (ec)
            return;

        http::async_write(stream, req, yield[ec]);
        if (ec) {
            stream.close();
            return;
        }

        flat_buffer buffer;
        http::async_read(stream, buffer, res, yield[ec]);

        stream.close();
    });

    m_io.run();

    if (ec)
        throw boost::system::system_error(ec);

    return res;
}

int main() {
    for (int i = 0; i<100'000; ++i) {
        HttpClientAsyncBase hcab;
        HttpRequest         r(http::verb::get, "/bytes/10", 11);
        r.timeout_seconds = 0;
        r.port            = 80;
        r.resolve         = false;

        auto res = hcab._http(std::move(r));
        std::cout << res.base() << "\n";
        fmt::print("Data: {::02x}\n", res.body());
    }
}

(Side note, this is using docker run -p 80:80 kennethreitz/httpbin to run the server side)

While this is about 10x faster than running curl to do the equivalent requests in a bash loop, none of this is particularly stressing. There's nothing async about it, and it seems resource usage is mild and stable, e.g. memory profiled:

[Image: memory profile of the client run]

(for completeness I verified identical results with timeout_seconds = 1)

Since what you're doing is literally the opposite of async IO, I'd write it much simpler:

struct HttpClientAsyncBase {
    net::io_context m_io;

    HttpResponse _http(HttpRequest&& req);

    static auto resolve(std::string const& host, uint16_t port, bool resolve);
};

HttpResponse HttpClientAsyncBase::_http(HttpRequest&& req) {
    HttpResponse res;
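    // Assumes HttpRequest now aggregates the Beast request as a member named
    // requestObject (instead of inheriting from it), and that net and beast
    // are namespace aliases for boost::asio and boost::beast.
    req.requestObject.prepare_payload();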

    const auto host = resolve(req.host(), req.port, req.resolve);

    beast::tcp_stream stream(m_io);

    if (req.timeout_seconds == 0)
        stream.expires_never();
    else
        stream.expires_after(std::chrono::seconds(req.timeout_seconds));

    stream.connect(host);

    write(stream, req.requestObject);

    beast::flat_buffer buffer;
    read(stream, buffer, res);

    stream.close();

    return res;
}

That's just simpler, runs faster and does the same, down to the exceptions.

But if you're actually trying to generate load, perhaps you need to reuse connections and use multiple threads instead?

You can see a very complete example of just that here: How do I make this HTTPS connection persistent in Beast?

It includes reconnecting dropped connections, connections to different hosts, varied requests etc.

sehe

Alan's comments pointed me in the right direction. Using netstat -a, I soon found that it was a port exhaustion problem: thousands of ports were in the TIME_WAIT state after running the code for only a short time.

The root cause was on both the client and the server:

  1. In node.js server I had to make sure that responses close the connection by adding

    response.setHeader("connection", "close");

  2. In boost::asio C++ code I replaced stream.close() with

    stream.socket().shutdown(boost::asio::ip::tcp::socket::shutdown_both, ec);

    That seems to make all the difference. I also made sure to set the following in my requests:

    req.set(boost::beast::http::field::connection, "close");

I verified this by running the tool for over 5 hours with no problems at all, so I guess the problem is solved!

Elias
  • I appreciate you sharing the analysis. Especially the explicit finding that `shutdown` does improve the behaviour. Are you using SSL in your real code, by any chance? – sehe Oct 27 '22 at 11:12
  • Yes, I am. It needs async_shutdown on the ssl_stream. – Elias Oct 27 '22 at 11:30
  • That's a big difference, TLS shutdown does have bigger resource implications. Perhaps you should mention it in your answer. – sehe Oct 27 '22 at 11:46
Implementing 'Abortive TCP/IP Close' with boost::asio to treat EADDRNOTAVAIL and TIME_WAIT for HTTP client stress test tool

I am revisiting the issue to offer an alternative that actually worked much better. As a reminder, the objective was to develop a stress test tool for hitting a server with 1M requests. Even though my previous solution worked on Windows, when I ran the executable on Docker/Alpine it started crashing with SEGFAULT errors that I was unable to trace. The root cause seems to be related to boost::asio::spawn(m_io, [&](boost::asio::yield_context yield), but under time pressure I focused on solving the HTTP problem instead.

I decided to use synchronous HTTP and treat EADDRNOTAVAIL and TIME_WAIT errors by following the suggestions from Disable TIME_WAIT with boost sockets and TIME_WAIT with boost asio, together with the template code from https://www.boost.org/doc/libs/1_80_0/libs/beast/example/http/client/sync/http_client_sync.cpp.

For anyone hitting EADDRNOTAVAIL and TIME_WAIT with boost::asio, the solution that worked for me, and that is actually much faster than before on Windows, Linux and Docker, is the following:

HttpResponse HttpClientSyncBase::_http(HttpRequest&& req)
{
    namespace beast = boost::beast;
    namespace http = beast::http;
    namespace net = boost::asio;
    using tcp = net::ip::tcp;

    HttpResponse res;
    req.prepare_payload();

    const auto host = req.host();
    const auto port = req.port;
    const auto target = req.target();
    const bool abortive_close = boost::iequals(req.header("Connection"), "close");
    const bool download_large_file = boost::iequals(req.header("X-LARGE-FILE-HINT"), "YES"); 

    beast::error_code ec;
    net::io_context ioc;

    // Resolve host:port for IPv4
    tcp::resolver resolver(ioc);
    const auto endpoints = resolver.resolve(boost::asio::ip::tcp::v4(), host, port);

    // Create stream and set timeouts
    beast::tcp_stream stream(ioc);  
    if (req.timeout_seconds == 0) boost::beast::get_lowest_layer(stream).expires_never();
    else boost::beast::get_lowest_layer(stream).expires_after(std::chrono::seconds(req.timeout_seconds));

    // Caution: connect can fail with address_not_available [EADDRNOTAVAIL] due to
    // TIME_WAIT port exhaustion. Any connect error is fatal for this request, so
    // throw on all of them instead of continuing with an unconnected stream.
    stream.connect(endpoints, ec);
    if (ec)
        throw beast::system_error{ ec };

    // Write HTTP request
    http::write(stream, req);

    // Read HTTP response (or download large file >8MB) 
    beast::flat_buffer buffer;
    if (download_large_file)
    {       
        _HttpResponse tmp;
        boost::beast::http::response_parser<boost::beast::http::string_body> parser{ std::move(tmp) };      
        parser.body_limit(boost::none);             
        boost::beast::http::read(stream, buffer, parser);       
        res = HttpResponse(std::move(parser.release()));
    }
    else
    {       
        http::read(stream, buffer, res);
    }

    // Try to shut down socket gracefully   
    stream.socket().shutdown(tcp::socket::shutdown_both, ec);

    if (abortive_close)
    {
        // Read until no more data are in socket buffers
        // https://stackoverflow.com/questions/58983527/disable-time-wait-with-boost-sockets
        try
        {
            http::response<http::dynamic_body> res;
            beast::flat_buffer buffer;
            http::read(stream, buffer, res);
        }
        catch (...)
        {
            // should get end of stream here, ignore it
        }

        // Perform "Abortive TCP/IP Close" to minimize TIME_WAIT port exhaustion
        // https://stackoverflow.com/questions/35006324/time-wait-with-boost-asio   
        try
        {
            // enable linger with timeout 0 to force abortive close
            boost::asio::socket_base::linger option(true, 0);
            stream.socket().set_option(option);
            stream.close();
        }
        catch (...)
        {
        }
    }
    else
    {
        try { stream.close(); } catch (...) {}      
    }

    // ec now holds the result of shutdown(): ignore not_connected and end_of_stream, handle the rest
    if (ec && ec != beast::errc::not_connected && ec != beast::http::error::end_of_stream)
        throw beast::system_error{ ec };

    return res; // plain return allows copy elision; std::move would prevent it
}

The sample above still lacks error handling for the write, but that is straightforward to add. _HttpResponse is defined as follows and is the base for HttpResponse.

using _HttpRequest = boost::beast::http::message<true, boost::beast::http::string_body, boost::beast::http::fields>;
using _HttpResponse = boost::beast::http::message<false, boost::beast::http::string_body, boost::beast::http::fields>;
using HttpHeaders = boost::beast::http::header<1, boost::beast::http::basic_fields<std::allocator<char>>>;
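The connect step above simply throws on address_not_available. One alternative, which is not part of my tool and is shown only as a hedged sketch, is to back off and retry, giving ports time to leave TIME_WAIT. The helper name connect_with_backoff, the doubling delay and the use of std::error_code are all choices made for this illustration.

```cpp
#include <chrono>
#include <system_error>
#include <thread>

// Calls `connect_once` (any callable returning std::error_code) until it
// succeeds, fails with something other than address_not_available, or
// `max_attempts` (expected to be >= 1) attempts have been made.
// The delay doubles after each failed attempt.
template <class ConnectFn>
std::error_code connect_with_backoff(ConnectFn connect_once, int max_attempts,
                                     std::chrono::milliseconds initial_delay) {
    std::error_code ec;
    auto delay = initial_delay;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        ec = connect_once();
        if (ec != std::errc::address_not_available)
            return ec; // success, or an error that waiting will not fix
        if (attempt < max_attempts) {
            std::this_thread::sleep_for(delay);
            delay *= 2;
        }
    }
    return ec; // still address_not_available after all attempts
}
```

To use it with the code above, the callable would wrap stream.connect(endpoints, ec) and return the resulting error. How a Boost error code maps onto std::error_code and std::errc depends on the Boost version and platform, so that part should be verified before relying on it.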

For what it's worth, when I started, the estimate for the job was 5-7 days. Using connection=close in my previous solution, it got down to 7-8 hours. Using Abortive TCP/IP Close, I got down to 1.5 hours.

Funny thing is, the server, also boost::asio, could handle the stress while the original stress tool didn't. Finally both the server and its stress test tool work just fine! The code also demonstrates how to download a large file (over 8MB) which was another side-problem, as I needed to download the test results from the server.

Elias