5

Everything I've read in the Boost ASIO docs and here on StackOverflow suggests I can stop an async_accept operation by calling close on the acceptor socket. However, I get an intermittent not_socket error in the async_accept handler when I try to do this. Am I doing something wrong or does Boost ASIO not support this?

(Related questions: here and here.)

(Note: I'm running on Windows 7 and using the Visual Studio 2015 compiler.)

The core problem I face is a race condition between the async_accept operation accepting an incoming connection and my call to close. This happens even when using a strand, explicit or implicit.

Note my call to async_accept strictly happens before my call to close. I conclude the race condition is between my call to close and the under-the-hood code in Boost ASIO that accepts the incoming connection.

I've included code demonstrating the problem. The program repeatedly creates an acceptor, connects to it, and immediately closes the acceptor. It expects the async_accept operation to either complete successfully or else be canceled. Any other error causes the program to abort, which is what I'm seeing intermittently.

For synchronization the program uses an explicit strand. Nevertheless, the call to close is unsynchronized with the effect of the async_accept operation, so sometimes the acceptor closes before it accepts the incoming connection, sometimes it closes afterward, sometimes neither—hence the problem.

Here's the code:

#include <algorithm>
#include <boost/asio.hpp>
#include <cstdlib>
#include <future>
#include <iostream>
#include <memory>
#include <thread>

int main()
{
  boost::asio::io_service ios;
  auto work = std::make_unique<boost::asio::io_service::work>(ios);

  const auto ios_runner = [&ios]()
  {
    boost::system::error_code ec;
    ios.run(ec);
    if (ec)
    {
      std::cerr << "io_service runner failed: " << ec.message() << '\n';
      abort();
    }
  };

  auto thread = std::thread{ios_runner};

  const auto make_acceptor = [&ios]()
  {
    boost::asio::ip::tcp::resolver resolver{ios};
    boost::asio::ip::tcp::resolver::query query{
      "localhost",
      "",
      boost::asio::ip::resolver_query_base::passive |
      boost::asio::ip::resolver_query_base::address_configured};
    const auto itr = std::find_if(
      resolver.resolve(query),
      boost::asio::ip::tcp::resolver::iterator{},
      [](const boost::asio::ip::tcp::endpoint& ep) { return true; });
    assert(itr != boost::asio::ip::tcp::resolver::iterator{});
    return boost::asio::ip::tcp::acceptor{ios, *itr};
  };

  for (auto i = 0; i < 1000; ++i)
  {
    auto acceptor = make_acceptor();
    const auto saddr = acceptor.local_endpoint();

    boost::asio::io_service::strand strand{ios};
    boost::asio::ip::tcp::socket server_conn{ios};

    // Start accepting.
    std::promise<void> accept_promise;
    strand.post(
      [&]()
    {
      acceptor.async_accept(
        server_conn,
        strand.wrap(
          [&](const boost::system::error_code& ec)
          {
            accept_promise.set_value();
            if (ec.category() == boost::asio::error::get_system_category()
              && ec.value() == boost::asio::error::operation_aborted)
              return;
            if (ec)
            {
              std::cerr << "async_accept failed (" << i << "): " << ec.message() << '\n';
              abort();
            }
          }));
    });

    // Connect to the acceptor.
    std::promise<void> connect_promise;
    strand.post(
      [&]()
    {
      boost::asio::ip::tcp::socket client_conn{ios};
      {
        boost::system::error_code ec;
        client_conn.connect(saddr, ec);
        if (ec)
        {
          std::cerr << "connect failed: " << ec.message() << '\n';
          abort();
        }
        connect_promise.set_value();
      }
    });
    connect_promise.get_future().get();   // wait for connect to finish

    // Close the acceptor.
    std::promise<void> stop_promise;
    strand.post([&acceptor, &stop_promise]()
    {
      acceptor.close();
      stop_promise.set_value();
    });
    stop_promise.get_future().get();   // wait for close to finish
    accept_promise.get_future().get(); // wait for async_accept to finish
  }

  work.reset();
  thread.join();
}

Here's the output from a sample run:

async_accept failed (5): An operation was attempted on something that is not a socket

The number in parentheses denotes how many successfully iterations the program ran.

UPDATE #1: Based on Tanner Sansbury's answer, I've added a std::promise for signaling the completion of the async_accept handler. This has no effect on the behavior I'm seeing.

UPDATE #2: The not_socket error originates from a call to setsockopt, from call_setsockopt, from socket_ops::setsockopt in the file boost\asio\detail\impl\socket_ops.ipp (Boost version 1.59). Here's the full call:

socket_ops::setsockopt(new_socket, state,
  SOL_SOCKET, SO_UPDATE_ACCEPT_CONTEXT,
  &update_ctx_param, sizeof(SOCKET), ec);

Microsoft's documentation for setsockopt says about SO_UPDATE_ACCEPT_CONTEXT:

Updates the accepting socket with the context of the listening socket.

I'm not sure what exactly this means, but it sounds like something that fails if the listening socket is closed. This suggests that, on Windows, one cannot safely close an acceptor that is currently running a completion handler for an async_accept operation.

I hope someone can tell me I'm wrong and that there is a way to safely close a busy acceptor.

Community
  • 1
  • 1
Craig M. Brandenburg
  • 3,354
  • 5
  • 25
  • 37

1 Answers1

4

The example program will not cancel the async_accept operation. Once the connection has been established, the async_accept operation will be posted internally for completion. At this point, the operation is no longer cancelable and is will not be affected by acceptor.close().

The issue being observed is the result of undefined behavior. The program fails to meet a lifetime requirement for async_accept's peer parameter:

The socket into which the new connection will be accepted. Ownership of the peer object is retained by the caller, which must guarantee that it is valid until the handler is called.

In particular, the peer socket, server_conn, has automatic scope within the for loop. The loop may begin a new iteration while the async_accept operation is outstanding, causing server_conn to be destroyed and violate the lifetime requirement. Consider extending server_conn's lifetime by either:

  • set a std::future within the accept handler and wait on the related std::promise before continuing to the next iteration of the loop
  • managing server_conn via a smart pointer and passing ownership to the accept handler
Tanner Sansbury
  • 51,153
  • 9
  • 112
  • 169
  • I've corrected my program to use a `promise` to synchronize the `async_accept` handler to happen before the end of the `for` loop. However, I still see the same “not a socket” error happening. – Craig M. Brandenburg Oct 16 '15 at 17:33
  • I should clarify that I understand and am OK with the inherent race between the `async_accept` operation finishing successfully and my attempt to close the acceptor. What I want is for the `async_accept` to either (1) finish successfully or else (2) be canceled. By the way, thanks for answering my question. I've used many of your other answers here on StackOverflow. – Craig M. Brandenburg Oct 16 '15 at 17:39
  • @CraigM.Brandenburg Glad to be of help. Closing the acceptor doesn't guarantee that outstanding `async_accept` operations will be cancelled. The operation must be in a cancelable state and cancellation must be supported by the underlying system calls. The `async_accept` operation uses [`AccceptEx()`](https://msdn.microsoft.com/en-us/library/windows/desktop/ms737524(v=vs.85).aspx), which does not populate the peer socket with all connection details. Asio is attempting to set connection details on socket whose connection has been terminated when completing the async_accept operation. – Tanner Sansbury Oct 16 '15 at 20:27
  • @CraigM.Brandenburg As long as the thread safety requirement and object lifetime guarantees are met, then one can safely close the acceptor. The `async_accept` operation completing with an error does not imply that the acceptor was closed in an unsafe manner. – Tanner Sansbury Oct 16 '15 at 20:42
  • Can the `setsockopt` call for `SO_UPDATE_ACCEPT_CONTEXT` (see update #2 in my question, above) race against another thread in the same process creating a socket with the same socket number as the recently closed socket's number? [E.g., (step 1) thread A closes acceptor socket 42; (step 2) thread B creates a new socket, reusing socket number 42; (step 3) thread A calls `setsockopt` on socket 42.] If so, what happens? Would the system populate the peer socket with the connection details of the newly created socket 42? This scenario may be unlikely, but I want to exclude _all_ races. – Craig M. Brandenburg Oct 16 '15 at 21:51
  • @CraigM.Brandenburg As far as I know, it will not have a data race. Regardless, the kernel would be responsible for returning the appropriate status if the call succeeds or fails. Based on the `AcceptEx()` documentation, I would expect `setsockopt` to fail, as the newly created socket would not be in the expected default state for a connected socket, and if it does succeed, then the `bind` will fail. Also, the connection has been terminated, so operations on the socket would fail. – Tanner Sansbury Oct 17 '15 at 22:02