
In the boost asio documentation for strands it says:

Strands may be either implicit or explicit, as illustrated by the following alternative approaches:

  • ...
  • Where there is a single chain of asynchronous operations associated with a connection (e.g. in a half duplex protocol implementation like HTTP) there is no possibility of concurrent execution of the handlers. This is an implicit strand.
  • ...

However, in Boost.Beast's example of a multithreaded asynchronous HTTP server, the boost::asio::ip::tcp::acceptor as well as each boost::asio::ip::tcp::socket gets its own explicit strand (see lines 373 and 425). As far as I can see, this should not be necessary, since each of these objects is only ever accessed in sequentially registered/running CompletionHandlers.¹ More precisely, a new async operation on one of these objects is only ever registered at the end of a CompletionHandler registered on the same object, so every object is used in a single chain of asynchronous operations.²

Thus, I'd assume that - despite multiple threads running concurrently - strands could be omitted altogether in this example, and the io_context could be used to schedule any async operation directly. Is that correct? If not, what synchronization issues am I missing? Am I misunderstanding the statement in the documentation quoted above?


¹: Of course, two sockets, or a socket and the acceptor, may be worked with concurrently, but due to the use of multiple strands this is not prevented in the example either.

²: Admittedly, the CompletionHandler registered at the end of the current CompletionHandler may be started on another thread before the current handler actually finishes, i.e. returns. But I would assume that this circumstance does not risk synchronization problems. Correct me if I am wrong.

Reizo

1 Answer


If the async chain of operations forms a logical strand, then you often don't need explicit strands.

Also, if the execution context is only ever run/polled from a single thread, then all async operations will effectively be on that implicit strand.

The examples serve more than one purpose.

  • On the one hand, they're obviously kept simple. Naturally, there will be a minimal number of threads and simplistic chains of operations.

  • However, that leads to over-simplified examples that have too little relation to real life.

  • Therefore, even if it's not absolutely required, the samples often show good practice or advanced patterns. Sometimes (often, in my experience) this is even explicitly commented. E.g. in the very example you linked, at L277:

     // We need to be executing within a strand to perform async operations
     // on the I/O objects in this session. Although not strictly necessary
     // for single-threaded contexts, this example code is written to be
     // thread-safe by default.
     net::dispatch(stream_.get_executor(),
                   beast::bind_front_handler(
                       &session::do_read,
                       shared_from_this()));
    

Motivational example

This allows people to solve their next non-trivial task. For example, imagine you wanted to add stop() to the listener class from the linked example. There's no way to do that safely without a strand. You would need to "inject" a call to acceptor_.cancel() into the logical "strand", i.e. the chain of async operations containing async_accept. But you can't, because async_accept is "logically blocking" that chain. So you actually do need to post to an explicit strand:

void stop() {
  post(acceptor_.get_executor(), [this] { acceptor_.cancel(); });
}
sehe
  • Added motivational example of why e.g. the acceptor strand becomes *crucial* the minute people want to extend the sample with the simplest of features. – sehe Mar 13 '22 at 18:39
  • Thanks that's making things very clear. I just wonder about your example: What do you mean by "logically blocking"? I assume you mean if I would try to `cancel()` from outside the chain of async handlers (inside seems unproblematic to me), it may always interfere with the `acceptor`-usage from within the chain of handlers. So the chain of handlers only "block" the acceptor in an implicit(/logical ;)) way (since it may not be used concurrently from somewhere else - in this case the `close()`-call). – Reizo Mar 13 '22 at 19:35
  • By the way, do the same principles hold for any shared state across handlers and threads? I.e. if I used an `int` in multiple handlers and threads, I would not need any synchronization as long as the usage within a handler always happens before that handler schedules its *single* next async operation? And if I *did* use that `int` *after* scheduling the single next async operation, I'd then need a `strand`, even though there is only somewhat of a single chain of handlers, since the next handler may actually access that `int` concurrently from another thread as soon as it is scheduled. – Reizo Mar 13 '22 at 19:41
  • @Reizo calling `cancel()` from outside the chain of async handlers can **only** be done safely (without a data race) when you synchronize access to the IO objects involved (i.e. `acceptor_`). Of course there are other ways to achieve it than using strands, but strands are *the typical* route in Asio (and also completely zero-overhead if you run in a single-threaded context) – sehe Mar 13 '22 at 19:43
  • And yes, I said _"logically blocking"_ (with the quotes) just to avoid confusion that `async_accept` would block. If you look at the async operation chain, though, logically that is blocked in the accept. Incidentally, if you write this in coroutine style, you will see it exactly that way in the code https://github.com/boostorg/beast/blob/master/example/http/server/coro/http_server_coro.cpp#L333 – sehe Mar 13 '22 at 19:46
  • (I don't see a lot of conceptual difference between sharing of the `int` vs. sharing of the IO object itself. Both are subject to the exact same ramifications: they're non-threadsafe instances being shared.) – sehe Mar 13 '22 at 19:49
  • Okay, so to go from the `int` back to the `acceptor` (I want to ask about this point again to be exactly certain): If I called `acceptor.cancel()` just after `acceptor.async_accept(...)` in my handler (regardless of how reasonable that would be), it would cause a data race, since the *just scheduled* handler may already be started on another thread, itself calling `acceptor.async_accept(...)` concurrently to the not-yet-finished *current* handler calling `acceptor.cancel()`. A call to `cancel()` *before* the `async_accept` call (again disregarding rationality) would be alright. Is that true? – Reizo Mar 13 '22 at 20:14
  • Unless both the initiation and the completion are on the strand, because that would mean the completion cannot occur before the cancel has already been invoked. Without the strand, yes, there's a data race in the first scenario – sehe Mar 13 '22 at 20:38