Should I clean up beast::flat_buffer when I see errors on on_read?

Question

class session : public std::enable_shared_from_this<session>
{
    ...
    beast::flat_buffer buffer_; // (Must persist between reads)
    http::response<http::string_body> res_;
    ...
};

void on_write(beast::error_code ec, std::size_t bytes_transferred) {

  if (ec)
  {
    fail(ec, "write");
    return try_again();    
  }    

  // Receive the HTTP response
  http::async_read(
      stream_, buffer_, res_,
      beast::bind_front_handler(&session::on_read, shared_from_this()));
}

void on_read(beast::error_code ec, std::size_t bytes_transferred) {

  if (ec)
  {
    fail(ec, "read");
    return try_again();  
  }    

  // Step 1: process response
  //
  // http::string_body stores the payload in a std::string
  const std::string &body_data = res_.body();
  user_parse_data(body_data.begin(), body_data.end());
  
  // Step 2: clean up buffer_
  //
  buffer_.consume(buffer_.size()); // clean up buffer_ after finishing reading it

  // Step 3: continue to write
  ...
}

In the above implementation, I ONLY clean up the buffer_ when I finish parsing the data successfully.

Question> Should I also clean up buffer_ when on_read reports an error?

void on_read(beast::error_code ec, std::size_t bytes_transferred) {

  if (ec)
  {
    // clean up buffer_
    buffer_.consume(buffer_.size()); // Should we do the cleanup here too?
    
    fail(ec, "read");
    return try_again();    
  }    

  // Step 1: process response
  //
  // http::string_body stores the payload in a std::string
  const std::string &body_data = res_.body();
  user_parse_data(body_data.begin(), body_data.end());
  
  // Step 2: clean up buffer_
  //
  buffer_.consume(buffer_.size());

  // Step 3: continue to write
  ...
}

sehe · Accepted Answer · 2022-06-14T21:40:29.210

// Should we do the cleanup here too?

That's asking the wrong question entirely.

One obvious question that comes first is "should we clean up the read buffer at all?".

And the more important question is: what do you do with the connection?

The buffer belongs to the connection, as it represents stream data.

The example you link always closes the connection. So the buffer is irrelevant after receiving the response, since the connection itself becomes irrelevant. Note that the linked example doesn't call consume on the buffer either.

Should You Clean Up At All?

You should not clean up after http::read!

The reason is that http::read already consumes any data that was parsed as part of the response message.

Even if you expect to read more messages from the same connection (e.g. HTTP pipelining), you need to start the next http::read with the same buffer since it might already contain (partial) data for the subsequent message.
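
To illustrate why those leftover bytes matter, here is a stdlib-only toy model, not Beast's implementation. In it, std::string stands in for beast::flat_buffer, and the hypothetical read_one_body plays the role of http::read for Content-Length framed messages only. It removes exactly the bytes of one message and leaves whatever follows in the buffer.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Toy stand-in for a message parser (an assumption for illustration only).
// Extracts one Content-Length framed message from the front of `buffer`
// and erases exactly the bytes that belonged to that message.
std::optional<std::string> read_one_body(std::string& buffer)
{
    static const std::string header_end = "\r\n\r\n";
    static const std::string length_key = "Content-Length: ";

    const std::size_t head = buffer.find(header_end);
    if (head == std::string::npos)
        return std::nullopt;                 // header incomplete: keep buffering

    const std::size_t key = buffer.find(length_key);
    if (key == std::string::npos || key > head)
        return std::nullopt;                 // this toy only handles Content-Length

    const std::size_t body_len =
        std::stoul(buffer.substr(key + length_key.size()));
    const std::size_t body_begin = head + header_end.size();
    if (buffer.size() < body_begin + body_len)
        return std::nullopt;                 // body incomplete: keep buffering

    std::string body = buffer.substr(body_begin, body_len);
    buffer.erase(0, body_begin + body_len);  // consume only this message
    return body;
}
```

If the buffer holds one full response followed by the first bytes of the next, the first call returns the first body and leaves the partial second response in place. Emptying the buffer at that point would throw away the start of the second message.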

What About Errors?

If you have an IO/parse error during HTTP transmissions, I expect in most circumstances the HTTP protocol specification will require you to shut down the connection.

There is no "try_again" in HTTP/1. Once you've lost the thread on stream contents, there is no way you can recover to a "known state".

Regardless, I'd always recommend shutting down failed HTTP sessions, because not doing so opens the door to corrupted messages and security vulnerabilities.
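
As a rough sketch of that recommendation, the model below replaces a failed session with a new one instead of retrying on the old stream. It is a simplification under stated assumptions: std::string stands in for beast::flat_buffer, the socket is omitted, and the names failed_ and renew_if_failed are hypothetical. Only the ownership structure is the point: the buffer lives inside the session, so a new session starts with an empty buffer.

```cpp
#include <memory>
#include <string>

// Hypothetical model of a session; std::string stands in for
// beast::flat_buffer and the actual stream is left out.
struct session
{
    std::string buffer_;   // stream data; belongs to this connection only
    bool failed_ = false;

    void on_read(bool error)
    {
        if (error)
            failed_ = true; // do not reuse this stream or its buffer
    }
};

// Hypothetical helper: hand back a fresh session when the current one
// has failed, otherwise keep using the current one.
std::shared_ptr<session> renew_if_failed(std::shared_ptr<session> current)
{
    if (current->failed_)
        return std::make_shared<session>(); // fresh buffer by construction
    return current;
}
```

With this layout there is no separate "clean the buffer on error" step to forget: discarding the failed session discards its stale bytes with it.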

edited Jun 14 '22 at 21:40

answered Jun 14 '22 at 21:34

sehe

My application is written based on the linked example. The basic workflow is to write a query, read a response, and keep repeating that. The reason I call buffer_.consume(..) is that it was called in this example https://github.com/vinniefalco/CppCon2018/blob/master/websocket_session.cpp (i.e. Line 73) and I incorrectly thought I had to do the same here. – q0987 Jun 15 '22 at 02:03

So far I haven't seen any issues even though I should not have called buffer_.consume(..). I guess the main reason is that http::read already consumes all buffered data, so buffer_.consume(...) is in fact a no-op (i.e. buffer_.size() == 0). As a follow-up question: when I restart an HTTP connection, should I clean up all residual data within buffer_ by calling buffer_.consume(buffer_.size())? – q0987 Jun 15 '22 at 02:03

Yeah, the other example is using websockets, where `read` actually reads into the buffer. `http::read` reads into a buffer in order to parse a message, indeed consuming the parsed data. – sehe Jun 15 '22 at 11:37

To the followup, indeed if you start a new connection, you need to start with an empty buffer. That's what I implied with _"The buffer belongs to the connection, as it represents stream data"_ – sehe Jun 15 '22 at 11:38

From https://www.boost.org/doc/libs/1_79_0/libs/beast/doc/html/beast/ref/boost__beast__flat_buffer.html: flat_buffer::clear ("Set the size of the readable and writable bytes to zero") and flat_buffer::consume ("Remove bytes from beginning of the readable bytes"). To start a new connection, it seems that flat_buffer::clear is better than flat_buffer::consume, which ONLY removes readable bytes. – q0987 Jun 15 '22 at 13:15

I usually organize my code so that the buffer lives with the connection. Starting a new connection automatically gets you a fresh buffer, because they're part of the same object. Mutable state is the root of almost all software bugs. – sehe Jun 15 '22 at 13:22
