I'm currently writing a very simple web server to learn more about low level socket programming. More specifically, I'm using C++ as my main language and I am trying to encapsulate the low level C system calls inside C++ classes with a more high level API.
I have written a Socket
class that manages a socket file descriptor and handles opening and closing using RAII. This class also exposes the standard socket operations for a connection oriented socket (TCP) such as bind, listen, accept, connect etc.
After reading the man pages for the send and recv system calls I realized that I needed to call these functions inside some form of loop in order to guarantee that all bytes are successfully sent/received.
My API for sending and receiving looks similar to this
void SendBytes(const std::vector<std::uint8_t>& bytes) const;
void SendStr(const std::string& str) const;
std::vector<std::uint8_t> ReceiveBytes() const;
std::string ReceiveStr() const;
For the send functionality I decided to use a blocking send
call inside a loop such as this (it is an internal helper function that works for both std::string and std::vector).
template<typename T>
void Send(const int fd, const T& bytes)
{
using ValueType = typename T::value_type;
using SizeType = typename T::size_type;
const ValueType *const data{bytes.data()};
SizeType bytesToSend{bytes.size()};
SizeType bytesSent{0};
while (bytesToSend > 0)
{
const ValueType *const buf{data + bytesSent};
const ssize_t retVal{send(fd, buf, bytesToSend, 0)};
if (retVal < 0)
{
throw ch::NetworkError{"Failed to send."};
}
const SizeType sent{static_cast<SizeType>(retVal)};
bytesSent += sent;
bytesToSend -= sent;
}
}
This seems to work fine and guarantees that all bytes are sent once the member function returns without throwing an exception.
However, I started running into problems when I began implementing the receive functionality. For my first attempt I used a blocking recv
call inside a loop and exited the loop if recv
returned 0 indicating that the underlying TCP connection was closed.
template<typename T>
T Receive(const int fd)
{
using SizeType = typename T::size_type;
using ValueType = typename T::value_type;
T result;
const SizeType bufSize{1024};
ValueType buf[bufSize];
while (true)
{
const ssize_t retVal{recv(fd, buf, bufSize, 0)};
if (retVal < 0)
{
throw ch::NetworkError{"Failed to receive."};
}
if (retVal == 0)
{
break; /* Connection is closed. */
}
const SizeType offset{static_cast<SizeType>(retVal)};
result.insert(std::end(result), buf, buf + offset);
}
return result;
}
This works fine as long as the connection is closed by the sender after all bytes have been sent. However, this is not the case when using e.g. Chrome to request a webpage. The connection is kept open and my receive member function is stuck blocked on the recv
system call after receiving all bytes in the request. I managed to get around this problem by setting a timeout on the recv
call using setsockopt. Basically, I return all bytes received so far once the timeout expires. This feels like a very inelegant solution and I do not think that this is the way web servers handles this issue in reality.
So, on to my question.
How does a web server know when an HTTP request have been fully received?
A GET
request in HTTP 1.1 does not seem to include a Content-Length header. See e.g. this link.