I have just started working with TCP (and all associated libraries) due to the need to implement communication between two processes over an internet connection. My code works but it is very slow compared to what I (perhaps due to lack of experience) would expect given the network latency and bandwidth. Also, I'm sure there are many other things wrong with the code, which is using the UNIX socket API. I would prefer not to use big libraries (such as Boost) for my project unless there is a very good reason.
I include a minimal working example. It is rather long despite my best efforts to shorten it. However, I think most of the problems should be in the first file (tcp_helpers.h), which is only used by the client and server main programs in a fairly obvious way. The functions there are not fully optimized, but I find it hard to believe that is the problem; more likely there are some fundamental flaws in the logic.
I also want to ask some questions relevant to the problem:
- For network performance, should I worry about using IPv4 vs IPv6? Could it be that my network dislikes the use of IPv4 somehow and penalizes performance?
- Since the Socket API emulates a stream, I would think it does not matter if you call send() multiple times on smaller chunks of data or once on a big chunk. But perhaps it does matter and doing it with smaller chunks (I call send for my custom protocol header and the data separately each time) leads to issues?
- Suppose that two parties communicate over a network doing work on the received data before sending their next message (as is done in my example). If the two processes take x amount of time on localhost to finish, they should never take longer than (2*x + (network overhead)) on the real network, right? If x is small, making the computations (i.e. work before sending next message) go faster will not help, right?
- My example program takes about 4 ms when running on localhost and >0.7 seconds when running on the local (university) network I'm using. The local network has ping times (measured with ping) of (min/avg/max/mdev [ms] = 4.36 / 97.6 / 405 / 86.3) and a bandwidth (measured with iperf) of ~70 Mbit/s. When running the example program on the network I get (measured with wireshark, filtering on the port in question) 190 packets with an average throughput of 172 kB/s and an average packet size of ~726 bytes. Is this realistic? To me it seems like my program should be much faster given these network parameters, despite the fairly high ping time.
- Looking at the actual network traffic generated by the example program, I started thinking about all the "features" of TCP that are done under the hood. I read somewhere that many programs use several sockets at the same time "to gain speed". Could this help here, for example using two sockets, each for just one-way communication? In particular, maybe somehow reducing the number of ack packets could help performance?
- The way I'm writing messages/headers as structs has (at least) two big problems that I already know. First, I do not enforce network byte order. If one communicating party uses big-endian and the other little-endian, this program will not work. Furthermore, due to struct padding (see catb.org/esr/structure-packing/), the sizes of the structs may vary between implementations or compilers, which would also break my program. I could add something like (for gcc) __attribute__((__packed__)) to the structs, but that would make it very compiler-specific and perhaps even lead to inefficiency. Are there standard ways of dealing with this issue (I've seen something about aligning manually)? (Maybe I'm looking for the wrong keywords.)
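One common way to sidestep both the byte-order and the padding concern in the last question is to never send a struct's memory directly, and instead copy each field into a byte buffer in a fixed order. The sketch below does this for the TCPMessageHeader used in the example, assuming big-endian (network) order on the wire. The names kHeaderWireSize, put_u32_be, get_u32_be, encode_header and decode_header are hypothetical and not part of the example program.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Same fields as the TCPMessageHeader in tcp_helpers.h.
struct TCPMessageHeader {
    uint8_t protocol_name[4];
    uint32_t message_bytes;
};

// Size of the header on the wire (4 name bytes + 4 length bytes).
// It is fixed by the protocol and does not depend on sizeof(TCPMessageHeader).
constexpr std::size_t kHeaderWireSize = 8;

// Store a 32-bit value byte by byte, most significant byte first.
// Shifts operate on the value, so the host's endianness does not matter.
inline void put_u32_be(uint8_t *out, uint32_t v) {
    out[0] = static_cast<uint8_t>(v >> 24);
    out[1] = static_cast<uint8_t>(v >> 16);
    out[2] = static_cast<uint8_t>(v >> 8);
    out[3] = static_cast<uint8_t>(v);
}

// Inverse of put_u32_be.
inline uint32_t get_u32_be(const uint8_t *in) {
    return (static_cast<uint32_t>(in[0]) << 24) |
           (static_cast<uint32_t>(in[1]) << 16) |
           (static_cast<uint32_t>(in[2]) << 8) |
           static_cast<uint32_t>(in[3]);
}

// Build the exact bytes to pass to send(); no padding can sneak in.
inline std::array<uint8_t, kHeaderWireSize> encode_header(const TCPMessageHeader &h) {
    std::array<uint8_t, kHeaderWireSize> buf{};
    for (std::size_t i = 0; i < 4; ++i) {
        buf[i] = h.protocol_name[i];
    }
    put_u32_be(buf.data() + 4, h.message_bytes);
    return buf;
}

// Rebuild the struct from the bytes received from the socket.
inline TCPMessageHeader decode_header(const std::array<uint8_t, kHeaderWireSize> &buf) {
    TCPMessageHeader h{};
    for (std::size_t i = 0; i < 4; ++i) {
        h.protocol_name[i] = buf[i];
    }
    h.message_bytes = get_u32_be(buf.data() + 4);
    return h;
}
```

The same field-by-field approach would apply to ServerSends; the cost is one small copy per message, which is usually negligible next to a network round trip.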
// tcp_helpers.h
// NOTE: Using this code is very ill-advised.
#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <unistd.h> // POSIX specific
#include <sys/socket.h> // POSIX specific
#include <netinet/in.h> // POSIX specific
#include <arpa/inet.h> // POSIX specific
#include <cerrno> // for checking socket error messages
#include <cstdint> // for fixed length integer types
#include <stdexcept> // for std::runtime_error
//////////////////// PROFILING ///////////////////
#include <chrono>
static auto start = std::chrono::high_resolution_clock::now();
void print_now(const std::string &message) {
auto t2 = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> time_span = t2 - start;
std::cout << time_span.count() << ": " << message << std::endl;
}
//////////////////// PROFILING ///////////////////
struct TCPMessageHeader {
uint8_t protocol_name[4];
uint32_t message_bytes;
};
struct ServerSends {
uint16_t a;
uint32_t b;
uint32_t c;
};
typedef uint8_t ClientSends;
namespace TCP_Helpers {
template<typename NakedStruct>
void send_full_message(int fd, TCPMessageHeader header_to_send, const std::vector<NakedStruct> &structs_to_send) {
print_now("Begin send_full_message");
if (header_to_send.message_bytes != sizeof(NakedStruct) * structs_to_send.size()) {
throw std::runtime_error("Struct vector's size does not match the size claimed by message header");
}
size_t bytes_to_send = sizeof(header_to_send);
const uint8_t *header_bytes = reinterpret_cast<const uint8_t *>(&header_to_send);
ssize_t send_retval;
while (bytes_to_send != 0) {
// send() may accept fewer bytes than requested, so resume after the bytes already sent
send_retval = send(fd, header_bytes + (sizeof(header_to_send) - bytes_to_send), bytes_to_send, 0);
if (send_retval == -1) {
int errsv = errno; // from errno.h
std::stringstream s;
s << "Sending data failed (locally). Errno:" << errsv << " while sending header.";
throw std::runtime_error(s.str());
}
bytes_to_send -= send_retval;
}
bytes_to_send = header_to_send.message_bytes;
const uint8_t *data_bytes = reinterpret_cast<const uint8_t *>(structs_to_send.data());
while (bytes_to_send != 0) {
// same as above: only the remaining part of the payload is passed to send()
send_retval = send(fd, data_bytes + (header_to_send.message_bytes - bytes_to_send), bytes_to_send, 0);
if (send_retval == -1) {
int errsv = errno; // from errno.h
std::stringstream s;
s << "Sending data failed (locally). Errno:" << errsv <<
" while sending data of size " << header_to_send.message_bytes << ".";
throw std::runtime_error(s.str());
}
bytes_to_send -= send_retval;
}
print_now("end send_full_message.");
}
template<typename NakedStruct>
std::vector<NakedStruct> receive_structs(int fd, uint32_t bytes_to_read) {
print_now("Begin receive_structs");
unsigned long num_structs_to_read;
// ensure expected message is non-zero length and a multiple of the NakedStruct size
if (bytes_to_read > 0 && bytes_to_read % sizeof(NakedStruct) == 0) {
num_structs_to_read = bytes_to_read / sizeof(NakedStruct);
} else {
std::stringstream s;
s << "Message length (bytes_to_read = " << bytes_to_read <<
" ) specified in header does not divide into required struct size (" << sizeof(NakedStruct) << ").";
throw std::runtime_error(s.str());
}
// vector must have size > 0 for the following pointer arithmetic to work
// (this method must check this in above code).
std::vector<NakedStruct> received_data(num_structs_to_read);
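ssize_t valread;
while (bytes_to_read > 0) // todo need to include some sort of timeout?!
{
valread = read(fd,
((uint8_t *) (&received_data[0])) +
(num_structs_to_read * sizeof(NakedStruct) - bytes_to_read),
bytes_to_read);
if (valread == -1) {
throw std::runtime_error("Reading from socket file descriptor failed");
} else if (valread == 0) {
// read() returns 0 once the peer has closed the connection; without this check the loop never ends
throw std::runtime_error("Connection closed by peer before the full message was received");
} else {
bytes_to_read -= valread;
}
}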
print_now("End receive_structs");
return received_data;
}
void send_header(int fd, TCPMessageHeader header_to_send) {
print_now("Start send_header");
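size_t bytes_to_send = sizeof(header_to_send);
const uint8_t *header_bytes = reinterpret_cast<const uint8_t *>(&header_to_send);
ssize_t send_retval;
while (bytes_to_send != 0) {
// send() may accept fewer bytes than requested, so resume after the bytes already sent
send_retval = send(fd, header_bytes + (sizeof(header_to_send) - bytes_to_send), bytes_to_send, 0);
if (send_retval == -1) {
int errsv = errno; // from errno.h
std::stringstream s;
s << "Sending data failed (locally). Errno:" << errsv << " while sending (lone) header.";
throw std::runtime_error(s.str());
}
bytes_to_send -= send_retval;
}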
print_now("End send_header");
}
TCPMessageHeader receive_header(int fd) {
print_now("Start receive_header (calls receive_structs)");
TCPMessageHeader retval = receive_structs<TCPMessageHeader>(fd, sizeof(TCPMessageHeader)).at(0);
print_now("End receive_header (calls receive_structs)");
return retval;
}
}
// main_server.cpp
#include "tcp_helpers.h"
int init_server(int port) {
int server_fd;
int new_socket;
struct sockaddr_in address{};
int opt = 1;
socklen_t addrlen = sizeof(address); // accept() expects a socklen_t, not an int
// Creating socket file descriptor
if ((server_fd = socket(AF_INET, SOCK_STREAM, 0)) < 0) { // socket() returns -1 on failure; 0 is a valid descriptor
throw std::runtime_error("socket creation failed\n");
}
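// SO_REUSEADDR and SO_REUSEPORT are option names, not bit flags, so each one needs its own call
if (setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt))) {
throw std::runtime_error("failed to set SO_REUSEADDR");
}
if (setsockopt(server_fd, SOL_SOCKET, SO_REUSEPORT, &opt, sizeof(opt))) {
throw std::runtime_error("failed to set SO_REUSEPORT");
}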
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons(port);
// Forcefully attaching socket to the port
if (bind(server_fd, (struct sockaddr *) &address, sizeof(address)) < 0) {
throw std::runtime_error("bind failed");
}
if (listen(server_fd, 3) < 0) {
throw std::runtime_error("listen failed");
}
if ((new_socket = accept(server_fd, (struct sockaddr *) &address, (socklen_t *) &addrlen)) < 0) {
throw std::runtime_error("accept failed");
}
if (close(server_fd)) // don't need to listen for any more tcp connections (PvP connection).
throw std::runtime_error("closing server socket failed");
return new_socket;
}
int main() {
int port = 20000;
int socket_fd = init_server(port);
while (true) {
TCPMessageHeader rcv_header = TCP_Helpers::receive_header(socket_fd);
if (rcv_header.protocol_name[0] == 0) // using first byte of header name as signal to end
break;
// receive message
auto rcv_message = TCP_Helpers::receive_structs<ClientSends>(socket_fd, rcv_header.message_bytes);
for (ClientSends ex : rcv_message) // example "use" of the received data that takes a bit of time.
std::cout << static_cast<int>(ex) << " ";
std::cout << std::endl << std::endl;
// send a "response" containing 500 structs of zeros
auto bunch_of_zeros = std::vector<ServerSends>(500);
TCPMessageHeader send_header{"abc", 500 * sizeof(ServerSends)};
TCP_Helpers::send_full_message(socket_fd, send_header, bunch_of_zeros);
}
exit(EXIT_SUCCESS);
}
// main_client.cpp
#include "tcp_helpers.h"
int init_client(const std::string &ip_address, int port) {
int sock_fd;
struct sockaddr_in serv_addr{};
if ((sock_fd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
throw std::runtime_error("TCP Socket creation failed\n");
}
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(port);
// Convert IPv4 address from text to binary form
if (inet_pton(AF_INET, ip_address.c_str(), &serv_addr.sin_addr) <= 0) {
throw std::runtime_error("Invalid address/ Address not supported for TCP connection\n");
}
if (connect(sock_fd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
throw std::runtime_error("Failed to connect to server.\n");
}
return sock_fd;
}
int main() {
// establish connection to server and get socket file descriptor.
int port = 20000;
int socket_fd = init_client("127.0.0.1", port);
for (int i = 0; i < 20; ++i) { // repeat sending and receiving random data
// send a message containing 250 structs of zeros
auto bunch_of_zeros = std::vector<ClientSends>(250);
TCPMessageHeader send_header{"abc", 250 * sizeof(ClientSends)};
TCP_Helpers::send_full_message(socket_fd, send_header, bunch_of_zeros);
// receive response
TCPMessageHeader rcv_header = TCP_Helpers::receive_header(socket_fd);
auto rcv_message = TCP_Helpers::receive_structs<ServerSends>(socket_fd, rcv_header.message_bytes);
for (ServerSends ex : rcv_message) // example "use" of the received data that takes a bit of time.
std::cout << ex.a << ex.b << ex.c << " ";
std::cout << std::endl << std::endl;
}
auto end_header = TCPMessageHeader{}; // initialized all fields to zero. (First byte of name == 0) is "end" signal.
TCP_Helpers::send_header(socket_fd, end_header);
exit(EXIT_SUCCESS);
}