-3

So i'm currently working on a app that uses socket, for now everything is working, i just want to know if there is a more efficient way to send the length of the data and the data instead of sending twice.

i.e: Sending everything in one string.

void SendPackage(const SOCKET sock, const std::string package)
{
    int length = package.lenth();

    send(sock, std::to_string(length).c_str(), 10, 0);     //Buffer length is 10 assuming data length
    send(sock, package.c_str(), length, 0);        //will never be greater than 9,999,999,999
}

void ReceivePackage(const SOCKET sock, std::string &package, int bufferLength)
{
    std::vector<char> recvBuffer(10);
    int length, bytesProcessed;

    recv(sock, &recvBuffer[0], 10, 0); //Receiving length
    length = atoi(&recvBuffer[0]);
    recvBuffer.resize(length);

    for (int i = 0; i < length; i += bytesProcessed)
    {
        bytesProcessed = recv(sock, &recvBuffer[0] + i, bufferLength, 0);
        if (bytesProcessed < 0) break;
    }

    package = &recvBuffer[0];
}
  • 1
    Related, this is much easier if you learn [how htonl and family work](https://stackoverflow.com/questions/11423338/same-output-for-htonl-and-ntohl-on-an-integer), and just send the length (encoded, then decoded) – WhozCraig Aug 29 '18 at 21:54
  • There is not a 1:1 correspondence between send and recv in TCP. You need to keep calling recv until you either get the bytes you expect or time out. – stark Aug 29 '18 at 21:56
  • 2
    Does `send(sock, std::to_string(length), 10, 0);` really compile? – Galik Aug 29 '18 at 21:57
  • `if (bytesProcessed > 0) break;` should be `if (bytesProcessed <= 0) return;` instead. And `package = &recvBuffer[0];` should be changed to `package = std::string(&recvBuffer[0], length);` – Remy Lebeau Aug 29 '18 at 21:58
  • adding on what @stark said, TCP is a stream of bytes. While you will eventually get all the bytes you sent, how quickly they come in and how many read calls will be required to get them all is not guaranteed at all. It's actually a somewhat complicated protocol to use because you have to constantly read into buffers and parse them at the right time. – xaxxon Aug 29 '18 at 21:59
  • And never use atoi() in either C or C++. –  Aug 29 '18 at 22:05

2 Answers2

0
send(sock, std::to_string(length).c_str(), 10, 0);

This is UB. You are telling it that the data has 10 bytes, but that's not correct. Apart from that there is no need to convert your int into a string and then send the string. Just sent the int, and to avoid any platform incompatibilities I recommend using a fixed size type:

std::int32_t length = package.length();
send(sock, &length, sizeof(length), 0);

On the receiving end you must make sure that you actually wait for all bytes to be there. You can't simply call recv and assume it gives you all the bytes you need. You need to call it in a loop and check it's return value, which tells you how many bytes you got. To receive the length, you need to read the appropriate amount of bytes into a buffer, then reinterpret that buffer as the type you chose (e.g. std::int32_t.

Additionally you should make sure that the endianness of your int is correct. It is a de facto standard to use big-endian for network data, but at the end of the day this is up to you. Not part of the standard (afaik), but available on common platforms are the helper functions htonl and ntohl, which stand for "host to network, long" and "network to host, long". They take and return a std::int32_t, and you can simply use them to be sure the endianness works for both sides.

Sender:

std::int32_t length = htonl(package.length());
send(sock, &length, sizeof(length), 0);

Receiver:

// after successfully reading all bytes for the length value into an adequately sized buffer:
std::int32_t length = ntohl(*reinterpret_cast<std::int32_t*>(buffer));
Max Vollmer
  • 8,412
  • 9
  • 28
  • 43
  • You should use *unsigned int* to remain portable. Also you can't just *reinterpret_cast* to an `std::uint32_t` from an arbitrary byte buffer. It is better to cast the `std::int32_t` to a `char*` and read directly into that. – Galik Aug 29 '18 at 23:17
  • @Galik Not quite sure why you consider using an unsigned type to be more portable. Especially when we consider that the receiving end could be written in another language that doesn't support unsigned types, like Java. The casting order seems to be more about personal preference, but I am eager to learn if there are technical reasons why it should be better. – Max Vollmer Aug 30 '18 at 08:24
  • Ok, yes on balance I do think you are right about this. The wording does suggest the `C` standard should be definitive in all respects - not just the declarations but their semantic mening. – Galik Aug 30 '18 at 10:46
-2

As far as the socket interface is concerned, you are not sending twice. The socket just takes everything you stuff in one end with one or more send calls and sends it out in its own time (I'm assuming you don't exceed the socket buffer with the first call, which is a fairly safe assumption). Similarly, you don't need to match your recv calls with the send calls. You could recv the whole lot at once and then break it up later, but really, no big gain to do so.

So it's really a wash whether you pack the string first and then send it, or call send twice. The time and RAM necessary to create a new string with the length concatenated to the start is not going to be more efficient than just called send twice.

However, you should definitely send the length as the int it comes in, not as a string. That way it will always occupy sizeof(int) bytes. Your buffer length of 10 will usually not be true, and result in reading off the end of the string. For instance:

send(sock, &length, sizeof(length), 0);

and

recv(sock, &length, sizeof(length), 0); //Receiving length
Heath Raftery
  • 3,643
  • 17
  • 34
  • 1
    Remember that `recv` may return having read only part of `length` from the socket. The `MSG_WAITALL` flag may help with this. Also `length` may have different sizes at the sender and the receiver. – user4581301 Aug 29 '18 at 22:17
  • Also different CPU architectures may store the `int` in a different a byte order in memory. I would stick to sending text if at all possible. – Galik Aug 29 '18 at 23:11
  • Also you can run into sign portability problems so you should really send *unsigned* numbers. – Galik Aug 29 '18 at 23:19
  • Heh, sounds like people solving their own problems instead of the OP's. Clearly `SendPackage` and `ReceivePackage` are running on the same architecture so endianess is irrelevant. Clearly the question is not about recv returning with no data so waiting is irrelevant (quite possibly this is handled outside the code we can see). The OP's question is answered, and as a bonus I've added some audience-appropriate tips on the code we can see. We all gotta start somewhere. – Heath Raftery Aug 30 '18 at 01:31
  • 1
    Endian and different sized integers (which may differ due to compiler and compiler options even if the target architecture does not) may be a sideshow in this question, but they are still things to watch out for. I wouldn't downvote over them unless there was some egregious abuse going on or the question was specifically about them. `recv` possibly not getting all of `length` in one shot is a fatal flaw, though, and has a relatively simple resolution. I can't think of a good reason not to address it. – user4581301 Aug 30 '18 at 17:20