
In C++ we send data over the network using sockets. I am aware that we need to use the htons() and ntohs() functions to maintain byte order between big-endian and little-endian machines.

Suppose we have the following data to be sent:

int roll;
int id;
char name[100];

This can also be wrapped in a struct.
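
For illustration, something like this (the name Packet is just an example):

struct Packet {
    int  roll;
    int  id;
    char name[100];
};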

My confusion here is: for roll and id, we can use the htons() function. But for the string name, what should we do, and how? Do we need to use any such function? Will it work on every machine, like Mac, Intel, and others on the network?

I want to send all three fields in one packet.

Vijay

3 Answers


You'd use htonl for int, not htons.

The name doesn't need to be reordered, since the bytes of the array correspond directly to bytes on the network.

The issue of byte-order only arises with words larger than a byte, since different architectures choose different ends at which to place the least-significant byte.
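
For concreteness, here is a minimal sketch of such a serialization, assuming a POSIX arpa/inet.h and a 32-bit int (the pack function and buffer layout are just one way to do it):

#include <arpa/inet.h>  // htonl
#include <cstdint>
#include <cstring>      // std::memcpy

// Pack roll, id, and name into a byte buffer in network byte order.
void pack(uint32_t roll, uint32_t id, const char name[100], unsigned char buf[108]) {
    uint32_t n_roll = htonl(roll);  // multi-byte integers get reordered
    uint32_t n_id   = htonl(id);
    std::memcpy(buf,     &n_roll, 4);
    std::memcpy(buf + 4, &n_id,   4);
    std::memcpy(buf + 8, name,  100);  // the char array is copied as-is
}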

Marcelo Cantos
  • @marcelo: probably multibyte char? 0x0001 becomes 0x0100? – Donotalo Jun 30 '11 at 11:56
  • @Marcelo Cantos: looks like you are right. I need some more info, thanks. – Vijay Jun 30 '11 at 11:57
  • @Vijay: On all but the strangest of platforms, a `char` is the same size as a byte: eight bits. When you say "multibyte", I assume you are referring to wide characters, or `wchar_t`, which are usually 16 bits, but are sometimes 32 bits. Either way, it's pretty safe to assume that a `char` will encode as a single unit on any IP-based transport. If you want to send Unicode, it's usually best to transmit a UTF-8 encoding rather than wide characters. – Marcelo Cantos Jun 30 '11 at 12:01
  • @Marcelo Cantos: you are right again, but this is making me confused. If I transmit wide chars without conversion, will they work? UTF-8 is itself like a char array, so it will work. This is confusing for me now. – Vijay Jun 30 '11 at 12:10
  • Wide characters are larger than a byte, so they need to be byte-order corrected (`htons`) if you want your transmission format to be byte-order-neutral. As you point out, a UTF-8 encoding defines a character-level encoding (an array of bytes), so it has no such issues. – Marcelo Cantos Jun 30 '11 at 12:12
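
To illustrate that last comment, a rough sketch assuming 16-bit code units (e.g. UTF-16 held in uint16_t):

#include <arpa/inet.h>  // htons
#include <cstdint>
#include <cstddef>

// Byte-order-correct an array of 16-bit code units in place before sending.
void to_network_order(uint16_t *units, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        units[i] = htons(units[i]);
}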

For char arrays this conversion is not necessary, since they have no network byte order but are transmitted byte by byte. The reason ntohs and htons exist is that some data types consist of multiple bytes of differing significance, which are ordered differently on different architectures. This is not the case for strings.
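
As a small illustration of that difference, this sketch prints the in-memory layout of a 16-bit value, which varies by host, while a char array has no such ambiguity:

#include <cstdint>
#include <cstdio>

int main() {
    uint16_t value = 0x0102;
    const unsigned char *bytes = reinterpret_cast<const unsigned char *>(&value);
    // Prints "01 02" on a big-endian host and "02 01" on a little-endian one.
    std::printf("%02x %02x\n", bytes[0], bytes[1]);
    return 0;
}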

Constantinius
  • Actually, for those it's the byte ordering that is the issue. – diverscuba23 Jun 30 '11 at 11:49
  • 1
    You mean strings? No, why should it be? – Constantinius Jun 30 '11 at 11:50
  • I was referring to ntohs and htons and their associated functions. They do not do any rearranging of the bits within each byte; they only swap the byte ordering if the host byte order differs from network byte order. – diverscuba23 Jun 30 '11 at 11:55
  • @diverscuba23: You don't need to care about bit order, as the smallest unit you can use is a byte. Bit order would only matter in hardware. – DarkDust Jun 30 '11 at 12:02

To add to the helpful comments here: if your structs get much more complex, you could be better off considering a serialization library like Boost.Serialization or Google Protocol Buffers, which handle endianness for you under the covers.

When encoding the string, make sure you send a length (probably a short, handled using htons) before the string itself; don't just send 100 chars every time.
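
A minimal sketch of that length-prefixed encoding (the pack_string function is hypothetical; error handling for names longer than 65,535 bytes is omitted):

#include <arpa/inet.h>  // htons
#include <cstddef>
#include <cstdint>
#include <cstring>      // std::memcpy, std::strlen

// Write a 16-bit length prefix in network byte order, then the string
// bytes themselves; returns the total number of bytes written.
std::size_t pack_string(const char *name, unsigned char *buf) {
    uint16_t len   = static_cast<uint16_t>(std::strlen(name));
    uint16_t n_len = htons(len);  // the length travels in network order
    std::memcpy(buf, &n_len, 2);
    std::memcpy(buf + 2, name, len);  // only the actual characters, not all 100
    return 2 + len;
}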

Steve Townsend