
I need to use htons() in my code to convert the little-endian ordered short to a network (big-endian) byte order. I have this code:

int PacketInHandshake::serialize(SOCKET connectSocket, BYTE* outBuffer, ULONG outBufferLength) {
    memset(outBuffer, 0, outBufferLength);
    const int sizeOfShort = sizeof(u_short);
    u_short userNameLength = (u_short)strlen(userName);
    u_short osVersionLength = (u_short)strlen(osVersion);
    int dataLength = 1 + (sizeOfShort * 2) + userNameLength + osVersionLength;
    outBuffer[0] = id;
    outBuffer[1] = htons(userNameLength);// htons() here
    printf("u_short byte 1: %c%c%c%c%c%c%c%c\n", BYTE_TO_BINARY(outBuffer[1]));
    printf("u_short byte 2: %c%c%c%c%c%c%c%c\n", BYTE_TO_BINARY(outBuffer[2]));
    for (int i = 0; i < userNameLength; i++) {
        outBuffer[1 + sizeOfShort + i] = userName[i];
    }
    outBuffer[1 + sizeOfShort + userNameLength] = htons(osVersionLength);// and here
    for (int i = 0; i < osVersionLength; i++) {
        outBuffer[1 + (sizeOfShort * 2) + userNameLength + i] = osVersion[i];
    }
    int result;
    result = send(connectSocket, (char*)outBuffer, dataLength, 0);
    if (result == SOCKET_ERROR) {
        printf("send failed with error: %d\n", WSAGetLastError());
    }
    printf("PacketInHandshake sent: %d bytes\n", result);
    return result;
}

Which results in a packet like this being sent:

[screenshot: hex dump of the sent packet]

As you can see, the length indication bytes where htons() is used are all zeros, whereas they should be 00 07 and 00 16 respectively.

And this is the console output:

u_short byte 1: 00000000
u_short byte 2: 00000000
PacketInHandshake sent: 34 bytes

If I remove the htons() and just put the u_shorts in the buffer as they are, everything is as expected, little-endian ordered:

[screenshot: hex dump of the sent packet, little-endian lengths]

u_short byte 1: 00000111
u_short byte 2: 00000000
PacketInHandshake sent: 34 bytes

So what am I doing wrong?

AnB
    `outBuffer[1] = htons(userNameLength);` looks like you're writing 16 bits into an 8 bit slot. – user4581301 Feb 02 '22 at 01:26
  • I expect that outbuffer[1]=htons is causing a compiler warning, as you are trying to write a short to a byte. You need to write each byte separately or cast outBuffer to (short*) – pm100 Feb 02 '22 at 01:30
  • @AnB please reconsider the answer that you accepted. As it happens, the author would like to delete it due to being wrong/dangerous, but cannot do so since it is accepted. – TylerH Feb 04 '22 at 15:11

1 Answer


Converting the endianness of a 16 bit number and storing it in a byte array is trivial; there is no need for library functions. Assuming a 32 bit CPU:

uint16_t u16 = ...;
uint8_t out[2];

out[0] = ((uint32_t)u16 >> 8) & 0xFFu;
out[1] = ((uint32_t)u16 >> 0) & 0xFFu;

The casts and the u suffix are there as good habits to block implicit promotion to int, which is problematic in some cases since int is a signed type.

Since shifts don't care about the underlying endianness, the above code works for both big-to-little and little-to-big conversions, as long as you go from one to the other.

This scales to 32 bit types as:

uint32_t u32 = ...;
uint8_t out[4];

out[0] = ((uint32_t)u32 >> 24) & 0xFFu;
out[1] = ((uint32_t)u32 >> 16) & 0xFFu;
out[2] = ((uint32_t)u32 >>  8) & 0xFFu;
out[3] = ((uint32_t)u32 >>  0) & 0xFFu;
Lundin
    There is also now [`std::byteswap`](https://en.cppreference.com/w/cpp/numeric/byteswap) coming in c++23 – Mgetz Feb 02 '22 at 16:42
  • @Mgetz Why? Any beginner ought to know how to do this without a pointless library function. The first thing to learn _before_ writing a single line of source code in any language is binary and hex, as well as binary arithmetic. Then you just need to learn the C syntax for it. The only thing which is somewhat advanced here is the various implicit promotions, as well as what will happen if you shift signed/negative numbers. – Lundin Feb 03 '22 at 07:23
  • (The htons etc functions are quite useless since network endianness can't be known by a system or compiler. Most data communication uses Big Endian but you can't assume that. There are for example industry standards like CANopen where Big Endian is used on the data link layer, but Little Endian on the application layer. Mission impossible for dummies who only know how to call some "htons" function.) – Lundin Feb 03 '22 at 07:25
  • Why? Because it's part of the *standard* and should be mentioned. Just because you can write code for something doesn't mean you *should* because the more code you write the more you have potential for bugs. This method was included by the committee after debate for very good reason. Because A) it allows compilers to insert optimized byte swap instructions more obviously, B) it works on all widths of integers, C) it doesn't require someone to come in behind and understand what all the shifts are for, D) it doesn't require a compiler to divine your intent for a byte swap in a non-portable way. – Mgetz Feb 03 '22 at 12:34
  • Also your second comment makes zero sense... Network endianness is determined by the [IETF](https://datatracker.ietf.org/doc/html/draft-newman-network-byte-order-01) and while you could argue it's not settled it's pretty well a standard because almost all RFCs refer to big endian as "Network byte order" so it's de-facto standard unless the protocol states explicitly otherwise. To the point it's referred to as such *in actual standards*. – Mgetz Feb 03 '22 at 12:43
  • @Mgetz There are other communication interfaces than the Internet protocol suite. Not every computer in the world is a PC. – Lundin Feb 03 '22 at 12:46
  • And this question isn't referring to any of them. It is referring to Sockets – Mgetz Feb 03 '22 at 12:46