2

Here is an SSCCE, showing a simplified version of my code that still does something useful:

//Compile with -O3 -Wsign-conversion

#include <arpa/inet.h>
#include <string>

void _extract_ip_port(struct sockaddr const* addr, std::string* host,unsigned short* port) {
    if (addr->sa_family == AF_INET) { //IPv4
        struct sockaddr_in const* ipv4 = reinterpret_cast<struct sockaddr_in const*>(addr);

        char temp[INET_ADDRSTRLEN];
        inet_ntop(AF_INET, &ipv4->sin_addr, temp, INET_ADDRSTRLEN);
        *host = temp;

        *port = ntohs(ipv4->sin_port); //<---- ##### WARNING HERE #####
    } else { //IPv6
        struct sockaddr_in6 const* ipv6 = reinterpret_cast<struct sockaddr_in6 const*>(addr);

        char temp[INET6_ADDRSTRLEN];
        inet_ntop(AF_INET6, &ipv6->sin6_addr, temp, INET6_ADDRSTRLEN);
        *host = temp;

        *port = ntohs(ipv6->sin6_port); //<---- ##### WARNING HERE #####
    }
}

The problem is that the calls to ntohs(...) produce something like:

<file>:<line>:<char>: warning: conversion to "unsigned int" from "int" may change the sign
of the result [-Wsign-conversion]
    *port = ntohs(ipv6->sin6_port); //<---- ##### WARNING HERE #####
            ^

As you can see from the documentation linked above, ntohs(...) has an overload that takes and returns an unsigned short. Since port, ipv4->sin_port, and ipv6->sin6_port are all that, this warning should not happen. So my question: what's going on?

The g++ version is 5.3.0-3ubuntu1~14.04, and the system is based on ARMv7. I have not been able to reproduce this on x86 or x86-64, so I'm also tagging it .

artless noise
  • 21,212
  • 6
  • 68
  • 105
geometrian
  • 14,775
  • 10
  • 56
  • 132
  • Hint: You can use the `-E` option to get the pre-processed output to make an even more concise example. Just examine the end of the `-E` output. I used this to find the ARM version of `ntohs` as in my answer. – artless noise Jan 26 '16 at 15:29

1 Answers1

1

This appears to be an issue related to the statement expressions and C++. Here is a simple program that exhibits the behaviour.

int main(void)
{
  unsigned int port =
               (__extension__ ({ unsigned short int __bsx = 
                  (unsigned short int) (0x8345u);
               ((unsigned short int)((((__bsx) >> 8) & 0xffu) | 
                   (((__bsx) & 0xffu) << 8))); }));
  return (int)port;

}

I have this named 'bar.c' and these give warnings,

gcc -O3 -Wsign-conversion-x c++ baz.c
arm-linux-gnueabi-gcc -O3 -Wsign-conversion -x c++ baz.c

These do not,

gcc -O3 -Wsign-conversion bar.c
arm-linux-gnueabi-gcc -O3 -Wsign-conversion baz.c

Further, if the statement expression is removed, then there is no warning in any language. For example,

int main(void)
{
    unsigned int port = 0x8345u;
    ((unsigned short int)(((port) >> 8) & 0xffu) | (((port) & 0xffu) << 8));
    return (int)port;
}

So my question: what's going on?

There is some issues when combining statement expressions with C++. See the documentation. Especially,

These considerations mean that it is probably a bad idea to use statement expressions of this form in header files that are designed to work with C++. (Note that some versions of the GNU C Library contained header files using statement expressions that lead to precisely this bug.)

So either compile with 'C' or accept the warning. This could either be a bug in the header or with the g++. It is not ARM CPU related; you just get a different header file which shows the issue when compiling for the ARM.

Related: Endian conversion in C++, in other words __builtin_bswap16.

Community
  • 1
  • 1
artless noise
  • 21,212
  • 6
  • 68
  • 105
  • The issue seems to be related to something with 'bitwise shift/and' and C++ stream/reference issues. If `((__bsx) >> 8)` is replaced with `(unsigned short)((__bsx) >> 8)`, then there is no warning in either language with gcc. I would just use the 'bswap' function and move on. – artless noise Jan 26 '16 at 16:20
  • I use `gcc` as 5.2.1-22ubuntu2 and `arm-linux-gnueabi-gcc` as 5.2.1-22ubuntu1. – artless noise Jan 26 '16 at 16:23
  • Ah, so the key component here is that `ntohs(...)` is actually a macro (ultimately at `netinet/in.h:401`) that calls a statement expression (ultimately at `bits/byteswap-16.h:24`).¶ I dunno--I consider this a bug in the library, despite the disclaimer. There's no reason this shouldn't work compatibly in C++. – geometrian Jan 27 '16 at 03:47
  • I agree, but do you want to edit the headers/library? I considered submitting a bug to the gcc/g++ OR glibc people. It is not entirely clear (to me) which one is at fault. However, you can just use an alternative *byte swap* in C++ or use 'C'. – artless noise Jan 28 '16 at 14:08
  • I feel like maybe the problem is gcc/g++, since the return value `__v` is (correctly) `unsigned short int`. Of course, I'm not sure that means anything; I've never used statement expressions, and (their being non-portable) I don't intend to.¶ I ended up using a function I already have in my libraries precisely for this purpose; not sure why I wasn't using it in the first place. I suppose one of the compiler intrinsics, with appropriate preprocessor guards, would work too. – geometrian Jan 29 '16 at 00:17