2

I found the following code in this example:

addr.sin_addr.s_addr = *(long *)(host->h_addr);

h_addr is a char pointer and host is a pointer to a struct of type hostent. addr is a struct of type sockaddr_in and sin_addr is a struct of type in_addr. s_addr is a uint32_t.

Most of this information can be found here: http://man7.org/linux/man-pages/man7/ip.7.html

I'm pretty sure (long) casts the char to a long, but I don't know what the extra asterisks do, especially because s_addr is not a pointer.

Can someone explain what is happening here?

Mad Physicist
Foobar
  • See [this SO](https://stackoverflow.com/q/14224831/416574) post about referencing and dereferencing. The cast is making it a long pointer and then the initial asterisk is dereferencing the long pointer. – pstrjds Sep 10 '18 at 23:08

2 Answers

8

(long *)(host->h_addr) means to interpret host->h_addr as a pointer to a long. This is not very portable, but presumably a long is 32 bits long on the system this was written for.

The additional star in *(...) dereferences what is now a pointer to long, yielding a long for the assignment. Assuming a 32-bit long, this effectively copies all four bytes of the original char array into the single value addr.sin_addr.s_addr. Compare to (long)(*host->h_addr), which would only copy the first char element.
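To make the difference concrete, here is a minimal sketch. The helper names are made up for illustration, and uint32_t stands in for long so the width is fixed. The four bytes are stored inside a real uint32_t object so that the pointer cast is well defined in this demo, which is not guaranteed for an arbitrary h_addr buffer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the pointer, then dereference.
   Reads all four bytes at once. Only safe when p really points
   into a uint32_t object, as it does in this demo. */
static uint32_t read_cast_then_deref(const char *p)
{
    return *(const uint32_t *)p;
}

/* Hypothetical helper: dereference first, then convert the value.
   Reads only the first byte (via unsigned char to avoid sign issues). */
static uint32_t read_deref_then_cast(const char *p)
{
    return (uint32_t)(unsigned char)*p;
}

/* Build a uint32_t whose bytes in memory are 192, 168, 1, 10. */
static uint32_t make_sample_address(void)
{
    const unsigned char bytes[4] = {192, 168, 1, 10};
    uint32_t storage;
    memcpy(&storage, bytes, sizeof storage);
    return storage;
}
```

Given `uint32_t storage = make_sample_address();` and `const char *p = (const char *)&storage;`, the first helper returns the whole of `storage`, while the second returns only the first byte, 192.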

This technique is extremely unportable. It assumes both the size and endianness of the long type. You might be tempted to take a hint from the fact that s_addr is a uint32 and do:

addr.sin_addr.s_addr = *(uint32_t *)(host->h_addr);

This is not really any better. uint32_t, where it exists, is exactly 32 bits wide, so the size matches, but the cast still assumes that h_addr is suitably aligned for a uint32_t and can run afoul of strict aliasing rules. The size problem is real for long, though: where long is 64 bits, the dereference reads past the four address bytes, which is undefined behavior.

There are two options going forward:

If your char array is already in the correct byte order (i.e., you don't care if h_addr[0] represents the highest or lowest byte of a local uint32_t), use memcpy:

memcpy(&(addr.sin_addr.s_addr), host->h_addr, 4);

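This is likely to be the approach you need, because h_addr is already in network byte order, which is also what s_addr expects. If, on the other hand, you want h_addr[0] to always end up in the highest byte of an integer, build the value with shifts, which give the same result regardless of the endianness of your system:

addr.sin_addr.s_addr = ((uint32_t)(unsigned char)host->h_addr[0] << 24) | ((uint32_t)(unsigned char)host->h_addr[1] << 16) | ((uint32_t)(unsigned char)host->h_addr[2] << 8) | (uint32_t)(unsigned char)host->h_addr[3];

The casts through unsigned char keep a signed char from being sign-extended, and the casts to uint32_t keep the shifts from overflowing int. Note that the result is in host byte order, so storing it in s_addr directly is wrong on little-endian systems; pass it through htonl first.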
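As a rough sketch of how the two options relate, the following compares a plain memcpy with a shift-based build that is converted with htonl. The function names are hypothetical, and the input is assumed to be four address bytes in network byte order, as h_addr provides.

```c
#include <arpa/inet.h>   /* htonl */
#include <netinet/in.h>  /* struct in_addr */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy four network-order bytes straight into s_addr. */
static struct in_addr in_addr_from_memcpy(const char *bytes)
{
    struct in_addr out;
    memcpy(&out.s_addr, bytes, sizeof out.s_addr);
    return out;
}

/* Hypothetical helper: build a host-order integer with shifts,
   then convert it to network byte order with htonl. */
static struct in_addr in_addr_from_shifts(const char *bytes)
{
    const unsigned char *b = (const unsigned char *)bytes;
    uint32_t host_order = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
                        | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    struct in_addr out;
    out.s_addr = htonl(host_order);
    return out;
}
```

Both helpers should produce the same s_addr on big-endian and little-endian hosts alike, which is why the memcpy version is usually enough.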

Mad Physicist
  • I know this is outside of the scope of the original question - so feel free to not answer - but if this technique is not portable, then what is a better method of doing this? – Foobar Sep 10 '18 at 23:13
  • @Roymunson. Done – Mad Physicist Sep 10 '18 at 23:23
  • @Roymunson: `memcpy` is portable... But a better idea is to use the modern `getaddrinfo` instead of the legacy `gethostbyname`. The modern interface avoids this sort of nonsense completely and will deal with future protocols (e.g. IPv6) automatically – Nemo Sep 10 '18 at 23:24
  • This answer is now incorrect... The `h_addr` field is already in network byte order, which is also what the `s_addr` field needs. Your code will break on little-endian systems (e.g. x86). A simple `memcpy` is what you want and is perfectly portable. – Nemo Sep 10 '18 at 23:25
  • @Nemo. Will fix – Mad Physicist Sep 10 '18 at 23:26
  • Suggest using `htonl` macro instead of the shifts – M.M Sep 10 '18 at 23:36
  • One slight correction: You say "`uint32_t` is guaranteed to hold at least 32 bits", but it is in fact guaranteed to hold _exactly_ 32 bits. (There are `uint_least32_t` and `uint_fast32_t` for the smallest and "fastest" integer types with at least 32 bits, respectively, but `uint32_t` is fixed width.) – Arkku Sep 11 '18 at 00:02
  • @Arkku. This just started as a 2 sentence blurb about dereferencing pointers. Now I've learned at least three new things. I appreciate the comments here very much. – Mad Physicist Sep 11 '18 at 00:07
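
For reference, the `getaddrinfo` route mentioned in the comments above might look roughly like the sketch below. The helper name `resolve_ipv4` is hypothetical, and restricting the lookup to IPv4 and stream sockets is an assumption made to keep the result compatible with a sockaddr_in; real code would usually iterate over all returned results.

```c
#include <netdb.h>       /* getaddrinfo, freeaddrinfo */
#include <netinet/in.h>  /* struct sockaddr_in */
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve host/service to the first IPv4 result.
   Returns 0 on success, or a getaddrinfo error code otherwise. */
static int resolve_ipv4(const char *host, const char *service,
                        struct sockaddr_in *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, to match sockaddr_in */
    hints.ai_socktype = SOCK_STREAM;

    int rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0)
        return rc;

    /* ai_addr already holds a filled-in sockaddr; copy it whole. */
    memcpy(out, res->ai_addr, sizeof *out);
    freeaddrinfo(res);
    return 0;
}
```

No manual byte copying or casting of h_addr is needed here: the address and port arrive already in network byte order inside the returned sockaddr.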
0

The way I'd read this is "addr.sin_addr.s_addr equals the object at host->h_addr, viewed through a long pointer", or more briefly, "addr.sin_addr.s_addr equals the long pointed to by host->h_addr". host->h_addr is presumably a pointer to something other than long. Here that address is treated as a pointer-to-long, and the long it points to is assigned to addr.sin_addr.s_addr.

HTH.

  • Do you know why `h_addr` was cast to long pointer, which was then dereferenced, instead of just directly casting to a long? – Foobar Sep 10 '18 at 23:08
  • Because `h_addr` is presumably a pointer, and what's wanted is what the pointer `h_addr` points to. `h_addr` is probably a pointer-to-something-other-than-long (perhaps `void *`?), and to get the `long` at the address pointed to by `h_addr`, `h_addr` must be cast to `long *` and then dereferenced. – Bob Jarvis - Слава Україні Sep 10 '18 at 23:09