Why do we cast sockaddr_in to sockaddr when calling bind()?

Question

The bind() function accepts a pointer to a sockaddr, but in all examples I've seen, a sockaddr_in structure is used instead, and is cast to sockaddr:

struct sockaddr_in name;
...;
if (bind (sock, (struct sockaddr *) &name, sizeof (name)) < 0) {
  ...;
}

I can't wrap my head around why a sockaddr_in struct is used. Why not just prepare and pass a sockaddr?

Is it just convention?

NB. `sockaddr_in6` also exists, and anyone writing new code would be wise to include it… — BRPocock, Jan 13 '14 at 19:00
You have omitted one very important part in your code: `name.sin_family = AF_INET` for `struct sockaddr_in`! Consider `struct sockaddr` to be a union of all other sockaddr types. The only thing they have in common is the address-family member at the same offset (`sa_family_t sa_family` in `struct sockaddr`), which must correspond to the actual structure type. — Ulrich Eckhardt, Jan 25 '15 at 10:47

score 78 · Accepted Answer · edited Jan 25 '15 at 09:03

No, it's not just convention.

sockaddr is a generic descriptor for any kind of socket address, whereas sockaddr_in is a struct specific to IP-based communication (IIRC, "in" stands for "InterNet"). As far as I know, this is a kind of "polymorphism": the bind() function is declared to take a struct sockaddr *, but in fact it assumes that the appropriate type of structure is passed in, i.e. one that corresponds to the type of socket you give it as the first argument.
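To make the "polymorphism" idea concrete, here is a minimal sketch of how a function that only receives a struct sockaddr * can recover the real type from the family field. The helper name describe() is hypothetical and is not part of the sockets API; it only imitates, in user space, the kind of dispatch bind() is described as doing above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper (not a real API): looks at the family tag in the
 * generic header and casts back to the concrete address type. */
static const char *describe(const struct sockaddr *sa, char *buf, size_t len)
{
    assert(sa != NULL && buf != NULL);

    switch (sa->sa_family) {
    case AF_INET: {
        /* The tag says IPv4, so the bytes really are a sockaddr_in. */
        const struct sockaddr_in *in = (const struct sockaddr_in *) sa;
        char ip[INET_ADDRSTRLEN] = "?";
        inet_ntop(AF_INET, &in->sin_addr, ip, sizeof ip);
        snprintf(buf, len, "IPv4 %s:%u", ip, (unsigned) ntohs(in->sin_port));
        break;
    }
    case AF_UNIX: {
        /* The tag says Unix domain, so the bytes are a sockaddr_un. */
        const struct sockaddr_un *un = (const struct sockaddr_un *) sa;
        snprintf(buf, len, "Unix path %s", un->sun_path);
        break;
    }
    default:
        snprintf(buf, len, "unknown family %d", (int) sa->sa_family);
        break;
    }
    return buf;
}
```

A caller would fill in a struct sockaddr_in (including sin_family = AF_INET) and pass (struct sockaddr *) &name, exactly as in the bind() call from the question.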

sashoalm

answered Jan 13 '14 at 18:59

Just to add: `sockaddr_in6` is for IPv6 addresses, `sockaddr_un` for Unix domain sockets, ... – Martin R Jan 13 '14 at 19:00
@MartinR I was also thinking about Bluetooth (if I'm not mistaken, Linux does RFCOMM over sockets), etc. – Jan 13 '14 at 19:01
There are a lot of socket types that I do not know ... It might be worth mentioning `struct sockaddr_storage`, which is also "generic" in some sense and is large enough to hold any type of socket address. – Martin R Jan 13 '14 at 19:06
To put it in layman's terms, if a function accepts `Cat` then we can't use it with `Dog`, but if the function accepts `Animal`, then we can put any `Animal` into the function, such as `Cat`, `Dog`, `Tiger`, etc. (Well, yeah, C is not an OOP language, but you get the point :) – starriet Apr 25 '23 at 15:16

Mihir Luthra · Answer 2 · 2019-08-09T15:21:31.950

I don't know if it's very relevant to this question, but I would like to provide some extra info that may make the typecast more understandable, as many people who haven't spent much time with C get confused on seeing such a typecast.

I use macOS, so I am taking examples based on header files from my system.

struct sockaddr is defined as follows:

struct sockaddr {
    __uint8_t       sa_len;         /* total length */
    sa_family_t     sa_family;      /* [XSI] address family */
    char            sa_data[14];    /* [XSI] addr value (actually larger) */
};

struct sockaddr_in is defined as follows:

struct sockaddr_in {
    __uint8_t       sin_len;
    sa_family_t     sin_family;
    in_port_t       sin_port;
    struct  in_addr sin_addr;
    char            sin_zero[8];
};

Starting from the very basics, a pointer just contains an address. So struct sockaddr * and struct sockaddr_in * are pretty much the same: they both just store an address. The only relevant difference is how the compiler treats the objects they point to.

So when you say (struct sockaddr *) &name, you are just tricking the compiler and telling it that this address points to a struct sockaddr type.

So let's say the pointer is pointing to location 1000. If a struct sockaddr * stores this address, the compiler will treat the memory from 1000 to 1000 + sizeof(struct sockaddr) as holding the members of that structure definition. If a struct sockaddr_in * stores the same address, it will treat the memory from 1000 to 1000 + sizeof(struct sockaddr_in) that way.

When you typecast that pointer, the compiler interprets the same sequence of bytes, up to sizeof(struct sockaddr), as a struct sockaddr.

struct sockaddr *a = (struct sockaddr *) &name; // consider &name = 1000

Now if I access a->sa_len, the compiler reads sizeof(__uint8_t) bytes starting at location 1000, which is the same size and position as sin_len in sockaddr_in. So this accesses the same sequence of bytes.

The same pattern holds for sa_family.

After that there is a 14-byte character array in struct sockaddr, which overlays in_port_t sin_port (a typedef'd 16-bit unsigned integer = 2 bytes), struct in_addr sin_addr (simply a 32-bit IPv4 address = 4 bytes) and char sin_zero[8] (8 bytes). These three add up to 14 bytes.

These three are stored in that 14-byte character array, and we can access any of them by reading the appropriate indices and typecasting again.
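The overlay described above can be checked rather than taken on faith. The following is a small sketch; the function name port_visible_through_sa_data() is made up for illustration. It stores a port through the sockaddr_in view and reads it back out of sa_data through the generic view. It assumes the common layout in which sin_port starts exactly where sa_data starts, and asserts that assumption instead of relying on it silently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 if a port written via sockaddr_in can be
 * read back from the first bytes of sockaddr's sa_data array. */
static int port_visible_through_sa_data(uint16_t port_host_order)
{
    struct sockaddr_in name;
    memset(&name, 0, sizeof name);
    name.sin_family = AF_INET;
    name.sin_port = htons(port_host_order);   /* stored in network byte order */

    /* Same address, viewed as the generic type. */
    const struct sockaddr *generic = (const struct sockaddr *) &name;

    /* Assumption made explicit: the port sits where sa_data begins. */
    assert(offsetof(struct sockaddr_in, sin_port) ==
           offsetof(struct sockaddr, sa_data));

    /* Copy the first two bytes of sa_data back into a port-sized integer. */
    in_port_t copied;
    memcpy(&copied, generic->sa_data, sizeof copied);

    return generic->sa_family == AF_INET && ntohs(copied) == port_host_order;
}
```

On the layouts shown above, sin_addr would likewise be found at sa_data + 2 and sin_zero at sa_data + 6.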

user529758's answer already explains the reason to do this.

What troubles me is I can understand either a `void*` + size OR a "weak" type that you cast your specific types to with the call assuming a common size but I have a hard time wrapping my head around both... — cassepipe, Oct 11 '22 at 13:49
The key part: _"After that there is a 14 byte character array in `struct sockaddr` ... These 3 add up to make 14 bytes. ... Now these three are stored in this 14 bytes character array ..."_ — starriet, Apr 25 '23 at 15:24

Magnus Reftel · Answer 3 · 2018-05-28T06:40:03.307

6

This is because bind can bind other types of sockets than IP sockets, for instance Unix domain sockets, which have sockaddr_un as their address type. The address of an AF_INET socket consists of a host and a port, whereas the address of an AF_UNIX socket is a filesystem path.
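As a rough illustration of the two address kinds, the sketch below fills in one of each. The helper names make_inet_addr() and make_unix_addr() are hypothetical, and the path used in any call is a placeholder chosen by the caller. Each helper returns the length that would be passed as the third argument of bind().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: an IPv4 address is a host (here: any) plus a port. */
static socklen_t make_inet_addr(struct sockaddr_in *out, uint16_t port)
{
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    out->sin_addr.s_addr = htonl(INADDR_ANY);
    return (socklen_t) sizeof *out;
}

/* Hypothetical helper: a Unix domain address is a filesystem path. */
static socklen_t make_unix_addr(struct sockaddr_un *out, const char *path)
{
    size_t n = strlen(path);
    assert(n < sizeof out->sun_path);      /* path must fit, including '\0' */

    memset(out, 0, sizeof *out);
    out->sun_family = AF_UNIX;
    memcpy(out->sun_path, path, n + 1);
    return (socklen_t) sizeof *out;
}
```

Either result is then handed to the same call, e.g. `bind(fd, (struct sockaddr *) &addr, len)`, where fd must have been created with the matching family (AF_INET or AF_UNIX).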

edited May 28 '18 at 06:40

answered Jan 13 '14 at 19:00
