
I understand that on the wire, most integers are in big-endian format.

But why is the burden of doing the byte swapping in structures like sockaddr_in placed on the application and not on the kernel, where all the low-level work actually happens? It would make more sense if the userspace API were more platform-agnostic and did not have to deal with this.
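For concreteness, here is a minimal sketch of the kind of byte swapping I mean (the port and address below are just placeholders):

```c
#include <arpa/inet.h>   /* htons, inet_pton */
#include <netinet/in.h>  /* struct sockaddr_in, AF_INET */
#include <string.h>      /* memset */

/* Fill in an IPv4 destination; the application has to remember htons() here. */
static void fill_addr(struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_port = htons(8080);                     /* host -> network byte order */
    inet_pton(AF_INET, "192.0.2.1", &addr->sin_addr); /* stored in network order */
}
```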

Why was the Berkeley socket API designed like this?

Rishikesh Raje
0x400921FB54442D18
  • Are you talking about filling in `sockaddr_in` structures by hand and similar? Or are you talking about protocols that exist at a higher level than TCP, and having to convert the data that appears in buffers filled by `recv`? – zwol Jun 01 '19 at 15:15
  • @zwol I'm talking about the first case. – 0x400921FB54442D18 Jun 01 '19 at 15:15
  • Thanks. I don't know the answer to this question myself but I think it's a valid question and I hope someone does know. – zwol Jun 01 '19 at 15:17
  • (A technicality you might want to correct: integers in IP, TCP, and UDP headers are in big-endian format, but some other protocols do use integers in little-endian format.) – zwol Jun 01 '19 at 15:18
  • If you're going to ask about every design deficiency of old APIs, you'll be busy for a long time. Yes, the socket API is awful, much like a lot of the Unix/Posix APIs. – EOF Jun 01 '19 at 15:19
  • It's not really the same question. This question is asking about the design decision. The one you pointed to, @Antti Haapala, was asking about the technical necessity, or how things could work that way, rather than why it was designed thusly. – Swiss Frank Jun 01 '19 at 17:05
  • Machines used in the early 1980s that were capable of booting Unix were all big-endian machines. – Hans Passant Jun 01 '19 at 19:29
  • `ntohs` etc. have been around almost as long as sockets, so it hasn't been much of a burden to code authors. Besides, many people writing on sockets in the early days were developing protocols, so expected low-level access to bits and bytes. – stark Jun 01 '19 at 21:01

2 Answers


The reason is probably historical.

The socket API was invented (in the 1980s) when Sun-3 (MC68030) and Sun-4 (SPARC) workstations were kings. The endianness of these (slow by today's standards) processors mattered.

I forget the details, and the BSD socket conventions were probably invented for some PDP-11 or VAX-11/780.

But why is the burden of doing the byte swapping in structures like sockaddr_in placed on the application and not on the kernel

Probably because in the 1980s you did not want the computer (a thousand times slower than your mobile phone) to spend too much (uninterruptible) time in kernel-land.

That question should really be asked on https://retrocomputing.stackexchange.com/ (and its answer lies inside the source code of some 1980s-era Unix kernel).

Basile Starynkevitch

The only technical advantage I can think of is that it allows the application to do the conversion once and cache it.

Then, for the myriad calls to, say, sendto() for UDP, the already-converted address is supplied to the OS, which can copy it as-is directly into outgoing network packets.

The alternative of doing the conversion in the kernel would require every call to sendto() to take what the application knows as the same address, and reconvert it every time.

Since sendto() benefits from this, the rest of the API was made to work the same way.
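A minimal sketch of that pattern (the port, address, and payload below are arbitrary placeholders): the sockaddr_in is converted once up front and then handed unchanged to every sendto() call.

```c
#include <arpa/inet.h>    /* htons, inet_pton */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <string.h>       /* memset */
#include <sys/socket.h>   /* socket, sendto */
#include <unistd.h>       /* close */

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return 1;

    /* Convert the destination address once... (values are placeholders) */
    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9000);                     /* host -> network order, done once */
    inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);

    /* ...then reuse it unchanged; the kernel copies it as-is into each packet. */
    for (int i = 0; i < 1000; i++) {
        static const char msg[] = "tick";
        sendto(fd, msg, sizeof msg - 1, 0,
               (const struct sockaddr *)&dst, sizeof dst);
    }

    close(fd);
    return 0;
}
```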

Swiss Frank