
Suppose I have two independent Linux processes, one of which must update a hash table in the other. I need to send two C structures in a single message (to avoid transactionality issues):

struct key
{
        uint32_t id;
};

struct value
{
        uint8_t arr[16];
        uint32_t var;
        uint32_t var2;
};

Is there a best-practice way of implementing this?

I see two main options (both with iovec):

  1. Send two structs through 2 iovec elements:

    struct key key;
    key.id= 123;
    struct value value;
    memset(value.arr, 0, sizeof(value.arr));
    value.var= 456;
    value.var2= 101;
    
    struct iovec iov[2];
    iov[0].iov_base = &key;
    iov[0].iov_len = sizeof(key);
    iov[1].iov_base = &value;
    iov[1].iov_len = sizeof(value);
    struct msghdr msg;
    ...
    if (sendmsg(sockfd, &msg, 0) < 0) {
    ...
    

    The advantage of this method is that the API (key-value) is strictly defined by a common header file in which struct key and struct value are declared.
    The disadvantage is that it requires struct packing (#pragma pack), because we can't rely on padding/gaps being the same in two independent binaries (the C standard guarantees only the order of struct members). Packing can decrease performance because of unaligned access, so it leads to two struct definitions: one for transport (the packet) and another for internal use.

  2. Send each struct member as a separate iovec element:

    struct key key;
    key.id= 123;
    struct value value;
    memset(value.arr, 0, sizeof(value.arr));
    value.var= 456;
    value.var2= 101;
    
    struct iovec iov[4];
    iov[0].iov_base = &key.id;
    iov[0].iov_len = sizeof(key.id);
    iov[1].iov_base = &value.arr;
    iov[1].iov_len = sizeof(value.arr);
    iov[2].iov_base = &value.var;
    iov[2].iov_len = sizeof(value.var);
    iov[3].iov_base = &value.var2;
    iov[3].iov_len = sizeof(value.var2);
    struct msghdr msg;
    ...
    if (sendmsg(sockfd, &msg, 0) < 0) {
    ...
    

    Advantage - no struct packing needed.
    Disadvantage - the API is greatly weakened.
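For reference, option 1 can be made into a complete round trip. The following is only a minimal sketch under stated assumptions: the helper names `send_pair`, `recv_pair` and `roundtrip_demo` are hypothetical, the socket is an `AF_UNIX` datagram `socketpair()` (chosen here so that one `sendmsg()` is one message), and the `static_assert` encodes the assumption that neither struct contains padding on the target ABI.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

struct key   { uint32_t id; };
struct value { uint8_t arr[16]; uint32_t var; uint32_t var2; };

/* Assumption: no padding on this ABI. The build fails if that is wrong. */
static_assert(sizeof(struct key) == 4 && sizeof(struct value) == 24,
              "unexpected padding in struct key / struct value");

/* Hypothetical helper: gather both structs into one message. */
static int send_pair(int fd, const struct key *k, const struct value *v)
{
    struct iovec iov[2];
    struct msghdr msg;

    iov[0].iov_base = (void *)k;
    iov[0].iov_len  = sizeof(*k);
    iov[1].iov_base = (void *)v;
    iov[1].iov_len  = sizeof(*v);

    memset(&msg, 0, sizeof(msg));   /* no address, no control data */
    msg.msg_iov    = iov;
    msg.msg_iovlen = 2;

    ssize_t n = sendmsg(fd, &msg, 0);
    return n == (ssize_t)(sizeof(*k) + sizeof(*v)) ? 0 : -1;
}

/* Hypothetical helper: scatter one message back into both structs. */
static int recv_pair(int fd, struct key *k, struct value *v)
{
    struct iovec iov[2];
    struct msghdr msg;

    iov[0].iov_base = k;
    iov[0].iov_len  = sizeof(*k);
    iov[1].iov_base = v;
    iov[1].iov_len  = sizeof(*v);

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov    = iov;
    msg.msg_iovlen = 2;

    ssize_t n = recvmsg(fd, &msg, 0);
    return n == (ssize_t)(sizeof(*k) + sizeof(*v)) ? 0 : -1;
}

/* Sends one key/value pair through a local socket pair and reads it back. */
int roundtrip_demo(struct key *k_out, struct value *v_out)
{
    int sv[2];
    struct key   k = { .id = 123 };
    struct value v = { .arr = {0}, .var = 456, .var2 = 101 };

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    int rc = send_pair(sv[0], &k, &v);
    if (rc == 0)
        rc = recv_pair(sv[1], k_out, v_out);

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Calling `roundtrip_demo(&k, &v)` should return 0 and fill `k` and `v` with the values that were sent. In a real setup the two ends would live in different processes and the structs would come from the shared header.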

Which way is worse? Are there any other options?

NK-cell
  • Just because you pack/unpack when you send/receive, you don't have to suffer alignment penalties. You can keep it aligned in memory and just pack it when you send it. It requires copying the data into an intermediate buffer though. – Ted Lyngmo Jul 10 '23 at 13:25
  • Don't forget about byte order. Use the ntoh and hton functions. – wcochran Jul 10 '23 at 13:30
  • @wcochran Does byte order even matter with Unix domain sockets?? O_o – NK-cell Jul 10 '23 at 13:35
  • @TedLyngmo Yep, I meant intermediate copying. It provokes unaligned access, and if this hashtable is big enough it can play a role imho.. especially in highload applications – NK-cell Jul 10 '23 at 13:39
  • If you have your struct perfectly aligned and just put the fields in an intermediate buffer when you want to send the struct, there won't be any unaligned access. You reverse the process on the other side of course. – Ted Lyngmo Jul 10 '23 at 13:42
  • @TedLyngmo How can it be *perfectly* aligned after packing? – NK-cell Jul 10 '23 at 13:43
  • Because they are in a plain `char[]`. Only the object representations are there, and you will never access those objects through a misaligned pointer. – Ted Lyngmo Jul 10 '23 at 13:44
  • @TedLyngmo You mean to serialize uint32 to char array? Looks like a kind of hack.. But maybe I understood wrong – NK-cell Jul 10 '23 at 13:46
  • No, that's exactly what I meant: [example without endianness fixes](https://godbolt.org/z/34PM4E4nE). It's not a hack. It's how serializing is usually done. It's very similar to storing the data in binary files. – Ted Lyngmo Jul 10 '23 at 13:48
  • You should be using network byte order even with Unix domain sockets. – wcochran Jul 10 '23 at 13:57
  • Writing/reading on a Unix domain socket is for IPC on the same host, so there is no need for network byte order. – ulix Jul 10 '23 at 14:10
  • Doesn't the 3rd parameter of sendmsg need to be the size of the message? – cup Jul 10 '23 at 14:18
  • @cup 3rd parameter of sendmsg() contains flags. – Ian Abbott Jul 10 '23 at 14:26
  • Struct member alignment padding (without pragma pack) should be consistent across different binaries on the same arch and is defined in the ABI. I would not expect your `struct key` and `struct value` to contain any padding bytes whatsoever. – Ian Abbott Jul 10 '23 at 14:45
  • @IanAbbott Are you sure **struct member alignment padding** is defined by the ABI? How can I be sure? Where to read? – NK-cell Aug 08 '23 at 17:59
  • @NK-cell For example, the System V ABI AMD64 Architecture Processor Supplement, Draft 1.0, section 3.1.2 has a table of fundamental alignments for scalar types in Figure 3.1, and the "Aggregates and Unions" subsection states: *"Structures and unions assume the alignment of their most strictly aligned component. Each member is assigned to the lowest available offset with the appropriate alignment."* – Ian Abbott Aug 09 '23 at 09:32
  • @NK-cell System V ABI Intel386 Architecture Processor Supplement version 1.1 has similar info in section 2.1.1, Table 2.1, and the "Structures and Unions" subsection. – Ian Abbott Aug 09 '23 at 09:39
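The serialization approach described in the comments above (aligned structs in memory, a plain byte buffer on the wire) could be sketched roughly as follows. This is an illustration only: `pack_record`, `unpack_record`, `put`, `get` and `WIRE_SIZE` are hypothetical names, and no byte-order conversion is done, on the assumption that both processes run on the same host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct key   { uint32_t id; };
struct value { uint8_t arr[16]; uint32_t var; uint32_t var2; };

/* Wire image: id (4) + arr (16) + var (4) + var2 (4), with no gaps. */
enum { WIRE_SIZE = 4 + 16 + 4 + 4 };

/* The packed image can never be larger than the in-memory structs. */
static_assert(WIRE_SIZE <= sizeof(struct key) + sizeof(struct value),
              "wire image larger than the structs it is built from");

/* Copy n bytes into the buffer at offset off; return the next offset. */
static size_t put(unsigned char *buf, size_t off, const void *src, size_t n)
{
    memcpy(buf + off, src, n);
    return off + n;
}

/* Copy n bytes out of the buffer at offset off; return the next offset. */
static size_t get(const unsigned char *buf, size_t off, void *dst, size_t n)
{
    memcpy(dst, buf + off, n);
    return off + n;
}

/* Serialize field by field. memcpy never dereferences a misaligned
 * uint32_t pointer, so no unaligned access is involved. */
size_t pack_record(unsigned char buf[WIRE_SIZE],
                   const struct key *k, const struct value *v)
{
    size_t off = 0;
    off = put(buf, off, &k->id,   sizeof(k->id));
    off = put(buf, off, v->arr,   sizeof(v->arr));
    off = put(buf, off, &v->var,  sizeof(v->var));
    off = put(buf, off, &v->var2, sizeof(v->var2));
    return off;   /* number of bytes written */
}

/* Reverse of pack_record: fill properly aligned structs from the bytes. */
size_t unpack_record(const unsigned char buf[WIRE_SIZE],
                     struct key *k, struct value *v)
{
    size_t off = 0;
    off = get(buf, off, &k->id,   sizeof(k->id));
    off = get(buf, off, v->arr,   sizeof(v->arr));
    off = get(buf, off, &v->var,  sizeof(v->var));
    off = get(buf, off, &v->var2, sizeof(v->var2));
    return off;   /* number of bytes consumed */
}
```

The buffer filled by `pack_record()` can then be handed to `send()` or used as a single iovec element, and the receiver calls `unpack_record()` on the bytes it read. The cost is one extra copy per record in each direction.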

1 Answer


Is there a best-practice way of implementing this?

I'd use an auxiliary struct to include both, so content is properly aligned and both structures can be read from the socket in one shot.

struct pair {
    struct key   the_key;
    struct value the_value;
};

and use this structure when sending/receiving packets. TCP sockets (and Unix stream sockets too) deliver the data intact and in order, so with fixed-size records the boundaries can be established without any extra synchronization, short of a connection loss. Alignment shouldn't be an issue this way: both structures have the alignment requirement of a uint32_t, so no gap is expected between them.
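A rough sketch of this idea, under a few assumptions: the `static_assert` lines turn the "no gaps expected" claim into a compile-time check (the sizes are what common Linux ABIs produce, not something the C standard guarantees), and `write_all`, `read_all` and `pair_roundtrip_demo` are hypothetical helper names. The loops are there because a single `read()` or `write()` on a stream socket may transfer fewer bytes than requested.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

struct key   { uint32_t id; };
struct value { uint8_t arr[16]; uint32_t var; uint32_t var2; };
struct pair  { struct key the_key; struct value the_value; };

/* Assumed layout: 4 + 24 bytes, no padding. The build fails otherwise. */
static_assert(sizeof(struct pair) == 28, "unexpected padding in struct pair");
static_assert(offsetof(struct pair, the_value) == sizeof(struct key),
              "unexpected gap between key and value");

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes; fails on error or if the peer closes early. */
static int read_all(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;   /* connection closed mid-record */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sends one struct pair over a local stream socket pair and reads it back. */
int pair_roundtrip_demo(struct pair *out)
{
    int sv[2];
    struct pair p = {
        .the_key   = { .id = 123 },
        .the_value = { .arr = {0}, .var = 456, .var2 = 101 },
    };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    int rc = write_all(sv[0], &p, sizeof(p));
    if (rc == 0)
        rc = read_all(sv[1], out, sizeof(*out));

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Because every record has the same size, the receiver can simply call `read_all()` with `sizeof(struct pair)` in a loop and each call yields one complete key/value update.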

Luis Colorado