how to know the data representation when sending a struct to another process

Question

Assuming I want to transmit data from one process (1) to another (2).

struct test {
   uint8_t a;
   uint8_t b;
};

This structure is sent via a buffer. I use a function of the second process to transmit the data:

write(unsigned char* buffer, int size)

In main I do:

int main(){
   test sendData;
   unsigned char buffer[2];
   sendData.a = 10;
   sendData.b = 20;
   std::memcpy(buffer,&sendData,sizeof(test));
   _device.write(buffer, sizeof(buffer));// assume that _device is my second process instance
}

How can I know which data representation to use to communicate with the second process?

Is it big endian or little endian, or does it depend on the processor architecture? How can I find out?

There is no general solution. You should know everything about the second process, its target architecture, what the transmitting device expects... In particular, if the second process is your app as well, you already know most of the stuff. If it's someone else's, you have to read docs or reverse-engineer. — yeputons, Nov 24 '21 at 18:14
@yeputons And if I want to know how it is represented on my side, do I debug and look at how the data is laid out in the buffer? — anes47, Nov 24 '21 at 18:23
If your data gets more complex than in this example you can look into something like [flatbuffers](https://google.github.io/flatbuffers/) — tromgy, Nov 24 '21 at 18:44
I think you'll find this useful: https://stackoverflow.com/q/69983795/4561887. I've posted serialization strategies, including one which avoids endianness concerns altogether because it is endianness agnostic. This is useful when communicating between separate computers of different endianness. But, if just separate processes within the same computer, endianness will be the same for all processes. — Gabriel Staples, Nov 27 '21 at 08:49
If we talk about "processes", we are on the same machine. In that case it can be assumed that the data layout will be the same, provided all executables are generated with the same compiler and compiler settings. In general it is a good idea to use one of the serializer libraries. That makes it much easier to maintain the code and to run cross-platform, over a network, or when writing directly to a database or files. — Klaus, Nov 27 '21 at 08:51
@GabrielStaples We have written our own which can connect to files/network/databases and enables format changes like xml, plain text or binary and also creates gui dialogs on the fly for rapid prototyping. But boost is a good first choice. — Klaus, Nov 27 '21 at 09:00

score 1 · Answer 1 · answered Nov 24 '21 at 19:34

Do you already have _device.write() working in any fashion? That by itself can be tricky and involves either a starting process setting up input and output files or doing socket programming. That's a bigger question than you've asked. You've basically asked about binary representation.

If your two processes are running on the same computer from the same source code definition of your structure, then a pretty simple write of the raw data is safe.

But if your processes are running across a network, possibly on different types of hardware, you absolutely can't rely on raw writes, because you don't know whether the endianness of the two sides matches.
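For the narrower question of how the sending process lays out its own bytes, a small run-time probe can answer it without a debugger. This is only a sketch: host_is_little_endian is a hypothetical helper name, not something from the question's code.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reports whether this process stores the least
// significant byte of a multi-byte integer at the lowest address.
bool host_is_little_endian() {
    const std::uint16_t probe = 0x0102;
    unsigned char bytes[sizeof probe];
    // Copy the object representation so we can look at the raw bytes.
    std::memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x02;  // low byte first means little endian
}
```

With C++20, std::endian::native from <bit> gives the same answer at compile time. Note that a struct made only of uint8_t members, like the one in the question, has no byte order to worry about; endianness only matters for fields wider than one byte.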

So you have two basic choices. You can serialize the data into an external representation. Unless you're sending a lot of data, this is what most people probably do. For instance, you can turn this into a JSON string and send that. This is what a LOT of HTTP REST calls do.
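As a rough illustration of the text-based route, here is a hand-rolled sketch for a struct shaped like the one in the question. Message, to_json and from_json are hypothetical names, and the parser only accepts exactly the format the encoder produces; real code would more likely use an existing JSON library.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Mirrors the two-byte struct from the question (name is hypothetical).
struct Message {
    std::uint8_t a;
    std::uint8_t b;
};

// Encode as text, e.g. {"a":10,"b":20}. Text has no byte order.
std::string to_json(const Message& m) {
    return "{\"a\":" + std::to_string(m.a) +
           ",\"b\":" + std::to_string(m.b) + "}";
}

// Decode the exact format produced by to_json; returns false on mismatch.
bool from_json(const std::string& text, Message& out) {
    unsigned a = 0, b = 0;
    if (std::sscanf(text.c_str(), "{\"a\":%u,\"b\":%u}", &a, &b) != 2) {
        return false;
    }
    if (a > 255 || b > 255) {
        return false;  // value does not fit in uint8_t
    }
    out.a = static_cast<std::uint8_t>(a);
    out.b = static_cast<std::uint8_t>(b);
    return true;
}
```

The receiver never sees the sender's in-memory layout, so padding and endianness differences stop mattering, at the cost of a few extra bytes on the wire.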

The other choice is to use the network byte order calls. You wouldn't read or write the raw object. You'd write a method that reads or writes it, and it would use calls like htons, htonl, etc.
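A minimal sketch of that approach, assuming a POSIX system where htons and ntohs come from <arpa/inet.h>. The Sample struct, its 16-bit count field, and the encode/decode names are made up for illustration, since the struct in the question has only single-byte fields and so has nothing to swap.

```cpp
#include <arpa/inet.h>  // htons, ntohs (POSIX; Windows uses <winsock2.h>)
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical message with one multi-byte field.
struct Sample {
    std::uint8_t  a;
    std::uint16_t count;
};

// Size on the wire: 1 byte for a, 2 bytes for count, no padding.
constexpr std::size_t kSampleWireSize = 3;

// Write the fields one by one; count goes out in network (big-endian) order.
void encode(const Sample& s, unsigned char* out) {
    out[0] = s.a;
    const std::uint16_t be = htons(s.count);
    std::memcpy(out + 1, &be, sizeof be);
}

// Read the fields back, converting count to the host's byte order.
Sample decode(const unsigned char* in) {
    Sample s{};
    s.a = in[0];
    std::uint16_t be = 0;
    std::memcpy(&be, in + 1, sizeof be);
    s.count = ntohs(be);
    return s;
}
```

Because each field is written individually, the compiler's padding inside Sample never reaches the wire, and both sides agree on the byte order regardless of their hardware.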

Personally, I haven't had to do that in a few decades. Networks are fast, and most data is pretty small. You can quite likely use JSON instead.

Unless you're writing an MMORPG.

The second process has a driver implemented on my machine, with a .h in which the write() function is declared, and the proxy runs on another machine. The problem is that the second process must know how the data is represented in my process so that it knows how to proceed (swap if necessary). My question is how I can know how the data is represented in my process. Do I debug and look at how the buffer content is laid out? — anes47, Nov 24 '21 at 23:39

eberhard · Answer 2 · 2021-11-27T08:45:03.773

Network communication is organized in layers. In your example, two layers are missing. I cannot see which transport layer you are using, but I assume a TCP socket. A TCP socket is a stream-oriented transport. This means that when you send multiple messages, the destination does not know where a single message starts or ends. A single read may return only part of a message, or several messages at once.

So the first missing layer is a protocol that keeps track of complete messages. You can do this with a header that contains the message length, or with a delimiter character at the end of each message. So that the destination knows the type of each message, you can include the message name in the protocol, for example encoded in the protocol header.
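The length-header variant could look roughly like the sketch below. The 2-byte big-endian length prefix and the names Bytes, frame and try_unframe are assumptions chosen for illustration; they are not taken from finalmq or any other library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Prepend a 2-byte big-endian length to the payload.
// Assumes payload.size() fits in 16 bits (at most 65535 bytes).
Bytes frame(const Bytes& payload) {
    Bytes out;
    out.reserve(payload.size() + 2);
    out.push_back(static_cast<unsigned char>((payload.size() >> 8) & 0xFF));
    out.push_back(static_cast<unsigned char>(payload.size() & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Try to take one complete message off the front of the receive buffer.
// Returns false and leaves rx untouched if the message is still incomplete.
bool try_unframe(Bytes& rx, Bytes& msg) {
    if (rx.size() < 2) {
        return false;  // length header not fully received yet
    }
    const std::size_t len = (static_cast<std::size_t>(rx[0]) << 8) | rx[1];
    if (rx.size() < 2 + len) {
        return false;  // payload not fully received yet
    }
    const auto first = rx.begin() + 2;
    const auto last = first + static_cast<std::ptrdiff_t>(len);
    msg.assign(first, last);
    rx.erase(rx.begin(), last);
    return true;
}
```

The receiver appends whatever each read returns to rx and calls try_unframe in a loop until it returns false, which handles both partial messages and several messages arriving in one read.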

The second missing layer is the data encoding, which solves the problem of differing platforms (byte order, data alignment). Here you can use JSON or protobuf.

There are frameworks available that can solve your problem. One of them is finalmq: https://github.com/bexoft/finalmq

Its readme also explains the different layers. On each layer you can choose among alternatives, for example:

json with line feed delimiter
protobuf with message header
json with http (you can access your server with a browser)
protobuf with MQTT (for IoT applications).
