4

Warning/Disclaimer:

This question contains hearsay, but in the half hour or so of research I did, I could not find anything that settles the claims stated below. I am just curious whether someone here already knows about this.

This question has no code. Just technical queries.

Background:

I have a legacy application, which uses C-style structs passed between processes for interprocess communication. And this works quite well and has been working for many many years, long before even I was on this planet :P.

I was supposed to write a new process that would become part of this application. Unwittingly, I wrote it in C++, assuming whatever IPC we are using could handle this. Unfortunately, I then found out (from colleagues) that the existing infrastructure can only pass C-style structs.

'Unverified' claims/statements:

In addition, one of my colleagues listed the following reasons why C++ was a bad choice in this case.

  1. C++ objects have vtables. C-style structs are just variables and values. Therefore C-style structs can be passed around processes, while C++ objects cannot be.

  2. With C-style structs we can embed information like size of the struct, so that both sides know what to expect and what to send, but for C++ objects this is not possible since 'the size of the vtable could vary'.

  3. 'If we change compilers, then it is even worse. We would have even more permutations to deal with for the case of C++ objects.'

Investigating the claims:

Needless to say, this colleague has a little bias for C, but he is much more experienced than me and probably knows what he is talking about. I am language-agnostic. But this immediately got me thinking. How can it be that we cannot do interprocess communication with C++? I googled and the first hits were invariably from stackoverflow, like this one:

Inter-Process Communication Recommendation

And I looked up the different methods of IPC listed here: https://en.wikipedia.org/wiki/Inter-process_communication#Approaches

I mean, I followed up on each of the methods in that list, like pipes or shared memory, and the only caveats that everyone keeps pointing out are that pointers (duh! of course) cannot be passed like this and that some synchronization issues could creep up, depending on the case.

BUT nowhere could I find something that could refute or corroborate his 'claims'. (Of course, I could continue on digging for the rest of the day. :P)

Questions:

  1. Are his three claims really true, or was it just FUD? Consider that all I have in the objects I wanted to pass around are POD variables, some STL containers like std::vector and std::pair with their values (no pointers or anything), and the getters for those variables. There are no virtual functions except the virtual destructor, which exists because I derived all the messages from one base message class; at the time, I thought there might be some common base functionality. (I could get rid of this base class quite easily now, since nothing is really common there so far! Thankfully, for some reason I kept the parsing and formatting of messages in a separate class. Luck or foresight? :D )

  2. It also actually makes me wonder, how does the compiler know when a struct is a C-style struct since we are using the g++ compiler for the whole project anyways? Is it the use of the 'virtual' keyword?

  3. I am not asking for a solution for my case. I can wrap the results from those objects into structs and pass them on through the IPC or I could get rid of base class and virtual destructor as stated in 'my' point 1 above.

Boost or any C++11 stuff or any library that handles this is not desired. Any suggestions in this regard are tangential to the question at hand.

(p.s. Now that I posted and re-read what I posted, I want to nip the thought in the bud that might be creeping up in any reader's head who reads this, that ... I am asking this for my knowledge, and not for arguing with that colleague. Skepticism is good, but would be nice for the community if we all assumed others to have good intentions.:) )

Duck Dodgers
  • Even if you don't use Boost, you can look at the documentation for Boost.Interprocess to see what's possible, what isn't, and what gotchas and workarounds there are. – Quentin Dec 20 '18 at 14:46

3 Answers

5

only caveat that everyone keeps on pointing out, is, that pointers (duh! of course) cannot be passed like this

Pointer values (as well as other references to memory and resources) are indeed meaningless across processes. This is obviously a consequence of virtual memory.

Another caveat is that, while the C standard together with the platform ABI pins down the memory layout of structs, the C++ standard doesn't guarantee a particular memory layout for classes in general. For example, one process doesn't necessarily agree with another on the amount of padding between members, even within the same system. C++ only guarantees the memory layout of standard layout types, and this guaranteed layout matches that of C structs.
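As a rough illustration of the standard layout point, a compiler with C++11 type traits can be asked whether a given message type qualifies. This is only a sketch: `SensorReading` and its members are hypothetical names, and the use of `<type_traits>` assumes C++11 is available at least for the check.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical message type: fixed-width scalars only, no virtual functions,
// no base classes, and all data members with the same access level.
struct SensorReading {
    std::uint32_t sensor_id;
    std::int32_t  value;
    std::uint8_t  flags;

    // A non-virtual member function does not change the object layout.
    bool is_valid() const { return (flags & 0x01u) != 0; }
};

// Compilation fails here if someone later adds a virtual function, a
// std::vector member, or anything else that breaks these properties.
static_assert(std::is_standard_layout<SensorReading>::value,
              "SensorReading must stay layout-compatible with a C struct");
static_assert(std::is_trivially_copyable<SensorReading>::value,
              "SensorReading must be safe to copy byte by byte");
```

The second assertion is a separate property: standard layout is about where the members sit, while trivially copyable is about whether copying the raw bytes yields a valid object.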


... and some STL containers like std::vector ... (no pointers or anything)

All standard containers except std::array use pointers internally. They have to, because their size is dynamic and they must therefore allocate their data structures dynamically. Also, none of them are guaranteed to be standard layout classes. Furthermore, the class definitions of one standard library implementation are not guaranteed to match those of another, and two processes can use different standard libraries. This is not at all uncommon on Linux, where some processes might use libstdc++ (from GNU) while others might use libc++ (from the LLVM project, typically used with Clang).
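To see why copying the bytes of a container does not copy its contents, one can check where the elements actually live. The helper below is a hypothetical sketch, not part of any library; it only compares addresses.

```cpp
#include <cstdint>
#include <vector>

// Returns true if the first element of v is stored within the bytes of the
// std::vector object itself, and false if it lives elsewhere (for example in
// a separately allocated buffer that the vector merely points to).
bool elements_live_inside_object(const std::vector<int>& v) {
    if (v.empty()) return false;
    const std::uintptr_t obj_begin = reinterpret_cast<std::uintptr_t>(&v);
    const std::uintptr_t obj_end   = obj_begin + sizeof(v);
    const std::uintptr_t elem      = reinterpret_cast<std::uintptr_t>(v.data());
    return elem >= obj_begin && elem < obj_end;
}
```

For a vector holding a thousand ints, `sizeof(v)` stays at a few machine words and the helper returns false, so sending `sizeof(v)` bytes to another process would transmit only bookkeeping pointers that mean nothing in the receiver's address space.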

There are no virtual functions except the virtual destructor

In other words: There is at least one virtual function (the destructor), and therefore there is a pointer to a vtable. And also no guaranteed memory layout because classes with virtual functions are never standard layout classes.
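The effect of a virtual destructor can be checked with type traits. The type names here are made up for illustration, and the checks assume a C++11 compiler.

```cpp
#include <type_traits>

// A plain message with no base class and no virtual functions.
struct PlainMessage {
    int id;
};

// Hypothetical base class whose only virtual function is the destructor.
struct MessageBase {
    virtual ~MessageBase() {}
};

struct DerivedMessage : MessageBase {
    int id;
};

static_assert(std::is_standard_layout<PlainMessage>::value,
              "no virtual functions: layout-compatible with a C struct");
static_assert(std::is_polymorphic<DerivedMessage>::value,
              "the inherited virtual destructor makes the class polymorphic");
static_assert(!std::is_standard_layout<DerivedMessage>::value,
              "polymorphic classes are never standard layout");
static_assert(!std::is_trivially_copyable<DerivedMessage>::value,
              "so copying the raw bytes is not a valid way to copy it");
```

On common implementations `sizeof(DerivedMessage)` is also larger than `sizeof(PlainMessage)`, because each object carries a hidden pointer to its vtable, but the exact size is an implementation detail.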


So to answer the questions:

  1. Mostly no FUD, although some claims are technically a bit inaccurate:

    1. C++ objects may have vtables; not all of them do. C structures can have pointers, so not all C structures can be shared either. Some C++ objects can be shared across processes. Specifically, standard layout classes can be shared (assuming there are no pointers).
    2. Objects with vtables cannot be shared indeed.
    3. Standard layout classes have a guaranteed memory layout. Changing the compiler is not a problem as long as you restrict yourself to standard layout classes. Trying to share other classes might appear to work if you're unlucky, but you'll probably face problems when you start mixing compilers.
  2. The C++ standard defines the exact conditions under which a class is standard layout. All C struct definitions are standard layout classes in C++. The compiler knows those rules.

  3. This is not a question.


Conclusion: You can use C++ for IPC, but you're limited to standard layout classes in that interface. This rules out many C++ features, such as virtual functions and data members with mixed access specifiers. But not all: you can still have member functions, for example.

Do note however, that using C++ features may cause the inter process interface to only work with C++. Many languages can interface with C, but hardly any can interface with C++.

Furthermore: If your "interprocess" communication goes beyond the boundaries of the system - across network that is - even a C structure or standard layout class is not a good representation. In that case you need serialisation.
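A minimal sketch of what serialisation means here, assuming a wire format where a 32-bit integer is always sent as four bytes, most significant byte first. The function names are hypothetical, and a real protocol would cover every field of a message in the same explicit way.

```cpp
#include <cstdint>
#include <vector>

// Appends v to out as four bytes in big-endian order, regardless of the
// byte order or struct padding used by the sending machine.
void put_u32_be(std::vector<std::uint8_t>& out, std::uint32_t v) {
    out.push_back(static_cast<std::uint8_t>((v >> 24) & 0xFFu));
    out.push_back(static_cast<std::uint8_t>((v >> 16) & 0xFFu));
    out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFFu));
    out.push_back(static_cast<std::uint8_t>(v & 0xFFu));
}

// Reads four big-endian bytes starting at p and rebuilds the integer.
// The caller must guarantee that at least four bytes are readable.
std::uint32_t get_u32_be(const std::uint8_t* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}
```

Because the byte sequence is defined by the code rather than by the compiler's layout decisions, both ends agree on it even if they differ in endianness, padding, or implementation language.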

eerorika
  • 232,697
  • 12
  • 197
  • 326
2
  1. Are his three claims really true, or was it just FUD? Consider that all I have in the objects I wanted to pass around are POD variables, some STL containers like std::vector and std::pair with their values (no pointers or anything), and the getters for those variables. There are no virtual functions except the virtual destructor, which exists because I derived all the messages from one base message class; at the time, I thought there might be some common base functionality. (I could get rid of this base class quite easily now, since nothing is really common there so far! Thankfully, for some reason I kept the parsing and formatting of messages in a separate class. Luck or foresight? :D )

No. As soon as STL containers are in your structure, you cannot pass them around like POD data. The implementation of STL containers is not specified, and they may (and in most cases do) contain pointers for internal purposes.

  2. It also actually makes me wonder, how does the compiler know when a struct is a C-style struct since we are using the g++ compiler for the whole project anyways? Is it the use of the 'virtual' keyword?

As long as your struct/class has only POD data and no virtual functions, it will be stored as POD. However, alignment differences may be an issue if the other side of your IPC has been compiled with another compiler, with different compiler settings, or with different alignment directives (such as #pragma pack).
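To make the alignment point concrete, here is a sketch of the same two members declared with and without a packing directive. `#pragma pack(push, 1)` is a compiler extension rather than standard C++, though GCC, Clang and MSVC all accept it; the struct names are made up.

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: the compiler is free to insert padding after 'tag' so that
// 'value' starts at an address suitable for a 32-bit integer.
struct DefaultLayout {
    std::uint8_t  tag;
    std::uint32_t value;
};

// Same members, but packed to 1-byte alignment: no padding is inserted.
#pragma pack(push, 1)
struct PackedLayout {
    std::uint8_t  tag;
    std::uint32_t value;
};
#pragma pack(pop)

// The packed variant is exactly the sum of its members.
static_assert(sizeof(PackedLayout) == 5, "packed struct has no padding");
static_assert(offsetof(PackedLayout, value) == 1, "value follows tag directly");
```

If one process is built with the packed declaration and the other with the default one, they disagree on both the size of the message and the position of `value`, even though the source-level member list is identical.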

  3. I am not asking for a solution for my case. I can wrap the results from those objects into structs and pass them on through the IPC or I could get rid of base class and virtual destructor as stated in 'my' point 1 above.

Wrapping the results from those objects into structs and passing them on through the IPC sounds good to me; personally, that's what I'd do. The other solution isn't bad either. It's hard to tell which one is better without context.

Jabberwocky
  • Ah! ok thanks. STL Containers! That thought did not cross my mind that internally they might be using pointers. – Duck Dodgers Dec 20 '18 at 10:29
  • @JoeyMallone `std::vector` definitely uses pointers (almost "by definition"). – Jabberwocky Dec 20 '18 at 10:30
  • @JoeyMallone Have a look at [Standard Layout](https://en.cppreference.com/w/cpp/named_req/StandardLayoutType) (and read "other programming languages" as "C or things that talk C") – Caleth Dec 20 '18 at 10:56
  • As a complement, a virtual destructor is enough for the struct not to be a POD, even if no C++ library containers are involved. – Serge Ballesta Dec 20 '18 at 11:06
  • @SergeBallesta this was included in my answer to question 2. – Jabberwocky Dec 20 '18 at 11:07
  • @Jabberwocky: I was insisting on it because OP wrote *There are no virtual functions except the virtual destructor*. Maybe my comment would be better on his question than on your answer... – Serge Ballesta Dec 20 '18 at 11:11
  • @SergeBallesta, because the vtable comes into being as soon as the virtual destructor is there. And existence of vtable means no POD. yes? – Duck Dodgers Dec 20 '18 at 11:53
  • @JoeyMallone presence of a virtual destructor => presence of virtual functions => Not POD, therefore presence of a virtual destructor => not POD. qed – Jabberwocky Dec 20 '18 at 12:16
  • @Caleth, thank you. I did not know about the standard layout. – Duck Dodgers Dec 20 '18 at 13:32
0

'Unverified' claims/statements:

C++ objects have vtables. C-style structs are just variables and values. Therefore C-style structs can be passed around processes, while C++ objects cannot be.

This statement is partly true, but it is phrased in a misleading way.

Not all C++ objects have vtables. (Technically, the C++ standards don't require vtables at all, although they are a common implementation technique for virtual function dispatch because they offer various advantages.)

If you look up this SO question and various answers you will find a discussion of aggregate and POD types in C++. The catch is that the definitions evolved between C++ standards (as reflected in various answers to that question). In C++11, the notion of POD types was changed, and effectively replaced with concepts of trivial and standard-layout types.

POD types (before C++11) and standard-layout types (C++11 and later) can be interchanged between C++ and C (i.e. passed from code written in one language to code written in the other, since the memory layout is compatible).

It is true that C++ objects with any virtual functions are among those that cannot be interchanged with C. Pointer values can't generally be copied between processes either, which rules out (most of) the C++ standard containers.

With C-style structs we can embed information like size of the struct, so that both sides know what to expect and what to send, but for C++ objects this is not possible since 'the size of the vtable could vary'.

This statement is false, since there are types that do not have a vtable, and types that can be interchanged between C++ and C.

If we change compilers, then it is even worse. We would have even more permutations to deal with for the case of C++ objects.

Again, as long as the types are chosen correctly in C++ code, it is possible to interchange them with C.

This statement - where it is true for C interoperating with C++ - is also true for C. The size of C types is formally implementation defined in C, just as it is in C++. Types like int, long, float, double are not guaranteed to have the same size with different compilers. There are compilers with settings that change the size of some or all basic types (e.g. compilers that have different floating point options, compilers that have settings affecting whether an int is 16 or 32 bit, etc).

struct types in C may also have padding between members, and the padding can vary between C compilers. A number of compilers have compilation options that affect padding, which in turn affects the size of struct types. This can introduce layout incompatibilities for the same struct type in C.
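One way to guard against such size and padding differences, sketched here with a made-up header struct, is to use fixed-width integer types and state the expected layout as compile-time assertions. `static_assert` assumes C++11 (C11 offers `_Static_assert` for the C side), and the expected numbers are assumptions chosen so that no padding is needed.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical message header. Members are ordered from the start so that
// each one already sits at a multiple of its own size, leaving no room for
// the compiler to insert padding on common platforms.
struct WireHeader {
    std::uint32_t size;       // total message size in bytes
    std::uint16_t type;       // message type code
    std::uint16_t version;    // layout version
    std::uint64_t timestamp;  // sender-defined time value
};

// Each side of the IPC compiles these checks. A build with a different
// padding or type-size setting fails to compile instead of silently
// exchanging misaligned data.
static_assert(sizeof(WireHeader) == 16, "unexpected padding in WireHeader");
static_assert(offsetof(WireHeader, size) == 0, "size moved");
static_assert(offsetof(WireHeader, type) == 4, "type moved");
static_assert(offsetof(WireHeader, version) == 6, "version moved");
static_assert(offsetof(WireHeader, timestamp) == 8, "timestamp moved");
```

This does not make the layout portable by itself (byte order can still differ between machines), but it turns a silent mismatch on one machine into a build error.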

So what is probably going on here?

The interprocess communication was probably designed with the assumption that it would always be between C code, built with the same (or compatible) compilers. The IPC mechanism is probably quite simple: for example, one process squirting a certain amount of data at a specified memory location down a pipe, and the receiver copying the data received at the other end of that pipe into an equivalent data structure.
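A guess at what such a mechanism boils down to, written against the POSIX `read` and `write` calls. The `Ping` struct and both helper functions are hypothetical; the actual legacy code is not shown in the question, and a production version would loop on short reads and writes and handle errors.

```cpp
#include <cstdint>
#include <unistd.h>  // POSIX: read, write, ssize_t

// Hypothetical message: carries its own size so the receiver can sanity-check it.
struct Ping {
    std::uint32_t size;
    std::uint32_t sequence;
};

// Sender side: copy the raw bytes of the struct into the pipe.
// Returns true only if the whole struct was written in one call.
bool send_ping(int fd, const Ping& msg) {
    return write(fd, &msg, sizeof msg) == static_cast<ssize_t>(sizeof msg);
}

// Receiver side: copy the raw bytes from the pipe into an identical struct.
// Returns true only if a whole struct arrived and its embedded size matches.
bool receive_ping(int fd, Ping& out) {
    return read(fd, &out, sizeof out) == static_cast<ssize_t>(sizeof out) &&
           out.size == sizeof out;
}
```

Nothing in these two functions knows whether the peer was written in C or C++. All that matters is that both sides were compiled with the same idea of what the bytes of `Ping` look like.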

The implicit assumption is that the data can be directly copied that way. This relies on the layout of data types being compatible in both programs.

The problem is that, since the IPC mechanism was designed with an assumption of compatible C compilers, you are now being told that this is because of advantages of C over C++ (or other languages). It's not. It is an artefact of how the IPC is being done.

The IPC approach is probably quite limited but it will be possible for your C++ code to send and receive data via the IPC mechanism, as long as you pack the data in appropriate types (e.g. standard-layout) in the C++ code. And it won't matter if the other process was written in C or C++. It might take more work in C++ (e.g. pack data from a C++ class into a standard-layout structure and squirt that structure to the other process - or the reverse if receiving data) but that is certainly possible.
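The packing step described above might look roughly like this. Everything here is an assumption for illustration: the `Measurement` class, the `MeasurementWire` struct, and the fixed limit of eight samples stand in for whatever the real messages contain.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical C++-side class: free to use containers and private members.
class Measurement {
public:
    Measurement(std::uint32_t id, const std::vector<std::int32_t>& samples)
        : id_(id), samples_(samples) {}
    std::uint32_t id() const { return id_; }
    const std::vector<std::int32_t>& samples() const { return samples_; }
private:
    std::uint32_t id_;
    std::vector<std::int32_t> samples_;
};

const std::size_t kMaxSamples = 8;  // assumed upper bound agreed by both sides

// Standard-layout struct that actually crosses the process boundary.
struct MeasurementWire {
    std::uint32_t size;          // sizeof(MeasurementWire), for a sanity check
    std::uint32_t id;
    std::uint32_t sample_count;  // number of valid entries in samples[]
    std::int32_t  samples[kMaxSamples];
};
static_assert(std::is_standard_layout<MeasurementWire>::value,
              "wire struct must stay C-compatible");

// Copies the data out of the C++ object into the fixed-size wire struct.
// Samples beyond kMaxSamples are dropped in this sketch.
MeasurementWire to_wire(const Measurement& m) {
    MeasurementWire w = {};  // zero-initialise, including unused samples
    w.size = static_cast<std::uint32_t>(sizeof w);
    w.id = m.id();
    const std::size_t n = std::min(m.samples().size(), kMaxSamples);
    w.sample_count = static_cast<std::uint32_t>(n);
    std::copy(m.samples().begin(),
              m.samples().begin() + static_cast<std::ptrdiff_t>(n),
              w.samples);
    return w;
}
```

The receiving side would do the reverse: read a `MeasurementWire`, check `size` and `sample_count`, and build whatever object it wants from the values. The rest of the C++ program never has to give up its containers or base classes.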

You will need to use compatible compilers, regardless.

And this assumes you can't change the means of interprocess communication (e.g. design a protocol for talking between processes, rather than blindly copying data from a memory location down the line to the other process, which then copies the data back into a compatible data structure). There are ways of doing IPC that would better support a range of programming languages, if needed, albeit with different trade-offs (e.g. bandwidth for communication, code to translate data so it can be sent, and code to receive data and turn it back into data structures).

Peter