Is it safe to use strings as private data members in a class used across a DLL boundary?

Question

My understanding is that exposing functions that take or return STL containers (such as std::string) across DLL boundaries can cause problems due to differences between the STL implementations of those containers in the two binaries. But is it safe to export a class like:

class Customer
{
public:
  wchar_t * getName() const;

private:
  wstring mName;
};

Without some sort of hack, mName is not going to be usable by the executable, so it won't be able to execute methods on mName, nor construct/destruct this object.

My gut feeling is "don't do this, it's unsafe", but I can't figure out a good reason.

I thought the point of standard library components was to be standard. Or is there an implicit "for a specific compiler" in there? — JAB, May 29 '13 at 18:29
How do you create instances of the class? If you use a factory (implemented in the dll), you should be fine. Otherwise, if you create them with `new` or on stack, and `wstring` happens to be of a different size, you will run into issues. — riv, May 29 '13 at 18:30
@JAB some things are standard, others aren't. If the STL defines a class `address` that has members `ip` and `port`, it may not specify that `ip` comes before `port` in memory, nor does it necessarily say there can't be other non-spec defined members (e.g., `ip_version`). So when passing a fully constructed object between binaries, one may have laid out the object as [ip][port] in memory, whereas the other may expect it to be [port][ip] in memory - they have no way to tell each other about the difference. @riv, good point about `new` (or `malloc`) needing to know the object size! — Rollie, May 29 '13 at 18:36
@JAB in short, there is not a standard C++ ABI. See http://stackoverflow.com/q/2083060/1639256 and http://stackoverflow.com/a/7492291/1639256. — Oktalist, May 29 '13 at 19:52
I see. Sounds like a wrapper that exposes a C-compatible interface for the desired methods would be the best choice, then (which I actually ended up doing last year when using llvmpy and discovering that certain very useful things in the underlying LLVM API were not implemented, at which point I ended up writing some wrappers to access with `ctypes` as I hadn't had any experience with SWIG [still don't have any, actually. should probably fix that at some point].) ...Actually, does SWIG support C++/C++ bindings for cases like this? — JAB, May 29 '13 at 20:14

Hans Passant · Accepted Answer · 2013-05-29T23:46:32.200

It is not a problem, because it is trumped by a bigger problem: you cannot create an object of that class in code that lives in a module other than the one that contains the code for the class. Code in another module cannot accurately know the required object size, because its implementation of the std::string class may well be different, which, as declared, also affects the size of the Customer object. Even the same compiler cannot guarantee a match, for example when optimized and debugging builds of these modules are mixed, although that is usually pretty easy to avoid.

So you must create a class factory for Customer objects, a factory that lives in that same module. Which then automatically implies that any code that touches the "mName" member also lives in the same module. And is therefore safe.

The next step then is to not expose Customer at all but to expose a pure abstract base class (aka interface). Now you can prevent the client code from creating an instance of Customer and shooting their leg off. And you'll trivially hide the std::string as well. Interface-based programming techniques are common in module interop scenarios. It is also the approach taken by COM.
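
A minimal sketch of that interface-plus-factory layout, applied to the Customer class from the question. The names `ICustomer`, `CreateCustomer` and `DestroyCustomer` are hypothetical (they do not appear in the question), and the `__declspec(dllexport)` decoration a real DLL would need is left out so the sketch compiles as a single file. Everything under "DLL side" is assumed to be compiled into the DLL only.

```cpp
#include <cassert>
#include <string>

// ---- Shared header: all the client ever sees ----
class ICustomer {
public:
    virtual const wchar_t* getName() const = 0;
protected:
    virtual ~ICustomer() {}  // protected: clients cannot delete directly
};

// Hypothetical factory functions, implemented (and in a real build exported) by the DLL.
extern "C" ICustomer* CreateCustomer(const wchar_t* name);
extern "C" void DestroyCustomer(ICustomer* customer);

// ---- DLL side: the std::wstring never leaves this module ----
namespace {
class Customer final : public ICustomer {
public:
    explicit Customer(const wchar_t* name) : mName(name ? name : L"") {}
    const wchar_t* getName() const override { return mName.c_str(); }
private:
    std::wstring mName;  // size and layout are known only to the DLL
};
}  // namespace

extern "C" ICustomer* CreateCustomer(const wchar_t* name) {
    return new Customer(name);
}

extern "C" void DestroyCustomer(ICustomer* customer) {
    // Deleted by the same module that allocated it.
    delete static_cast<Customer*>(customer);
}

// ---- Client side ----
std::wstring demo_customer_name() {
    ICustomer* c = CreateCustomer(L"Ada");
    assert(c != nullptr);
    // The copy is built with the client's own std::wstring implementation.
    std::wstring copy = c->getName();
    DestroyCustomer(c);
    return copy;
}
```

The client only ever holds a pointer and calls virtual functions, so it never needs `sizeof(Customer)` or the layout of `std::wstring`. This still assumes both sides agree on vtable layout and calling convention.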

score 1 · Answer 2 · answered May 29 '13 at 18:39

As long as the code that allocates instances of the class and the code that deallocates them are built with the same settings, you should be OK, but you are right to avoid this.
Differences between the .exe and .dll in debug/release configuration or code generation (Multi-threaded DLL vs. Single-threaded) could cause problems in some scenarios.
I would recommend using abstract classes in the DLL interface with creation and deletion done solely inside the DLL.
Interfaces like:

class A {
protected:
  virtual ~A() {}
public:
  virtual void func() = 0;
};

//exported create/delete functions
A* create_A();
void destroy_A(A*);

DLL Implementation like:

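class A_Impl : public A {
public:
  ~A_Impl() {}
  void func() { do_something(); }  // do_something() stands for the real work
};

A* create_A() { return new A_Impl; }
void destroy_A(A* a) {
  // ~A() is protected, so delete through the concrete type inside the DLL
  A_Impl* ai = static_cast<A_Impl*>(a);
  delete ai;
}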

Should be ok.
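
On the client side, pairing `create_A` with `destroy_A` by hand is easy to get wrong. One possible way to tie them together is a `std::unique_ptr` that uses `destroy_A` as its deleter. The sketch below repeats the interface from this answer in condensed form so that it compiles on its own; the `APtr` alias, the `make_A` helper and the `live_objects` counter are illustrative additions, not part of the answer.

```cpp
#include <cassert>
#include <memory>

class A {
protected:
    virtual ~A() {}
public:
    virtual void func() = 0;
};

// Stand-ins for the functions the DLL would export.
A* create_A();
void destroy_A(A*);

// Hypothetical client-side helper: the deleter routes destruction back into the DLL.
using APtr = std::unique_ptr<A, void (*)(A*)>;
inline APtr make_A() { return APtr(create_A(), &destroy_A); }

// ---- would live inside the DLL ----
static int live_objects = 0;  // only here so the demo can observe cleanup

class A_Impl : public A {
public:
    A_Impl() { ++live_objects; }
    ~A_Impl() { --live_objects; }
    void func() override {}
};

A* create_A() { return new A_Impl; }
void destroy_A(A* a) { delete static_cast<A_Impl*>(a); }

// ---- client code ----
int demo_scoped_use() {
    {
        APtr a = make_A();
        a->func();
        assert(live_objects == 1);
    }  // destroy_A runs here, even if func() had thrown
    return live_objects;
}
```

Because the deleter is a plain function pointer, the client never needs access to the protected destructor of `A`.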

Photon

Are you sure multi-threading/DLL/etc issues are a problem? I don't see how, as all code related to the container object is guaranteed to be executed using the DLL's runtime version, through some interface (i.e., `getName()`) – Rollie May 29 '13 at 18:45
Suppose that you have an inline `set_value(const std::string& v) { m_value = v; }`. What happens is that the internal field gets a value allocated by the executable, and not the DLL. It's just good practice to avoid such problems altogether rather than waste time debugging strange effects caused by certain compilation settings or library issues. – Photon May 29 '13 at 18:48
There's no need for the cast within `destroy_A()`. The destructor of `A` is virtual, so `~A_Impl()` will be invoked if you `delete a;`. – Praetorian May 29 '13 at 18:50
The ~A is protected, to disallow the client to delete, so the destroy function is also limited in this sense. – Photon May 29 '13 at 18:51
@Photon I missed the part that `destroy_A()` is a free function. – Praetorian May 29 '13 at 18:52
@Photon that is only if you explicitly define a particular function as inline in the header, correct? – Rollie May 29 '13 at 18:54
The last time this issue bothered me a few years ago, I had a DLL with the following prototype for exported functions: std::string func(const std::string&). The returned value was allocated by the DLL and deallocated by the caller, causing crashes. I'm not saying you can't get away with what you're doing, I'm just saying that if you miss something, it bites you in the behind later. – Photon May 29 '13 at 18:58
The problem with this solution is that the structure of vtables is not guaranteed by the standard. So a client accessing your virtual methods may be looking for the vtable in a different place from where it was constructed by the DLL. This may be something you can get away with most of the time, but it's not truly safe. – James Holderness May 29 '13 at 19:03
That's good to know. Do you have an alternative that always works, other than using an old C interface? – Photon May 29 '13 at 19:14
This answer, despite having upvotes, is incorrect. Different compilers have different ABIs. They usually align for C style interfaces, but seldom for C++ classes. – David Heffernan May 29 '13 at 19:21
@DavidHeffernan Isn't ABI compatibility for DLLs (-> Windows) using only ABC (with some restrictions) interfaces almost as "guaranteed" as C-style interfaces due to COM support? – dyp May 29 '13 at 20:31
@dyp Yes, that's a fair statement – David Heffernan May 29 '13 at 20:37

score 1 · Answer 3 · answered May 29 '13 at 19:05

Even if your class has no data members, you cannot expect it to be usable from code compiled with a different compiler. There is no common ABI for C++ classes. You can expect differences in name mangling just for starters.

If you are prepared to constrain clients to use the same compiler as you, or provide source to allow clients to compile your code with their compiler, then you can do pretty much anything across your interface. Otherwise you should stick to C style interfaces.
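
As a rough illustration of what a "C style interface" could look like for the Customer class in the question, here is a sketch using an opaque handle. The names `CustomerHandle`, `customer_create`, `customer_get_name` and `customer_destroy` are made up for this example, and export decorations are omitted. The client sees only the declarations in the `extern "C"` block.

```cpp
#include <cassert>
#include <cwchar>
#include <string>

// ---- Public header: plain C declarations, no C++ types ----
extern "C" {
typedef struct CustomerHandle CustomerHandle;  // opaque to clients
CustomerHandle* customer_create(const wchar_t* name);
const wchar_t* customer_get_name(const CustomerHandle* h);
void customer_destroy(CustomerHandle* h);
}

// ---- DLL side: free to use any C++ it likes ----
struct CustomerHandle {
    std::wstring name;
};

extern "C" CustomerHandle* customer_create(const wchar_t* name) {
    try {
        return new CustomerHandle{name ? name : L""};
    } catch (...) {
        return nullptr;  // exceptions must not cross a C boundary
    }
}

extern "C" const wchar_t* customer_get_name(const CustomerHandle* h) {
    // The pointer stays valid until customer_destroy(h) is called.
    return h ? h->name.c_str() : L"";
}

extern "C" void customer_destroy(CustomerHandle* h) {
    delete h;  // freed by the module that allocated it
}

// ---- Client side ----
bool demo_c_api() {
    CustomerHandle* h = customer_create(L"Ada");
    if (h == nullptr) return false;
    const bool ok = std::wcscmp(customer_get_name(h), L"Ada") == 0;
    customer_destroy(h);
    return ok;
}
```

Only pointers and built-in types cross the boundary here, so name mangling and class layout do not matter. The calling convention and the size of `wchar_t` still have to match on both sides.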

Emilio Garavaglia · Answer 4 · 2013-05-29T19:42:35.403

There are also two potential bugs (among others) that you must take care of, since they relate to what is "under" the language.

The first is that std::string is a template, and hence it is instantiated in every translation unit. If they are all linked into the same module (exe or dll), the linker will resolve identical functions to the same code, and it at least has a chance to flag inconsistent code (the same function with a different body) as an error.
But if they are linked into different modules (an exe and a dll), there is nothing (compiler or linker) in common. So, depending on how the modules were compiled, you may have different implementations of the same class with different members and memory layout (for example, one may have debugging or profiling features that the other lacks). Accessing an object created on one side with methods compiled on the other side, if you have no other way to guarantee implementation consistency, may end in tears.

The second problem (more subtle) relates to allocation and deallocation of memory: because of the way Windows works, every module can have a distinct heap. But standard C++ does not specify how new and delete keep track of which heap an object comes from. If the string buffer is allocated in one module and then moved to a string instance in another module, you risk (upon destruction) giving the memory back to the wrong heap. Whether that happens depends on how new/delete and malloc/free are implemented with respect to HeapAlloc/HeapFree, that is, on how "aware" the STL implementation is of the underlying OS. The operation is not itself destructive (it just fails), but it leaks memory from the originating heap.
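
One common way to sidestep the heap question is to never hand ownership of a buffer across the boundary at all: the caller allocates, and the DLL only copies into the caller's memory. A sketch of that two-call pattern follows; the function `get_customer_name` and the hard-coded name are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// ---- DLL side (hypothetical export) ----
// Returns the number of wchar_t needed, including the terminator.
// Copies the name only if `buffer` is non-null and large enough.
extern "C" std::size_t get_customer_name(wchar_t* buffer, std::size_t capacity) {
    static const std::wstring name = L"Ada Lovelace";  // stands in for mName
    const std::size_t needed = name.size() + 1;
    if (buffer != nullptr && capacity >= needed) {
        std::wmemcpy(buffer, name.c_str(), needed);  // includes the L'\0'
    }
    return needed;
}

// ---- Client side ----
std::wstring fetch_customer_name() {
    // First call: ask how much space is required.
    std::vector<wchar_t> buf(get_customer_name(nullptr, 0));
    // Second call: let the DLL fill memory the client owns.
    get_customer_name(buf.data(), buf.size());
    return std::wstring(buf.data());
}
```

Every allocation in this sketch is made and released by the same module, so it does not matter whether the exe and the dll share a heap.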

All that said, it is not impossible to pass a container. It is just up to you to guarantee a consistent implementation on both sides, since the compiler and linker have no way to cross-check.

You've not exhausted the issues. Mismatched ABIs is a killer. — David Heffernan, May 29 '13 at 19:11
Of course, but there were already other answers talking about it. — Emilio Garavaglia, May 29 '13 at 19:36

score 0 · Answer 5 · answered May 29 '13 at 19:11

If you want to provide an object oriented interface in a DLL that is truly safe, I would suggest building it on top of the COM object model. That's what it was designed for.

Any other attempt to share classes between code that is compiled by different compilers has the potential to fail. You may be able to get something that seems to work most of the time, but it can't be guaranteed to work.

The chances are that at some point you're going to be relying on undefined behaviour in terms of calling conventions or class structure or memory allocation.
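
To give a feel for the lifetime rule the COM object model relies on, here is a portable imitation, not real COM. Real COM interfaces derive from `IUnknown`, which also has `QueryInterface`, and they use `HRESULT` return codes and a fixed calling convention. The names `IObject`, `ICustomerObject` and `CreateCustomerObject` are invented for this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <string>

// Base interface: lifetime is controlled only through AddRef/Release.
struct IObject {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IObject() {}
};

struct ICustomerObject : IObject {
    virtual const wchar_t* GetName() const = 0;
};

// Hypothetical factory a DLL would export; the result starts with one reference.
extern "C" ICustomerObject* CreateCustomerObject(const wchar_t* name);

// ---- DLL side ----
namespace {
class CustomerObject final : public ICustomerObject {
public:
    explicit CustomerObject(const wchar_t* name) : name_(name ? name : L"") {}
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;  // freed inside the DLL that allocated it
        return remaining;
    }
    const wchar_t* GetName() const override { return name_.c_str(); }
private:
    std::atomic<unsigned long> refs_{1};
    std::wstring name_;
};
}  // namespace

extern "C" ICustomerObject* CreateCustomerObject(const wchar_t* name) {
    return new CustomerObject(name);
}

// ---- Client side ----
unsigned long demo_refcount() {
    ICustomerObject* c = CreateCustomerObject(L"Ada");  // count is 1
    assert(c != nullptr);
    c->AddRef();          // count is 2
    c->Release();         // count is 1
    return c->Release();  // count is 0, the object has deleted itself
}
```

The client never calls `delete`, so the object is always destroyed by the module that created it.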

score 0 · Answer 6 · edited May 23 '17 at 11:50

The C++ standard does not say anything about the ABI provided by implementations. Even on a single platform changing the compiler options may change binary layout or function interfaces.

Thus, to ensure that standard types can be used across DLL boundaries, it is your responsibility to ensure that:

Resource acquisition and release for standard types are done by the same DLL. (Note: you can have multiple CRTs in a process, but a resource acquired by crt1.DLL must be released by crt1.DLL.)

This is not specific to C++. In C, for example, each malloc/free or fopen/fclose call pair must go to a single C runtime.

This can be done by either of the below:

By explicitly exporting acquisition/release functions (Photon's answer). In this case you are forced to use a factory pattern and abstract types. Basically COM or a COM clone.
By forcing a group of DLLs to link against the same dynamic CRT. In this case you can safely export any kind of functions/classes.
