1

I'm creating a library and I want to both hide and protect (make unwritable) a structure type (can only edit using library functions). I'm using:

typedef _STRINGLIB_STRING_CONST_ struct
{
    #if defined(_STRINGLIB_OVERLOADING_WAIT_) || defined(_STRINGLIB_STRING_ACCESSIBLE_)
    //actual string
    char *getString;
    //string size in bytes
    size_t getSize;
    //string size in characters (=size in bytes-1)
    size_t getSizeChars;
    //should both point to _STRINGLIB_ALLOCATION_TRUE_ when string is allocated
    void *stringAllocation;
    void *stringSignature;
    #else
    struct
    {
        size_t _block_a;
        void *_block_b;
        void *_block_c;
        void *_block_d;
        size_t _block_e;
    }_block;
    #endif // _STRINGLIB_OVERLOADING_WAIT_
}String;

where _STRINGLIB_STRING_CONST_ is always const unless the header is called from source file or the user declared _STRINGLIB_STRING_ACCESSIBLE_ before including the library; _STRINGLIB_OVERLOADING_WAIT_ is declared by the source file (and undeclared right after).

It looks like it works fine (i'm using GCC 4.9.2) but I want to be sure whether this is always OK or undefined behaviour.

.

.

ADDITIONAL INFORMATION: forgot to mention i have these definitions on top of the stringlib.h file:

#ifdef _STRINGLIB_OVERLOADING_WAIT_
#undef _STRINGLIB_STRING_CONST_
#define _STRINGLIB_STRING_CONST_
#elif !defined(_STRINGLIB_STRING_ACCESSIBLE_)
#undef _STRINGLIB_STRING_CONST_
#define _STRINGLIB_STRING_CONST_ const
#else
#undef _STRINGLIB_STRING_CONST_
#define _STRINGLIB_STRING_CONST_
#endif // _STRINGLIB_OVERLOADING_WAIT_

#ifndef _STRINGLIB_H_
#define _STRINGLIB_H_
//<...>
Alex Sim
  • 403
  • 3
  • 16
  • 1
    Struct alignment could be different because you're not ordering your structure fields the same way, so size of the structure could be seen as different, which is bad. – Jean-François Fabre Mar 11 '17 at 13:11
  • 2
    What you want is an "opaque structure". See http://stackoverflow.com/questions/3965279/opaque-c-structs-how-should-they-be-declared for some ideas on how to do that. – Andrew Henle Mar 11 '17 at 13:16
  • Note that you should be cautious about using names starting with an underscore and capital letter. They're reserved unconditionally for the implementation to do with as they please — your code could break unless it's part of the implementation. If it breaks, you have no comeback on anyone. You can't safely/reliably port the code to anywhere where you're not a part of the implementation. Avoid leading underscores in general. There a numerous questions where the details are spelled out (e.g. [What does double underscore (`__const`) mean in C](http://stackoverflow.com/questions/1449181/)). – Jonathan Leffler Mar 12 '17 at 02:54
  • @JonathanLeffler: I wish the authors of the C Standard would reserve a form of identifiers that would assign ownership of certain names to the creators of various implementations; names might not be "glamorous", but someone who wants to support a new feature on his C compiler got all names starting with _STDCX493167_ and added an intrinsic to e.g. set an lvalue to an Unspecified value, people could write code which would use that intrinsic when present while still being compatible with compilers where it was absent. – supercat Mar 16 '17 at 22:05

3 Answers3

2

It is not undefined behaviour: it is perfectly well defined what happens. However it is still not a very good idea because the definition of the struct depends on what plaform you used and how you compiled it. This implicit dependency is carried through in all places that use or reference the struct type. This is more or less the ABI of the struct.

This means that if during compilation #1 defined(_STRINGLIB_OVERLOADING_WAIT_) is false, but during subsequent compilation #2 it is true the resulting code of both compilations may not be able to work with each other correctly. The ABIs of #1 and #2 differ, so they are not compatible: mixing them is what leads to undefined behaviour.

So the answer is: this is not undefined behaviour, until you do other things (which you may not even be aware of doing) that mix ABIs. At that point you probably have undefined behaviour lurking in your code.

Your compiler and linker will not necessarily detect this (especially not with pointers to structs) which may lead to mysterious and hard to track down crashes when code written for ABI #1 does things that don't work for ABI #2 or vice versa. (Invalid read/writes being the most obvious, but also things like sizeof() may not work correctly.)

user268396
  • 11,576
  • 2
  • 31
  • 26
  • the _STRINGLIB_OVERLOADING_WAIT_ macro is not to be defined by the user, only by the library source file. Even so, both structures have the same size, only mixed up types (_block variables are not to be read both by the user and by the library) – Alex Sim Mar 11 '17 at 13:19
  • @AlexSim but unfortunately it is not as simple. The fact that you have a `define` with branches based on that means it can be defined *differently* depending on *something*. And at that point you are (implicitly) back where you started: at square 1 with potentially incompatible ABIs. That is why it is not such a good idea in general: the code is written in a way that is susceptible to (subtle) breakage that the user cannot easily anticipate or detect. – user268396 Mar 11 '17 at 13:22
  • what do you mean by "defined differently depending on something"? how could the macro possibly change without the user changing it on purpose (in which case, by the way, it's totally his fault) – Alex Sim Mar 11 '17 at 13:27
  • BTW, the stringlib.c file defines _STRINGLIB_OVERLOADING_WAIT_ macro before ingluding stringlib.h and undefines it right after – Alex Sim Mar 11 '17 at 13:32
  • One trick that can sometimes be helpful is to have a header define macros which map documented API names to other names that depend upon the compilation flags in effect when the header is processed. In some cases, this may allow multiple versions of the API to be used within the same project transparently (each calling functions that are suitable for its purpose). In other cases where that won't work, it may cause linker errors which, while not as nice as a working program, are more helpful than a program which links but doesn't actually work. – supercat Mar 16 '17 at 21:28
2

As others have mentioned, having a set of #define's that change the definition of the struct can lead to undefined behavior. Just because the user shouldn't change a particular set of defines doesn't mean they can't. You won't want to give them that flexibility.

The best way to handle this, as mentioned in the comments is with an opaque type.

In your header file you would declare the struct but not define it:

typedef struct String String;

The user would be unable to even create an variable of this type. However, they can create a pointer to this type which they can't dereference (since there's no definition).

Your library source would contain the actual definition of the struct:

struct String {
    char *getString;
    size_t getSize;
    size_t getSizeChars;
    void *stringAllocation;
    void *stringSignature;
};

The functions in the library would accept a String * which could then operate on it. Included in this library would be a creation function which returns a String * to the user and a destroy function which accepts a String * that would deallocate it.

dbush
  • 205,898
  • 23
  • 218
  • 273
  • that would not work if i wanted, for example, to insert my string type inside another struct, as it doesn't have a specified size; that said the user shouldn't be able to see and edit the structure BY DEFAULT, but can enable it with a macro – Alex Sim Mar 11 '17 at 14:15
0

As far as I can see, it is undefined behaviour.

The client code, presumably, declares Strings:

 String s;

These are declared to be const. You then (again, presumably) pass the address of that object to a library function;

string_set(&s, "abc");

And the library function's implementation treats the received pointer as though it were mutable.

Mutating an object which was originally declared const is UB. Although with GCC, you may only experience the problem with file-scoped Strings, it is still UB even for Strings with automatic allocation. (The constness of the struct might be taken into account for optimisation.)

Also, your two alternatives disagree in the order of the types, so they are not compatible. And char* is not compatible with void*. So those are also problematic, although with GCC you won't see any problem. (It would be a problem if a pointer were eight bytes with an eight-byte alignment, and size_t were only four bytes with a four-byte alignment. That would make the size of the "fake" object 40 bytes, including two four-byte paddings, while the "real" object, which has the size_ts together, would fit into 32 paddingless bytes. Even without that justification, they are incompatible structs and that's UB.)

rici
  • 234,347
  • 28
  • 237
  • 341
  • the alternatives disagree on purpose, the second one is made to have the exact same size of the first one, not to be readable nor writeable; as for the UB topic, which compiler would you recommend to test this out? – Alex Sim Mar 11 '17 at 14:08
  • @alex: I don't believe you can test for UB so I can't really answer that question. I honestly don't see the point of shuffling the types; encapsulation is a programming technique, nothing more. But even if you fervently believe that the size and alignment of size_t and a pointer are ordained by some deity to be the same, the standard disagrees and the result is that your types are not consistent. If you are doing this as a hobby, that's up to you. If you have other plans, it will probably have to go through code review, if not automatic verification. I'm only answering the question. – rici Mar 11 '17 at 14:50
  • i did not say the size of size_t and pointers are necessarily the same, i said the size in bytes of both structures are the same (size of 2 size_t and 3 poiners). i don't need the second structure's type oder to coincide with the first as the second structure values are not intended to be accessed directly – Alex Sim Mar 11 '17 at 15:52