61

I've seen both of the following two styles of declaring opaque types in C APIs. What are the various ways to declare opaque structs/pointers in C? Is there any clear advantage to using one style over the other?

Option 1

// foo.h
typedef struct foo * fooRef;
void doStuff(fooRef f);

// foo.c
struct foo {
    int x;
    int y;
};

Option 2

// foo.h
typedef struct _foo foo;
void doStuff(foo *f);

// foo.c
struct _foo {
    int x;
    int y;
};
Gabriel Staples
splicer
  • 4
    See also [Is it a good idea to typedef pointers?](http://stackoverflow.com/questions/750178/is-it-a-good-idea-to-typedef-pointers) – Jonathan Leffler Jun 05 '16 at 22:26
  • 1
    Also note that names starting with an underscore are not a good idea in user code (as opposed to system code — the implementation). §7.1.3 "Reserved identifiers" of the standard: _• All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use. • All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces._ – Jonathan Leffler Jun 05 '16 at 22:29
  • [Opaque type example](http://c-faq.com/struct/sd1.html) – mihai Nov 15 '16 at 07:49
  • (A little late to the party, I know, but) I just proposed a full example as `Option 1.5`, here: https://stackoverflow.com/a/54488289/4561887. – Gabriel Staples Feb 01 '19 at 23:09
  • 1
    Voting to re-open this question. Requesting various ways to declare and use opaque pointers to structs is not opinion-based. Rather, it simply shows various methods and techniques allowed by the language. – Gabriel Staples May 06 '21 at 21:33
  • Gabriel, I agree. It's ridiculous that this was closed! – splicer May 08 '21 at 03:34

4 Answers

103

My vote is for the third option that mouviciel posted then deleted:

I have seen a third way:

// foo.h
struct foo;
void doStuff(struct foo *f);

// foo.c
struct foo {
    int x;
    int y;
};

If you really can't stand typing the `struct` keyword, `typedef struct foo foo;` is acceptable (note: get rid of the useless and problematic underscore, since identifiers beginning with an underscore are reserved at file scope). But whatever you do, never use `typedef` to define names for pointer types. It hides the extremely important piece of information that variables of this type reference an object which could be modified whenever you pass them to functions, and it makes dealing with differently-qualified (for instance, `const`-qualified) versions of the pointer a major pain.
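
As a rough illustration of the point about qualifiers, here is a minimal sketch of this style in which the pointer stays visible in every prototype, so each function can state whether it modifies the object. The functions `foo_create`, `foo_destroy`, `foo_sum`, and `foo_scale` are hypothetical names invented for this sketch, and both "files" are shown in one listing for brevity.

```c
/* foo.h (public interface): callers see only an incomplete type. */
struct foo;

struct foo *foo_create(int x, int y);  /* hypothetical constructor */
void foo_destroy(struct foo *f);       /* hypothetical destructor */
int foo_sum(const struct foo *f);      /* read-only: the pointee is const */
void foo_scale(struct foo *f, int k);  /* mutator: the pointee is non-const */

/* foo.c (private implementation): the only place the layout is known. */
#include <stdlib.h>

struct foo {
    int x;
    int y;
};

struct foo *foo_create(int x, int y)
{
    struct foo *f = malloc(sizeof *f);
    if (f) {
        f->x = x;
        f->y = y;
    }
    return f; /* NULL if the allocation failed */
}

void foo_destroy(struct foo *f)
{
    free(f); /* free(NULL) is a no-op */
}

int foo_sum(const struct foo *f)
{
    return f->x + f->y;
}

void foo_scale(struct foo *f, int k)
{
    f->x *= k;
    f->y *= k;
}
```

Because the allocation function returns a non-`const` `struct foo *` and only the read-only functions add `const`, no cast is needed anywhere, including in `foo_destroy`.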

R.. GitHub STOP HELPING ICE
  • 7
    'Never' is rather strong here: the whole point of opaque types is to hide the implementation from users of your api, making changes to the former independent of the latter, and providing a measure of safety by restricting direct modifications by the user; I see nothing wrong with aliasing pointer types or hiding qualifiers in such cases (ie if they are implementation details) – Christoph Oct 19 '10 at 08:08
  • 2
    @Christoph: One problem with encapsulating e.g. `const` into the typedef is if you have a situation where you want to express that some API functions modify the object, and others don't. You could have `typedef struct foo *foo; typedef const struct foo *const_foo;`, but that's grim! – Oliver Charlesworth Oct 19 '10 at 10:42
  • 35
    Whether a type is a pointer or not is **not an implementation detail**. It's fundamental to the semantics of any operation in which you might use the type. This is one 'never' I stand by completely. – R.. GitHub STOP HELPING ICE Oct 19 '10 at 17:11
  • @R..: think about mutable vs immutable string implementations or non-compacting vs compacting collectors; the former is an example where `const` is an implementation detail, the latter where you need an additional level of indirection in the compacting case; the api may very well be agnostic to these details, but only if opaque types are used to hide qualifiers/pointers – Christoph Oct 20 '10 at 06:45
  • 5
    A type with a builtin `const` qualifier is **not valid** for immutable strings (or any allocated object) because your implementation of the object cannot `free` a `const`-qualified pointer (`free` takes a non-`const`-qualified `void *`, for good reason). This is not a technicality but a matter of violating the semantics of `const`. Sure you can cast the `const` away in your `immutable_string_free` function, but now we're getting into the territory of dirty hacks. **Any** opaque object allocation function should always return `footype *`, and the function to free should take `footype *`. – R.. GitHub STOP HELPING ICE Oct 20 '10 at 14:51
  • @R..: I agree that casting away `const` is generally not advisable and is - as far as I know - actually not mentioned in the C standard as a possible conversion for pointer types (you can only go from non-qualified to qualified); however, allocated objects which are never modified during their lifetime after initialization are common enough, and I don't really see the benefit of keeping a non-qualified version of the pointer around for the sole purpose of freeing the object; casting to non-`const` may be a hack, but I don't know of any other feasible workaround – Christoph Oct 20 '10 at 16:39
  • @Christoph: The solution is not to use a workaround but for the "owner" of the object to store the correct, non-constant pointer type. If the object is immutable, the only place the non-`const`-qualified version of the pointer should be used is in the return value of the allocation function(s) and the argument of the deallocation function(s). All other functions should take `const`-qualified pointers. This is one reason why making `const` (and the pointer) a built-in part of the type definition is a bad idea. – R.. GitHub STOP HELPING ICE Oct 20 '10 at 19:00
  • @R..: see http://stackoverflow.com/questions/3999966/const-correctness-and-immutable-allocated-objects – Christoph Oct 22 '10 at 18:50
  • 16
    @R: Whether a type is a pointer or not **absolutely is an implementation detail**. Yes, being a pointer gives it certain semantics, but **those semantics are not peculiar to pointers**. If I expose a handle type from my library, and tell you that it persistently identifies a gadget, you do not, should not, and **must not care if it is a pointer** or an index into a private global array (or linked-list, to allow growth) inside my library, or magic. The only thing that matters is that it is properly documented as being an identifier for a persistent object. – Ben Voigt Dec 05 '10 at 01:08
  • @Ben Voigt: You seem to have missed R..'s point, which is that the constness of either the pointer or the object pointed to is significant. Your example doesn't change this. Const pointer to magic and const magic DO NOT have the same semantics in the caller's context and cannot be left up to the implementer's whim, because the implementer's problem domain is his code, not mine. To sum up: the implementer's injecting hidden consts into my code is an implementation failure. – Eric Towers Dec 10 '10 at 03:42
  • 5
    @Eric: Top-level `const` gets removed from the actual parameter, so neither "const pointer to magic" nor "const magic" restrict the library in any way whatsoever. And whether it's a "pointer to const magic" or a "pointer to non-const magic" is an implementation detail... it's not important to the caller's code in the least, because he's not supposed to touch the magic, not even supposed to dereference the pointer which is a necessary first step in touching the magic. – Ben Voigt Dec 10 '10 at 04:03
  • @EricTowers: While "pointer that will not be used directly or indirectly to change the object", "pointer to object that will never change within its lifetime", and "pointer to object that will exist forever without change" might be useful qualifiers for optimization, no such concepts exist for pointers that are not also "restrict" qualified. Otherwise, compilers are required to presume that if they can't see everything that is done with a pointer, the pointers might be used to modify the objects identified thereby. – supercat May 19 '17 at 16:20
  • 1
    It used to be, I'm not sure if it is the case anymore, that it was a compiler error to typedef an enum/struct/union identifier to itself. Borland or Microsoft IIRC. Microsoft IIRC required `_foo` even when declaring `typedef struct _foo {} foo;` in C. –  Jan 24 '19 at 07:06
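
To make the "handle does not have to be a pointer" argument from the comments concrete, here is a minimal sketch in which the handle is an index into a private array inside the library. Everything here (`gadget_handle`, `GADGET_INVALID`, `GADGET_MAX`, and the `gadget_*` functions) is a hypothetical name invented for illustration, and both "files" are shown in one listing.

```c
/* gadget.h (public): the handle is a plain value; callers never dereference it. */
typedef int gadget_handle; /* hypothetical handle type */
#define GADGET_INVALID (-1)

gadget_handle gadget_open(int initial);
int gadget_get(gadget_handle h, int *out); /* returns 0 on success, -1 on error */
int gadget_set(gadget_handle h, int value); /* returns 0 on success, -1 on error */
void gadget_close(gadget_handle h);

/* gadget.c (private): storage is a fixed-size array the caller cannot see. */
#define GADGET_MAX 4

struct gadget {
    int in_use;
    int value;
};

static struct gadget gadgets[GADGET_MAX];

/* Translate a handle into a slot, rejecting out-of-range or closed handles. */
static struct gadget *lookup(gadget_handle h)
{
    if (h < 0 || h >= GADGET_MAX || !gadgets[h].in_use)
        return 0;
    return &gadgets[h];
}

gadget_handle gadget_open(int initial)
{
    for (int i = 0; i < GADGET_MAX; i++) {
        if (!gadgets[i].in_use) {
            gadgets[i].in_use = 1;
            gadgets[i].value = initial;
            return i;
        }
    }
    return GADGET_INVALID; /* every slot is taken */
}

int gadget_get(gadget_handle h, int *out)
{
    const struct gadget *g = lookup(h);
    if (!g || !out)
        return -1;
    *out = g->value;
    return 0;
}

int gadget_set(gadget_handle h, int value)
{
    struct gadget *g = lookup(h);
    if (!g)
        return -1;
    g->value = value;
    return 0;
}

void gadget_close(gadget_handle h)
{
    struct gadget *g = lookup(h);
    if (g)
        g->in_use = 0;
}
```

The trade-off in this sketch is that a bare `int` handle gives up the type checking that a pointer to an incomplete struct provides, and there is no way to express "read-only access" in the parameter type at all, which is the other side of the disagreement above.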
7

Option 1.5 ("Object-based" C Architecture):

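I am accustomed to using Option 1, except that I name the reference with a `_h` suffix to signify that it is a "handle" to a C-style "object" of this given C "class". In the function prototypes, I put `const` on the handle wherever the object is meant to be an input only, and leave it off wherever the object may be changed. Note that, because the typedef hides the pointer, `const my_module_h` means `struct my_module_s * const`: it makes the handle itself const, not the object it points to, so it documents intent but the compiler does not enforce it. If you want the compiler to reject writes to the object, declare the parameter as `const struct my_module_s *` instead. So, do this style: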

// -------------
// my_module.h
// -------------

// An opaque pointer (handle) to a C-style "object" of "class" type 
// "my_module" (struct my_module_s *, or my_module_h):
typedef struct my_module_s *my_module_h;

void doStuff1(my_module_h my_module);
void doStuff2(const my_module_h my_module);

// -------------
// my_module.c
// -------------

// Definition of the opaque struct "object" of C-style "class" "my_module".
struct my_module_s
{
    int int1;
    int int2;
    float f1;
    // etc. etc--add more "private" member variables as you see fit
};

Here's a full example using opaque pointers in C to create objects. The following architecture might be called "object-based C":

//==============================================================================================
// my_module.h
//==============================================================================================

// An opaque pointer (handle) to a C-style "object" of "class" type "my_module" (struct
// my_module_s *, or my_module_h):
typedef struct my_module_s *my_module_h;

// Create a new "object" of "class" "my_module": A function that takes a *pointer to* an
// "object" handle, `malloc`s memory for a new copy of the opaque  `struct my_module_s`, then
// points the user's input handle (via its passed-in pointer) to this newly-created  "object" of
// "class" "my_module".
void my_module_open(my_module_h * my_module_h_p);

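// A function that is intended to use this "object" (via its handle) as an input only. NB:
// `const my_module_h` makes only the handle const (`struct my_module_s * const`); use
// `const struct my_module_s *` if the compiler should reject writes to the object.
void my_module_do_stuff1(const my_module_h my_module);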

// A function that can modify the private content of this "object" (via its handle) (but still
// cannot modify the  handle itself)
void my_module_do_stuff2(my_module_h my_module);

// Destroy the passed-in "object" of "class" type "my_module": A function that can close this
// object by stopping all operations, as required, and `free`ing its memory.
void my_module_close(my_module_h my_module);

//==============================================================================================
// my_module.c
//==============================================================================================

#include "my_module.h"

#include <stdlib.h> // malloc, free
#include <string.h> // memset

// Definition of the opaque struct "object" of C-style "class" "my_module".
// - NB: Since this is an opaque struct (declared in the header but not defined until the source
// file), it has the  following 2 important properties:
// 1) It permits data hiding, wherein you end up with the equivalent of a C++ "class" with only
// *private* member  variables.
// 2) Objects of this "class" can only be dynamically allocated. No static allocation is
// possible since any module including the header file does not know the contents of *nor the
// size of* (this is the critical part) this "class" (ie: C struct).
struct my_module_s
{
    int my_private_int1;
    int my_private_int2;
    float my_private_float;
    // etc. etc--add more "private" member variables as you see fit
};

void my_module_open(my_module_h * my_module_h_p)
{
    // Ensure the passed-in pointer is not NULL (since it is a core dump/segmentation fault to
    // try to dereference  a NULL pointer)
    if (!my_module_h_p)
    {
        // Print some error or store some error code here, and return it at the end of the
        // function instead of returning void.
        goto done;
    }

    // Now allocate the actual memory for a new my_module C object from the heap, thereby
    // dynamically creating this C-style "object".
    my_module_h my_module; // Create a local object handle (pointer to a struct)
    // Dynamically allocate memory for the full contents of the struct "object"
    my_module = malloc(sizeof(*my_module)); 
    if (!my_module) 
    {
        // Malloc failed due to out-of-memory. Print some error or store some error code here,
        // and return it at the end of the function instead of returning void.   
        goto done;
    }

    // Initialize all memory to zero (OR just use `calloc()` instead of `malloc()` above!)
    memset(my_module, 0, sizeof(*my_module));

    // Now pass out this object to the user, and exit.
    *my_module_h_p = my_module;

done:
    return; // a label must be followed by a statement (required before C23)
}

void my_module_do_stuff1(const my_module_h my_module)
{
    // Ensure my_module is not a NULL pointer.
    if (!my_module)
    {
        goto done;
    }

    // Do stuff where you use my_module private "member" variables.
    // Ex: use `my_module->my_private_int1` here, or `my_module->my_private_float`, etc. 

done:
    return;
}

void my_module_do_stuff2(my_module_h my_module)
{
    // Ensure my_module is not a NULL pointer.
    if (!my_module)
    {
        goto done;
    }

    // Do stuff where you use AND UPDATE my_module private "member" variables.
    // Ex:
    my_module->my_private_int1 = 7;
    my_module->my_private_float = 3.14159;
    // Etc.

done:
    return;
}

void my_module_close(my_module_h my_module)
{
    // Ensure my_module is not a NULL pointer.
    if (!my_module)
    {
        goto done;
    }

    free(my_module);

done:
    return;
}

Simplified example usage:

#include "my_module.h"

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    printf("Hello World\n");

    bool exit_now = false;

    // setup/initialization
    my_module_h my_module = NULL;
    // For safety-critical and real-time embedded systems, it is **critical** that you ONLY call
    // the `_open()` functions during **initialization**, but NOT during normal run-time,
    // so that once the system is initialized and up-and-running, you can safely know that
    // no more dynamic-memory allocation, which is non-deterministic and can lead to crashes,
    // will occur.
    my_module_open(&my_module);
    // Ensure initialization was successful and `my_module` is no longer NULL.
    if (!my_module)
    {
        // await connection of debugger, or automatic system power reset by watchdog
        // (`log_errors_and_enter_infinite_loop()` is a placeholder you must provide yourself)
        log_errors_and_enter_infinite_loop();
    }

    // run the program in this infinite main loop
    while (exit_now == false)
    {
        my_module_do_stuff1(my_module);
        my_module_do_stuff2(my_module);
    }

    // program clean-up; will only be reached in this case in the event of a major system 
    // problem, which triggers the infinite main loop above to `break` or exit via the 
    // `exit_now` variable
    my_module_close(my_module);

    // for microcontrollers or other low-level embedded systems, we can never return,
    // so enter infinite loop instead
    while (true) {}; // await reset by watchdog

    return 0;
}

The only improvements beyond this would be to:

  1. Implement full error handling and return the error instead of void. Ex:

     /// @brief my_module error codes
     typedef enum my_module_error_e
     {
         /// No error
         MY_MODULE_ERROR_OK = 0,
    
         /// Invalid Arguments (ex: NULL pointer passed in where a valid pointer is required)
         MY_MODULE_ERROR_INVARG,
    
         /// Out of memory
         MY_MODULE_ERROR_NOMEM,
    
         /// etc. etc.
         MY_MODULE_ERROR_PROBLEM1,
     } my_module_error_t;
    

    Now, instead of returning a void type in all of the functions above and below, return a my_module_error_t error type instead!

  2. Add a configuration struct called my_module_config_t to the .h file, and pass it in to the open function to update internal variables when you create a new object. This helps encapsulate all configuration variables in a single struct for cleanliness when calling _open().

    Example:

     //--------------------
     // my_module.h
     //--------------------
    
     // my_module configuration struct
     typedef struct my_module_config_s
     {
         int my_config_param_int;
         float my_config_param_float;
     } my_module_config_t;
    
     my_module_error_t my_module_open(my_module_h * my_module_h_p, 
                                      const my_module_config_t *config);
    
     //--------------------
     // my_module.c
     //--------------------
    
     my_module_error_t my_module_open(my_module_h * my_module_h_p, 
                                      const my_module_config_t *config)
     {
         my_module_error_t err = MY_MODULE_ERROR_OK;
    
         // Ensure the passed-in pointers are not NULL (since it is a core dump/segmentation fault
         // to try to dereference a NULL pointer). `config` is dereferenced below, so check it too.
         if (!my_module_h_p || !config)
         {
             // Print some error or store some error code here, and return it at the end of the
             // function instead of returning void. Ex:
             err = MY_MODULE_ERROR_INVARG;
             goto done;
         }
    
         // Now allocate the actual memory for a new my_module C object from the heap, thereby
         // dynamically creating this C-style "object".
         my_module_h my_module; // Create a local object handle (pointer to a struct)
         // Dynamically allocate memory for the full contents of the struct "object"
         my_module = malloc(sizeof(*my_module)); 
         if (!my_module) 
         {
             // Malloc failed due to out-of-memory. Print some error or store some error code
             // here, and return it at the end of the function instead of returning void. Ex:
             err = MY_MODULE_ERROR_NOMEM;
             goto done;
         }
    
         // Initialize all memory to zero (OR just use `calloc()` instead of `malloc()` above!)
         memset(my_module, 0, sizeof(*my_module));
    
         // Now initialize the object with values per the config struct passed in. Set these
         // private variables inside `my_module` to whatever they need to be. You get the idea...
         my_module->my_private_int1 = config->my_config_param_int;
         my_module->my_private_int2 = config->my_config_param_int*3/2;
         my_module->my_private_float = config->my_config_param_float;        
         // etc etc
    
         // Now pass out this object handle to the user, and exit.
         *my_module_h_p = my_module;
    
     done:
         return err;
     }
    

    And usage:

     my_module_error_t err = MY_MODULE_ERROR_OK;
    
     my_module_h my_module = NULL;
     my_module_config_t my_module_config = 
     {
         .my_config_param_int = 7,
         .my_config_param_float = 13.1278,
     };
     err = my_module_open(&my_module, &my_module_config);
     if (err != MY_MODULE_ERROR_OK)
     {
         switch (err)
         {
         case MY_MODULE_ERROR_INVARG:
             printf("MY_MODULE_ERROR_INVARG\n");
             break;
         case MY_MODULE_ERROR_NOMEM:
             printf("MY_MODULE_ERROR_NOMEM\n");
             break;
         case MY_MODULE_ERROR_PROBLEM1:
             printf("MY_MODULE_ERROR_PROBLEM1\n");
             break;
         case MY_MODULE_ERROR_OK:
             // not reachable, but included so that when you compile with 
             // `-Wall -Wextra -Werror`, the compiler will fail to build if you forget to handle
             // any of the error codes in this switch statement.
             break;
         }
    
         // Do whatever else you need to in the event of an error, here. Ex:
         // await connection of debugger, or automatic system power reset by watchdog
         while (true) {}; 
     }
    
     // ...continue other module initialization, and enter main loop
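
The `switch` in the usage example above could also be moved into the module so that every caller gets the same messages. The following is only a sketch: `my_module_error_str` and the `"MY_MODULE_ERROR_UNKNOWN"` fallback string are hypothetical additions that are not part of the code above, and the enum is repeated so that the block compiles on its own.

```c
/* Error codes, repeated from the enum shown above. */
typedef enum my_module_error_e
{
    MY_MODULE_ERROR_OK = 0,
    MY_MODULE_ERROR_INVARG,
    MY_MODULE_ERROR_NOMEM,
    MY_MODULE_ERROR_PROBLEM1,
} my_module_error_t;

/* Hypothetical helper: map an error code to its name for logging. */
const char *my_module_error_str(my_module_error_t err)
{
    /* No `default:` case on purpose, so that `-Wall -Wextra` can still warn
       when a new enumerator is added but not handled here. */
    switch (err)
    {
    case MY_MODULE_ERROR_OK:
        return "MY_MODULE_ERROR_OK";
    case MY_MODULE_ERROR_INVARG:
        return "MY_MODULE_ERROR_INVARG";
    case MY_MODULE_ERROR_NOMEM:
        return "MY_MODULE_ERROR_NOMEM";
    case MY_MODULE_ERROR_PROBLEM1:
        return "MY_MODULE_ERROR_PROBLEM1";
    }

    /* Reached only if `err` holds a value that is not one of the enumerators. */
    return "MY_MODULE_ERROR_UNKNOWN";
}
```

With such a helper, the caller's error branch could shrink to a single `printf("%s\n", my_module_error_str(err));`.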
    

See also:

  1. [another answer of mine which references my answer above] Architectural considerations and approaches to opaque structs and data hiding in C

Additional reading on object-based C architecture:

  1. Providing helper functions when rolling out own structures

Additional reading and justification for valid usage of goto in error handling for professional code:

  1. An argument in favor of the use of goto in C for error handling: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles/blob/master/Research_General/goto_for_error_handling_in_C/readme.md
  2. Excellent article showing the virtues of using goto in error handling in C: "Using goto for error handling in C" - https://eli.thegreenplace.net/2009/04/27/using-goto-for-error-handling-in-c
  3. Valid use of goto for error management in C?
  4. Error handling in C code
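
The functions in this answer only ever jump to a single `done:` label, so here is a small sketch of where the pattern is usually said to pay off: a function that acquires more than one resource and must release the earlier ones when a later step fails. The names `pair_t`, `pair_init`, and `pair_free` are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical type owning two heap buffers. */
typedef struct pair_s
{
    char *a;
    char *b;
} pair_t;

/* Returns 0 on success and -1 on failure. On failure nothing is leaked and
   `*p` is left untouched. */
int pair_init(pair_t *p, size_t len_a, size_t len_b)
{
    char *a = NULL;
    char *b = NULL;

    if (!p || len_a == 0 || len_b == 0)
    {
        goto fail;
    }

    a = calloc(len_a, 1);
    if (!a)
    {
        goto fail; /* nothing acquired yet */
    }

    b = calloc(len_b, 1);
    if (!b)
    {
        goto fail_free_a; /* `a` was acquired and must be released */
    }

    p->a = a;
    p->b = b;
    return 0;

    /* Cleanup labels run in reverse order of acquisition and fall through. */
fail_free_a:
    free(a);
fail:
    return -1;
}

void pair_free(pair_t *p)
{
    if (p)
    {
        free(p->a);
        free(p->b);
        p->a = NULL;
        p->b = NULL;
    }
}
```

Each additional resource adds one label and one cleanup line, instead of repeating every earlier `free` inside every later error branch.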

Search terms to make more googlable: opaque pointer in C, opaque struct in C, typedef enum in C, error handling in C, c architecture, object-based c architecture, dynamic memory allocation at initialization architecture in c

Gabriel Staples
  • 2
    This example was almost perfect, until I saw.......goto. Really? – Michi Jan 05 '20 at 10:02
  • 3
    Yes, really. I used to be really anti goto too, until I started using it professionally. Now that I've written tons & tons of C code which does long & complicated error checking, I have concluded it is the best way to handle error checking, period, and there is no equivalent alternative which makes code as safe & readable and easy to write as goto does. If only you were here with me we could sit down together & I'd spend 1 hr + with you to go over many many examples where the virtues of goto used in this way (& only this way) really shine through, & I think you'd become a convert & use it too. – Gabriel Staples Jan 05 '20 at 15:47
  • See a couple really good examples here (https://eli.thegreenplace.net/2009/04/27/using-goto-for-error-handling-in-c) and here (https://stackoverflow.com/questions/788903/valid-use-of-goto-for-error-management-in-c). This Google search is really good too: "goto for error checking in c". – Gabriel Staples Jan 05 '20 at 15:57
  • There is no code that you cannot write without goto, but this depends on which coding level we are talking about. This is one reason why a lot of coders (and not only new ones) do not understand function pointers or functions which return function pointers, but this is definitely not the place to talk about that. Anyway, I upvoted your answer. – Michi Jan 10 '20 at 21:48
  • I don't really like the idea of goto-statements myself, but at least I'm **consistent**.. I don't really like the way exceptions work either - for the exact same reasons. Maybe even more so than goto-statements, since they not only add unnatural execution flows but also let n00b-devs just skip all notion of input validation or error checks and just push exceptions down from 50+ nested library calls.. The notMyProblem(tm) mentality... -.- – Christoffer Bubach Jan 12 '20 at 00:09
  • As a beginner in C (advanced in Python though) I understand that `goto` is something to stay far far away and everybody dislike it, however I can see the use once the understanding of the language is a top level, will check better these articles and this answer later when I'll have more experience in C, – Federico Baù Dec 30 '20 at 16:31
  • 1
    @FedericoBaù, this isn't quite true (`I understand that goto is something to stay far far away and everybody dislike it,`), but it's definitely an area of contention. As I've programmed professionally in both embedded C and application level C++, I've come to realize that professional developers (myself included) become very very opinionated over time. Some professional software developer teams have declared: "`goto` is the best tool for error handling in C and you SHALL use it." Also, many C developers abhor C++ with a passion, and, many C++ developers abhor C styles in C++ with a passion. – Gabriel Staples Dec 30 '20 at 17:18
  • 2
    Both of these views: C++ developers hating C styles, and C developers hating C++, are wrong in my opinion. My favorite way to write "C" is to use the **C++** compiler, because I can write far more beautiful code that looks like C (but is actually C++) with the C++ compiler than I ever could with the C compiler. Regarding `goto`: the community is split. `goto` is mis-taught in school. **To say it is evil and should NEVER be used is...well...evil, and should NEVER be said. :)** It has its place, when used properly. See my article and other justification in the links in the bottom of my answer. – Gabriel Staples Dec 30 '20 at 17:19
  • For more on some of this C and C++ debate in general, read ["My Thoughts on C++"](https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles/blob/master/git%20%26%20Linux%20cmds%2C%20help%2C%20tips%20%26%20tricks%20-%20Gabriel.txt#L54) here. @FedericoBaù, since you are from Spain, I think you will like my analogy in that article where I begin: "Imagine **C is Spanish**."... – Gabriel Staples Dec 30 '20 at 17:28
  • 1
    @Gabriel Staples, it must be the way I expressed the comment, but I actually agree completely with what you stated. What I meant is that, as a beginner learning C, I'm exposed to what I find around the internet, and so far I have mostly encountered a bad view of `goto` (hence my phrase). So I bumped into your answer and actually found it interesting (because, again, what I mainly see around is that it is "evil"). I now believe it is a tool that is better left for when one becomes more advanced (so not where I currently am) – Federico Baù Dec 30 '20 at 17:35
  • @Gabriel Staples thanks for the info, I saved them for future reference (at least as intermediate level) – Federico Baù Dec 30 '20 at 17:37
  • 1
    I think `goto` keyword is very useful and makes the code more readable. Perfect Answer. – pylover Jan 27 '21 at 14:10
  • [Nearly 3 years later now] Now that I've written a lot of C++ too, I'll add that the advantages of `goto` are more-clearly seen in C than in C++. C++ has other cleanup mechanisms, for example, such as _class destructors_, which can help instead. – Gabriel Staples Dec 02 '21 at 19:30
  • That being said, I'm not afraid to use C styles in C++, depending on the need and situation. If C were Spanish, C++ would be Spanish + French + German + English + Italian + Portuguese all in one ([search this doc I wrote](https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles/blob/master/git%20%26%20Linux%20cmds%2C%20help%2C%20tips%20%26%20tricks%20-%20Gabriel.txt) for "Spanish"). C++ has many valid styles and forms of syntax and architectural patterns, and with only a few minor differences and caveats, "C" is technically one of them. – Gabriel Staples Dec 02 '21 at 19:34
1

`bar(const fooRef)` declares an immutable address as argument. `bar(const foo *)` declares an address of an immutable `foo` as argument.

For this reason, I tend to prefer option 2. I.e., the presented interface type is one where cv-ness can be specified at each level of indirection. Of course one can sidestep the option 1 library writer and just use foo, opening yourself to all sorts of horror when the library writer changes the implementation. (I.e., the option 1 library writer only perceives that fooRef is part of the invariant interface and that foo can come, go, be altered, whatever. The option 2 library writer perceives that foo is part of the invariant interface.)
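
The difference between the two declarations can be checked with a small sketch. The struct is defined in the same listing only so that the function bodies can touch a member; in a real opaque API that definition would live in the `.c` file. The names `bar_ref` and `bar_ptr` are hypothetical.

```c
struct foo {
    int x;
    int y;
};

typedef struct foo *fooRef; /* Option 1 style */
typedef struct foo foo;     /* Option 2 style */

/* `const fooRef` means `struct foo *const`: the pointer is const, but the
   object it points to is not. */
void bar_ref(const fooRef f)
{
    f->x = 42; /* compiles: the object can still be modified */
    /* f = 0;     would NOT compile: the parameter itself is const */
}

/* `const foo *` means pointer to const `struct foo`: the object is read-only
   through this pointer. */
int bar_ptr(const foo *f)
{
    /* f->x = 42;  would NOT compile: assignment to a const-qualified object */
    return f->x;
}
```

So with the Option 1 typedef, a `const` written by the API author never reaches the pointee; expressing read-only access needs either the explicit pointer of Option 2 or a second typedef for the pointer-to-const case.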

I'm more surprised that no one's suggested combined typedef/struct constructions:
`typedef struct { ... } foo;`

tijko
Eric Towers
  • 5
    Regarding your last sentence, these constructions do not admit opaque types. If you use them, you're exposing the definition of the structure in your header for the calling application to abuse. – R.. GitHub STOP HELPING ICE Oct 19 '10 at 04:36
  • In neither option is the layout of `foo` part of the interface. That's the whole point of doing things this way. – Ben Voigt Dec 05 '10 at 01:09
0

Option 3: Give people choice

/*  foo.h  */

typedef struct PersonInstance PersonInstance;

typedef struct PersonInstance * PersonHandle;

typedef const struct PersonInstance * ConstPersonHandle;

void saveStuff (PersonHandle person);

int readStuff (ConstPersonHandle person);

...


/*  foo.c  */

struct PersonInstance {
    int a;
    int b;
    ...
};

...
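
To show how the three typedefs might fit together, here is one possible way to fill in the implementation. It is only a sketch: `person_create` and `person_destroy` are hypothetical additions, the two `int` members stand in for the elided ones, and the bodies of `saveStuff` and `readStuff` are invented because the answer does not say what they do.

```c
#include <stdlib.h>

/* foo.h: the three typedefs and two prototypes from the answer above */
typedef struct PersonInstance PersonInstance;
typedef struct PersonInstance *PersonHandle;
typedef const struct PersonInstance *ConstPersonHandle;

void saveStuff(PersonHandle person);
int readStuff(ConstPersonHandle person);

/* Hypothetical additions for this sketch */
PersonHandle person_create(int a, int b);
void person_destroy(PersonHandle person);

/* foo.c */
struct PersonInstance {
    int a;
    int b;
};

PersonHandle person_create(int a, int b)
{
    PersonHandle person = malloc(sizeof *person);
    if (person) {
        person->a = a;
        person->b = b;
    }
    return person; /* NULL if the allocation failed */
}

void person_destroy(PersonHandle person)
{
    free(person); /* takes the non-const handle, so no cast is needed */
}

/* Invented behavior: a mutator, so it needs the non-const handle. */
void saveStuff(PersonHandle person)
{
    person->a += person->b;
}

/* Invented behavior: a reader; writes through `person` would not compile. */
int readStuff(ConstPersonHandle person)
{
    return person->a;
}
```

A `PersonHandle` converts implicitly to a `ConstPersonHandle` when passed to `readStuff`, so callers only ever hold the non-const handle returned by the constructor.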
madmurphy