Providing helper functions when rolling out own structures

Question

If I am developing a C shared library and I have my own structs. To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself? Is it a good practice? Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?

I know it goes a lot closer to C++ classes but I wish to stick to C and learn how it would be done in a procedural language as opposed to OOP.

To give an example:

typedef struct tag tag;
typedef struct my_custom_struct my_custom_struct;

struct tag
{
    // ...
};

struct my_custom_struct
{
    tag *tags;
    my_custom_struct* (*add_tag)(my_custom_struct* str, tag *tag);  
};

my_custom_struct* add_tag(my_custom_struct* str, tag *tag)
{
    // ... add tag to the list in str->tags ...
    return str;
}

where add_tag is a helper that adds the tag to the tag list inside *str. I saw this pattern in json-c, for example here: http://json-c.github.io/json-c/json-c-0.13.1/doc/html/structarray__list.html. There is a function pointer inside array_list that helps free it.

Why do you need pointers to functions inside the structures? Wouldn't it be simpler just to document the APIs for the functions? Using pointers to functions in the structures becomes helpful if the same structure can contain different pointers. Otherwise, it is obfuscation for no immediately apparent benefit. (C is not a functional language; it is a procedural language.) — Jonathan Leffler, Aug 08 '19 at 15:01
As someone who had to maintain a C code base trying to emulate inheritance and C++ classes using structs, opaque pointers, and explicit vtables, please, don't go down that rabbit hole. It was an absolute nightmare. It was extremely hard to debug as using function pointers adds levels of indirection, making it hard to find where breakpoints needed to be placed. — SirDarius, Aug 08 '19 at 15:07
Side note: C is an *imperative* language. *Functional* languages are a different thing (canonical example: Lisp). Java and C++ have more functional programming character and facilities than does C. — John Bollinger, Aug 08 '19 at 15:09
You can, and you should not. If the function is the same for every instance of a struct in the struct type, the pointer is a waste of space, and of time to initialize it, copy it when the structure is copied, and so on. There are other methods to associate functions with types, starting with simply using a common prefix for their names. — Eric Postpischil, Aug 08 '19 at 15:30
Thanks, I get a sense from multiple comments and it seems like anti-pattern and an overhead. I will provide function definitions separately as a header file. — upInCloud, Aug 08 '19 at 15:32
If you're placing a function pointer in a `struct mystruct` just so you can do `o->func(o)` instead of `mystruct__func(o)` because *"object-oriented programming"*, you're doing it wrong. Some would also discourage typedefing because of worse code readability. — Petr Skocik, Aug 08 '19 at 15:32
I recommend adding `architecture` as a tag to this question. I just added a rather lengthy answer here too: https://stackoverflow.com/questions/57415496/providing-helper-functions-when-rolling-out-own-structures/57419532#57419532. — Gabriel Staples, Aug 08 '19 at 19:56

score 4 · Accepted Answer · answered Aug 08 '19 at 15:44

To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself?

It is possible to endow your structures with members that are function pointers, pointing to function types whose parameters include pointers to your structure type, and that are intended to be used more or less like C++ instance methods, more or less as presented in the question.

Is it a good practice?

TL;DR: no.

The first problem you will run into is getting those pointer members initialized appropriately. Name correspondence notwithstanding, the function pointers in instances of your structure will not automatically be initialized to point to a particular function. Unless you make the structure type opaque, users can (and undoubtedly sometimes will) declare instances without calling whatever constructor-analog function you provide for the purpose, and then chaos will ensue.
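A minimal sketch of that initialization hazard, reusing the question's type names. The `tag_count` field, the contents of `tag`, and the functions `my_custom_struct_init` and `my_custom_struct_is_usable` are assumptions added purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

typedef struct tag { int id; } tag;            /* placeholder contents (assumed) */
typedef struct my_custom_struct my_custom_struct;

struct my_custom_struct {
    tag *tags;
    size_t tag_count;                          /* hypothetical bookkeeping field */
    my_custom_struct *(*add_tag)(my_custom_struct *str, tag *t);
};

/* The function the member is supposed to point to. Sharing the member's
 * name does not connect the two in any way. */
static my_custom_struct *add_tag(my_custom_struct *str, tag *t)
{
    (void)t;                 /* a real version would store the tag */
    str->tag_count++;
    return str;
}

/* Hypothetical constructor-analog: the only place the pointer gets set. */
void my_custom_struct_init(my_custom_struct *str)
{
    assert(str != NULL);
    str->tags = NULL;
    str->tag_count = 0;
    str->add_tag = add_tag;
}

/* Returns 1 if the "method" pointer has been set, 0 if it is still null. */
int my_custom_struct_is_usable(const my_custom_struct *str)
{
    return str->add_tag != NULL;
}
```

A user who writes `my_custom_struct s = {0};` and skips the init call gets a null `s.add_tag`, so `s.add_tag(&s, &t)` is undefined behavior. A plain `my_custom_struct s;` with automatic storage is worse, because the pointer is indeterminate and cannot even be checked.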

If you do make the structure opaque (which after all isn't a bad idea), then you'll need non-member functions anyway, because your users won't be able to access the function pointers directly. Perhaps something like this:

struct my_custom_struct *my_add_tag(struct my_custom_struct *str, tag *tag) {
    return str->add_tag(str, tag);
}

But if you're going to provide for that, then what's the point of the extra level of indirection? (Answer: the only good reason for that would be that in different instances, the function pointer can point to different functions.)
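For contrast, here is a sketch of the case where a per-instance function pointer does earn its place: two lists of the same type that need different cleanup behavior. The `item_list` type and its functions are hypothetical, loosely inspired by the array_list idea mentioned in the question; this is not the json-c API.

```c
#include <assert.h>
#include <stdlib.h>

/* Callback type for releasing one stored element. */
typedef void (item_free_fn)(void *item);

typedef struct item_list {
    void *items[8];          /* fixed capacity keeps the sketch short */
    size_t length;
    item_free_fn *free_fn;   /* differs per instance: that is the point */
} item_list;

void item_list_init(item_list *list, item_free_fn *free_fn)
{
    assert(list != NULL);
    list->length = 0;
    list->free_fn = free_fn;
}

/* Returns 0 on success, -1 if the list is full. */
int item_list_add(item_list *list, void *item)
{
    if (list->length >= sizeof list->items / sizeof list->items[0])
        return -1;
    list->items[list->length++] = item;
    return 0;
}

/* Releases every element with the instance's own callback, if it has one. */
void item_list_clear(item_list *list)
{
    for (size_t i = 0; i < list->length; i++) {
        if (list->free_fn != NULL)
            list->free_fn(list->items[i]);
    }
    list->length = 0;
}
```

A list created with `item_list_init(&owned, free)` releases heap elements on clear, while one created with `item_list_init(&borrowed, NULL)` leaves its elements alone. Note that the operations themselves (`item_list_add`, `item_list_clear`) are still ordinary functions; only the behavior that genuinely varies is stored in the struct.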

And much the same applies if you don't make the structure opaque. Then you might suppose that users would (more) directly call

str->add_tag(str, tag);

but what exactly makes that a convenience with respect to simply

add_tag(str, tag);

?

So overall, no, I would not consider this approach a good practice in general. There are limited circumstances where it may make sense to do something along these lines, but not as a general library convention.

Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?

Not more so than with functions designated any other way, except if the function pointers themselves are being modified.

I know it goes a lot closer to C++ classes but I wish to stick to C and learn how it would be done in a procedural language as opposed to OOP.

If you want to learn C idioms and conventions then by all means do so. What you are describing is not one. C code and libraries can absolutely be designed with use of OO principles such as encapsulation, and to some extent even polymorphism, but it is not conventionally achieved via the mechanism you describe. This answer touches on some of the approaches that are used for the purpose.
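As one illustration of how polymorphism is often handled in C when it is really needed, here is a sketch in which each instance carries a single pointer to a shared, constant table of operations, and callers still go through ordinary prefixed functions. This is my own example and not necessarily what the linked answer describes; every name in it (`shape`, `shape_ops`, `shape_area`, and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct shape shape;

/* One table of operations per kind of shape, shared by all its instances. */
typedef struct shape_ops {
    double (*area)(const shape *s);
} shape_ops;

struct shape {
    const shape_ops *ops;   /* a single pointer, however many operations exist */
    double a, b;            /* meaning depends on the kind of shape */
};

static double rect_area(const shape *s)     { return s->a * s->b; }
static double triangle_area(const shape *s) { return 0.5 * s->a * s->b; }

static const shape_ops rect_ops     = { rect_area };
static const shape_ops triangle_ops = { triangle_area };

shape shape_make_rect(double width, double height)
{
    shape s = { &rect_ops, width, height };
    return s;
}

shape shape_make_triangle(double base, double height)
{
    shape s = { &triangle_ops, base, height };
    return s;
}

/* The public entry point is still a plain function taking the object. */
double shape_area(const shape *s)
{
    assert(s != NULL && s->ops != NULL);
    return s->ops->area(s);
}
```

Because the tables are `static const` and only the `shape_make_*` functions assign them, users never have to wire up function pointers themselves, and the indirection exists only where behavior really differs between instances.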

Thank you for the detailed answer and guidance. I am new to the paradigms of C programming and patterns. This definitely helped me understanding how the design choices are made. — upInCloud, Aug 08 '19 at 16:26

Gabriel Staples · Answer 2 · 2019-12-11T00:19:03.990

Is it a good practice?

TL;DR: no.

Background:

I've been programming almost exclusively in embedded C on STM32 microcontrollers for the last year and a half (as opposed to using C++ or "C+", as I'll describe below). It's been very insightful for me to have to learn C at the architectural level, like I have. I've studied C architecture pretty hard to get to where I can say I "know C". It turns out, as we all know, C and C++ are NOT the same language. At the syntax level, C is almost exactly a subset of C++ (with some key differences where C supports stuff C++ does not), hence why people (myself included before this) frequently think/thought they are pretty much the same language, but at the architectural level they are VASTLY DIFFERENT ANIMALS.

Aside:

Note that my favorite approach to embedded is to use what some colloquially know as "C+". It is basically using a C++ compiler to write C-style embedded code: you write C how you'd expect to write C, except you use C++ classes to vastly simplify the (otherwise pure C) architecture. In other words, "C+" is a nickname for using a C++ compiler to write C-like code that uses classes instead of "object-based C" architecture (which is described below). You may also use some advanced C++ concepts on occasion, like operator overloading or templates, but you avoid the STL for the most part so as not to accidentally use dynamic allocation after initialization (which C++ vectors, for example, do automatically behind the scenes), since dynamic memory allocation/deallocation during normal run-time can quickly use up scarce RAM resources and make otherwise-deterministic code non-deterministic. So-called "C+" may also include using a mix of C (compiled with the C compiler) and C++ (compiled with the C++ compiler), linked together as required (don't forget your extern "C" usage in C header files included in your C++ code, as required).

The core Arduino source code (again, the core, not necessarily their example "sketches" or example code for beginners) does this really well, and can be used as a model of good "C+" design. <== before you attack me on this, go study the Arduino source code for dozens of hours like I have [again, NOT the example "sketches", but their actual source code, linked-to below], and drop your "arduino is for beginners" pride right now.

The AVR core (mix of C and "C+"-style C++) is here: https://github.com/arduino/ArduinoCore-avr/tree/master/cores/arduino
Some of the core libraries ("C+"-style C++) are here: https://github.com/arduino/ArduinoCore-avr/tree/master/libraries

[aside over]

Architectural C notes:

So, regarding C architecture (ie: actual C, NOT "C+"/C-style C++):

C is not an OO language, as you know, but it can be written in an "object-based" style. Notice I say "object-based", NOT "object oriented", as that's how I've heard other pedantic C programmers refer to it. I can say I write object-based C architecture, and it's actually quite interesting.

To make object-based C architecture, here's a few things to remember:

1. Namespaces can be done in C simply by prepending your namespace name and an underscore to each identifier. That's all a namespace really is, after all. Ex: mylibraryname_foo(), mylibraryname_bar(), etc. Apply this to enums, for example, since C doesn't have "enum classes" like C++. Apply it to all C class "methods" too, since C doesn't have classes. Apply it as well to all global variables or defines that pertain to a particular library.
2. When making C "classes", you have 2 major architectural options, both of which are very valid and widely used:
   - Option 1: Use public structs (possibly hidden in headers named "myheader_private.h" to give them a pseudo-sense of privacy)
   - Option 2: Use opaque structs (frequently called "opaque pointers" since they are pointers to opaque structs)
3. When making C "classes", you have the option of wrapping up pointers to functions inside of your structs above to give them a more "C++" type feel. This is somewhat common, but in my opinion a horrible idea which makes the code nearly impossible to follow and very difficult to read, understand, and maintain.

1st option, public structs:

Make a header file with a struct definition which contains all your "class data". I recommend you do NOT include pointers to functions (will discuss later). This essentially gives you the equivalent of a "C++ class where all members are public." The downside is you don't get data hiding. The upside is you can use static memory allocation of all of your C "class objects" since your user code which includes these library headers knows the full specification and size of the struct.
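Here is a small sketch of what Option 1 can look like, combined with the underscore "namespace" prefix described earlier. The `mylib_` prefix, the counter type, and its functions are all made-up names for illustration; in a real library the type and prototypes would sit in a header and the function bodies in a .c file.

```c
#include <assert.h>
#include <stddef.h>

/* Prefixed enum, since C has no "enum class". */
typedef enum mylib_status {
    MYLIB_STATUS_OK,
    MYLIB_STATUS_WRAPPED
} mylib_status;

/* Public struct: users can see every member and know its size. */
typedef struct mylib_counter {
    unsigned value;
    unsigned limit;
} mylib_counter;

/* "Constructor": puts an existing object into a known state. */
void mylib_counter_init(mylib_counter *c, unsigned limit)
{
    assert(c != NULL && limit > 0);
    c->value = 0;
    c->limit = limit;
}

/* "Method": advances by one and reports whether the value wrapped to 0. */
mylib_status mylib_counter_step(mylib_counter *c)
{
    c->value = (c->value + 1) % c->limit;
    return c->value == 0 ? MYLIB_STATUS_WRAPPED : MYLIB_STATUS_OK;
}
```

Since the full struct is visible, user code can simply declare `static mylib_counter counter;` and pass `&counter` to `mylib_counter_init()`, with no heap involved. The cost is that nothing stops the user from poking at `counter.value` directly.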

2nd option: opaque structs:

In your library header file, make a forward declaration of a struct:

/// Opaque pointer (handle) to C-style "object" of "class" type mylibrarymodule:
typedef struct mylibrarymodule_s *mylibrarymodule_h;

In your library .c source file, provide the full definition of struct mylibrarymodule_s. Since users of this library include only the header file, they do NOT get to see the full implementation or size of this opaque struct. That is what "opaque" means here: hidden away. This essentially gives you the equivalent of a "C++ class where all members are private." The upside is you get true data hiding. The downside is that user code can NOT statically allocate any of your C "class objects", since it doesn't even know how big the struct is. Instead, the library must provide the storage itself, typically by doing dynamic memory allocation one time, at program initialization, which is safe even for embedded deterministic real-time safety-critical systems since you are not allocating or freeing memory during normal program execution.
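A compact single-file sketch of Option 2 follows. The handle typedef is the one shown above; the function names `mylibrarymodule_create`, `mylibrarymodule_add`, and `mylibrarymodule_destroy`, and the `total` member, are assumptions for illustration. In a real library the first part would be the header and the second part the .c file.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- header part: all that users of the library get to see ---- */
typedef struct mylibrarymodule_s *mylibrarymodule_h;

mylibrarymodule_h mylibrarymodule_create(int initial);
int mylibrarymodule_add(mylibrarymodule_h me, int amount);
void mylibrarymodule_destroy(mylibrarymodule_h me);

/* ---- source part: the definition stays private to the library ---- */
struct mylibrarymodule_s {
    int total;
};

/* Allocates and initializes an object; returns NULL on allocation failure. */
mylibrarymodule_h mylibrarymodule_create(int initial)
{
    mylibrarymodule_h me = malloc(sizeof *me);
    if (me != NULL)
        me->total = initial;
    return me;
}

/* Adds amount to the hidden total and returns the new total. */
int mylibrarymodule_add(mylibrarymodule_h me, int amount)
{
    assert(me != NULL);
    me->total += amount;
    return me->total;
}

/* Releases the object; passing NULL is harmless. */
void mylibrarymodule_destroy(mylibrarymodule_h me)
{
    free(me);
}
```

User code only ever holds a `mylibrarymodule_h` and calls the prefixed functions on it. It cannot write `sizeof(struct mylibrarymodule_s)` or touch `total`, which is exactly the data hiding described above.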

For a detailed and full example of Option 2 (don't be confused: I call it "Option 1.5" in my answer linked-to here) see my other answer on opaque structs/pointers here: Opaque C structs: how should they be declared?.

Personally, I think Option 1, with static memory allocation and "all public members", may be my preferred approach, but I am most familiar with the opaque struct approach of Option 2, since that's what the C code base I work in the most uses.

Bullet 3 above: including pointers to functions in your structs.

This can be done, and some do it, but I really hate it. Don't do it. It just makes your code so stinking hard to follow. In Eclipse, for instance, which has an excellent indexer, I can Ctrl + click on anything and it will jump to its definition. What if I want to see the implementation of a function I'm calling on a C "object"? I Ctrl + click it and it jumps to the declaration of the pointer to the function. But where's the function??? I don't know! It might take me 10 minutes of grepping and using find or search tools, digging all around the code base, to find the stinking function definition. Once I find it, I forget where I was, and I have to repeat it all over again for every single function, every single time I edit a library module using this approach. It's just bad. The opaque pointer approach above works fantastically instead, and the public struct approach would be easy too.

Now, to directly answer your questions:

To make common operations on these struct instances easier for library consumers, can I provide function pointers to such functions inside the struct itself?

Yes you can, but it only makes calling something easier. Don't do it. Finding the function to look at its implementation becomes really hard.

Is it a good practice?

No, use Option 1 or Option 2 above instead, where you just call C "namespaced" "methods" on every C "object". You simply pass the C "object" (its member data) into the function as the first argument of every call. This means that where in C++ you can do:

myobject.dosomething(a, b);

You'll just have to do in object-based C:

// Notice that you must pass the "guts", or member data
// (`mylibrarymodule` here, a `mylibrarymodule_h` handle), of each C
// "class" into the namespaced "methods" to operate on said C
// "class object"!
// - Essentially you're passing around the guts (member variables)
//   of the C "class" (which guts are frequently referred to as
//   "private data", or just `priv` in C lingo) to each function that
//   needs to operate on a C object
mylibrarymodule_dosomething(mylibrarymodule, a, b);

Would there be issues with respect to multithreading where a utility function is called in parallel with different arguments and so on?

Yes, same as in any multithreaded situation where multiple threads are trying to access the same data. Just add a mutex to each C struct-based "object", and be sure each "method" acting on your C "objects" properly locks (takes) and unlocks (gives) the mutex as required around any access to shared mutable members of the C "object".
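A minimal sketch of that mutex-per-object idea, assuming POSIX threads are available (on a microcontroller an RTOS mutex would play the same role). The `shared_counter` type and its functions are hypothetical names:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct shared_counter {
    pthread_mutex_t lock;   /* guards every member below */
    long value;
} shared_counter;

/* Returns 0 on success, or the error code from pthread_mutex_init. */
int shared_counter_init(shared_counter *c)
{
    assert(c != NULL);
    c->value = 0;
    return pthread_mutex_init(&c->lock, NULL);
}

/* Every "method" takes the lock before touching shared members. */
long shared_counter_increment(shared_counter *c)
{
    pthread_mutex_lock(&c->lock);
    long result = ++c->value;
    pthread_mutex_unlock(&c->lock);
    return result;
}

/* Reads are locked too, so a reader never races with a writer. */
long shared_counter_get(shared_counter *c)
{
    pthread_mutex_lock(&c->lock);
    long result = c->value;
    pthread_mutex_unlock(&c->lock);
    return result;
}

/* Call only once no thread can still be using the object. */
void shared_counter_destroy(shared_counter *c)
{
    pthread_mutex_destroy(&c->lock);
}
```

Any number of threads can then call `shared_counter_increment()` on the same object without losing updates. Calls on different objects never contend, because each object has its own lock. Error checking on the lock and unlock calls is left out to keep the sketch short.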
