1

I've recently been trying to write both a dynamic array library and a matrix library to get my head wrapped around C, and especially pointers.

Lately, however, I've found that I'm doing things differently from some other libraries, such as GSL. I've been trying to write a single function/struct that can handle every type in C plus user-defined ones, but when I look at GSL, and specifically its matrix part, things are defined quite differently. GSL has multiple structs for different data types (matrix_int, matrix_float, matrix_double, etc.) as well as a set of functions that only work with each struct (matrix_int_add, etc.). My question is, is there an advantage in having a function/struct for each data type? Why not just use a void pointer instead to only have one set of those structs/functions?

Raymond Chen
zee
  • Greater type safety and ease of implementation. A "type for all" is an error prone approach for both you and users of the library. – StoryTeller - Unslander Monica Mar 08 '17 at 06:58
  • @StoryTeller so if I'm trying to make my own implementation that also allows the user to put their own types in should I just use the same approach of having a specific struct/function for each type but leaving a "type for all" for the user to use if they would like to use a custom type? – zee Mar 08 '17 at 07:03
  • So what would `matrix_void_add` do? Add n voids together? Or it would have a switch-case for every data type? Even when the programmer knows what data they're using, you'd want to move this from compile time to run-time? – Antti Haapala -- Слава Україні Mar 08 '17 at 07:09
  • Yes and no, actually. You should implement it like that, but you can offer an API that wraps it all up by using a [Generic Selection](http://en.cppreference.com/w/c/language/generic) – StoryTeller - Unslander Monica Mar 08 '17 at 07:10
  • Yes, generic selection could work here; it would make the use of API easier on C11+ – Antti Haapala -- Слава Україні Mar 08 '17 at 07:11
  • I think if I'm understanding your question that I wouldn't add those functions to the void one especially since how I see it it would mostly be used as a way to hold structs etc which you can't really do math. @StoryTeller as nice as that would be I'm trying to stick to c99 for now ;) – zee Mar 08 '17 at 07:13
  • c99 is superseded and replaced by c11, as it goes. It's been six years since the recent publication and compilers have excellent support for it. There's no virtue to sticking to c99. – StoryTeller - Unslander Monica Mar 08 '17 at 07:15
  • @StoryTeller I see, I'll take that into account. I'm just generally confused as when I go on here people encourage me to use c11 but in other places (forums, chats, etc) people encourage using c99. – zee Mar 08 '17 at 07:16
  • Probably because there are opinionated people everywhere :) But if you want to be proficient in modern C programming, then that's C11, not C99 (which coincidentally was also treated with suspicion for years, when people vehemently argued that one should stick to C90) – StoryTeller - Unslander Monica Mar 08 '17 at 07:19
  • That does make sense I suppose, I'm quite new to C so I'm not really sure and have been depending on what people have told me. I guess it wouldn't kill me to use C11, thank you :) – zee Mar 08 '17 at 07:22
  • Could you post a link to the documentation for functions like this one: matrix_int_add ? – 2501 Mar 08 '17 at 08:16
  • @StoryTeller: *"Compilers have excellent support for C11"*. No, they definitely do not. There is excellent support for *parts of C11*, but no compiler I know of fully implements C11 yet. (Because there are many optional parts to C11, the definition of *"fully"* may vary.) C99 is very well supported in comparison -- and I for one would have preferred C11 to include parts of POSIX.1, rather than those oddball/weird/suspicious *"safe `_s()` function variants"* --, and the only one who treated it with suspicion was Microsoft. – Nominal Animal Mar 08 '17 at 08:54
  • @NominalAnimal - Considering quite a few optional features of C11 are mandatory features of C99 that were retconned as optional, I fail to see how a tool-chain that very well supports C99 can fail to support the very same features in C11. What makes its support less than "excellent" in comparison!? As for POSIX support... well, that's why I personally prefer to stick with POSIX systems :) – StoryTeller - Unslander Monica Mar 08 '17 at 09:17
  • @StoryTeller: You forget that there are compilers that support C11 but do not support C99, simply by omitting the support for the features mandatory in C99. There are also differences from C99+POSIX.1/GNU extensions to C11 wrt. atomics, for example. This includes the Microsoft compiler. As to POSIX: yup, me too; I do not want to be restricted to a single *vendor*. I do play with microcontrollers that are programmable via an open toolchain (esp. GCC, again mostly for `__atomic_*()` support; quite useful in interrupt handlers), too. – Nominal Animal Mar 08 '17 at 10:48
  • @NominalAnimal - Yes, well, that particular retconning was indeed done to appease certain vendors whom shall remain unnamed (*cough*)... Microsoft... (*cough*). IMO making previously mandatory features into optional ones just makes the whole discussion about which standard to adopt a moot point. – StoryTeller - Unslander Monica Mar 08 '17 at 10:58
  • @StoryTeller: Agreed. And therein we arrive at my point: I recommend C99 and not C11 precisely for those reasons. The compilers that implement C99 (i.e., all but Microsoft) have implemented it as fully as I care about; I rely on POSIX.1 and occasionally GNU/BSD extensions for the rest. I will only include C11 features as they arrive in stable form in GCC for all architectures I use, and will only recommend specific C11 features as they are implemented in several other C compilers with the same semantics (for portability). Therefore, I recommend C99 over C11. – Nominal Animal Mar 08 '17 at 11:24
  • @NominalAnimal - It's not that hard to get around the under-specification of certain features to get portable code. I recommend C11 precisely to feed the adoption-support feedback loop of reasonable vendors. If enough people don't use C11, how are vendors to know where to focus their development effort? Plus, the more use, the more defect report, the better C2x. – StoryTeller - Unslander Monica Mar 08 '17 at 11:32
  • @StoryTeller: C11 itself is proof otherwise; otherwise unambiguous POSIX.1 features (like `getline()`) would have gotten in. The C standards committee exists only to serve the big member corporations' interests; they care **nothing** about individual developers or developers' needs. Even less about true code portability (as you need POSIX.1 or similar for e.g. filesystem operations anyway). I have zero hope for a better C2x. – Nominal Animal Mar 08 '17 at 11:43
  • @NominalAnimal - The big corporations I know of don't even adopt C11 yet, so I fail to see the pandering. As nice of a chat as this was, I think we can mainly just agree to respectfully disagree on this :) – StoryTeller - Unslander Monica Mar 08 '17 at 11:46
  • @StoryTeller: I'm not "chatting" with you. I'm objecting to your *"there is no virtue in sticking to C99"* comment above, and have tried showing it is factually incorrect. Your opinion and mine are irrelevant; what matters is the advice you give to others should be based on reality and facts, and not your own personal preferences (unless clearly stated as such). – Nominal Animal Mar 08 '17 at 14:00
  • @NominalAnimal - You most certainly were "chatting" with me, at least until I refused to be swayed by your remarks. This overly pedantic and objectionable comment is not a surprise, given your "fact based" belief that the committee exists solely to pander to "Big Corporations". Feel free to go on a diatribe, nothing more for me to add. – StoryTeller - Unslander Monica Mar 08 '17 at 14:11

4 Answers

2

To address the question: yes, there are several advantages to implementing a function for each type, and in some cases it is mandatory. Some libraries aimed at high performance use very specific instructions, chosen according to the variable type, to fetch, process, and write the data efficiently.

A clear example is float versus int: even when they have the same size (commonly 32 bits each), the representation is completely different, and the operations are handled by different execution units: the ALU for int and the FPU for float.
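To make the difference in representation concrete, here is a minimal sketch that copies the raw bytes of a float and of an int into an unsigned integer so they can be compared. The helper names `float_bits` and `int_bits` are made up for this illustration, and the sketch assumes a platform where float is a 32-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
 * Assumes float is 32 bits wide (IEEE 754 binary32). */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* byte copy; avoids aliasing problems */
    return bits;
}

/* Hypothetical helper: return the raw bit pattern of a 32-bit integer. */
static uint32_t int_bits(int32_t i)
{
    uint32_t bits;
    memcpy(&bits, &i, sizeof bits);
    return bits;
}
```

On such a platform, `float_bits(1.0f)` yields `0x3F800000` while `int_bits(1)` yields `0x00000001`, so code that adds or multiplies elements cannot treat the two types interchangeably even though they occupy the same number of bytes.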

Also, only with a C11 compiler can you use _Generic(); with C99 there is no standard way to find out the type of a variable (AFAIK). _Generic() is resolved at compile time, so either way you end up with a function for each type.

Community
GurstTavo
2

You could easily and correctly write a function matrixop(void *arg, int type_of_arg, etc) and then cast arg as necessary according to type_of_arg. Of course, as has already been said, for the function to work correctly, some operations might have to be performed differently for different type_of_arg's. But the user wouldn't have to know that, and would see just the one function. And since you say, "get my head wrapped around C more and especially pointers", I'd definitely recommend you give it a try this way. Your particular matrix example may not be the very, very best situation where void pointers are most useful, but it's definitely fine and dandy for practice.
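As a rough sketch of that idea, the function below takes void pointers plus a type tag and converts them to the right pointer type inside a switch. The enum `elem_type`, its tags, and the function name `array_add` are all hypothetical names chosen for this example, and it works on flat arrays rather than a full matrix struct to keep it short.

```c
#include <stddef.h>

/* Hypothetical type tags; not taken from any real library. */
enum elem_type { ELEM_INT, ELEM_DOUBLE };

/* Element-wise dest[i] = a[i] + b[i] for n elements of the tagged type.
 * Returns 0 on success, -1 if the type tag is not recognised. */
static int array_add(void *dest, const void *a, const void *b,
                     size_t n, enum elem_type type)
{
    size_t i;

    switch (type) {
    case ELEM_INT: {
        int *d = dest;
        const int *x = a, *y = b;
        for (i = 0; i < n; i++)
            d[i] = x[i] + y[i];
        return 0;
    }
    case ELEM_DOUBLE: {
        double *d = dest;
        const double *x = a, *y = b;
        for (i = 0; i < n; i++)
            d[i] = x[i] + y[i];
        return 0;
    }
    default:
        return -1;
    }
}
```

Note that the compiler cannot check that the tag matches the data: passing an int array with `ELEM_DOUBLE` compiles without complaint and then reads the wrong bytes.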

John Forkosh
1

My question is, is there an advantage in having a function/struct for each data type?

Yes. It adds compile-time type checking.

The code used to implement the same operation for different types does differ, and not necessarily by just the element type used. (For matrix operations, the optimum caching strategy may differ between integer and floating-point types of the same size, for example; especially if the hardware supports vectorization.) This means each element type requires its own version of each operation.

It is possible to use some templating techniques to generate element type specific versions of operations that only differ by the type, but usually the end result is more complicated (and thus harder to maintain) than just maintaining the slightly differing implementations separately.
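One common form of such templating is a preprocessor macro that stamps out one function per element type. The sketch below is only an illustration: the macro `DEFINE_VEC_ADD` and the generated `vec_add_*` names are hypothetical, and this is not how GSL generates its own code.

```c
#include <stddef.h>

/* Hypothetical "template": expands to a function vec_add_<SUFFIX>()
 * that adds two arrays of element type T. */
#define DEFINE_VEC_ADD(T, SUFFIX)                                     \
    static void vec_add_##SUFFIX(T *dest, const T *a, const T *b,     \
                                 size_t n)                            \
    {                                                                 \
        size_t i;                                                     \
        for (i = 0; i < n; i++)                                       \
            dest[i] = a[i] + b[i];                                    \
    }

/* One instantiation per supported element type. */
DEFINE_VEC_ADD(int, int)
DEFINE_VEC_ADD(float, float)
DEFINE_VEC_ADD(double, double)
```

Each generated function is fully type-checked, but the macro body is harder to read and debug than ordinary code, and it cannot express per-type differences such as a different blocking or vectorization strategy.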

It is quite possible to add an additional layer -- no modifications, just an additional header file included after GSL --, using the preprocessor and either GCC extensions (__typeof__) or C11 _Generic() to present a single "function" for each matrix operation, that chooses the function called at compile time based on the type of the parameter(s).
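A minimal sketch of such a layer using C11 _Generic might look like the following. The `vec_scale_int` and `vec_scale_double` functions are hypothetical stand-ins for a library's type-specific functions, and the `vec_scale` macro is the single "function" presented to the user.

```c
#include <stddef.h>

/* Hypothetical type-specific implementations. */
static void vec_scale_int(int *v, size_t n, int k)
{
    size_t i;
    for (i = 0; i < n; i++)
        v[i] *= k;
}

static void vec_scale_double(double *v, size_t n, double k)
{
    size_t i;
    for (i = 0; i < n; i++)
        v[i] *= k;
}

/* Single entry point: the function is selected at compile time from
 * the static type of v, so there is no run-time dispatch cost. */
#define vec_scale(v, n, k)                  \
    _Generic((v),                           \
        int *:    vec_scale_int,            \
        double *: vec_scale_double          \
    )((v), (n), (k))
```

Calling `vec_scale` with a pointer type that has no association (say, `char *`) is a compile-time error, so the type checking of the underlying functions is preserved.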

Why not just use a void pointer instead to only have one set of those structs/functions?

Because not only do you lose the compile-time type checking -- the user can supply say a literal string, and the compiler won't warn about it, no matter what warnings are enabled --, but it would also add run-time overhead.

Instead of choosing the proper function (implementation) to call at compile time, the data type field would have to be examined and the correct function called at run time. The generic matrix multiply function, for example, might look like

status_code_type matrix_multiply(void *dest, void *left, void *right)
{
    const element_type  tleft = ((struct generic_matrix_type *)left)->type;
    const element_type  tright = ((struct generic_matrix_type *)right)->type;

    if (tleft != tright)
        return ERROR_TYPES_MISMATCH;

    switch (tleft) {
    case ELEMENT_TYPE_INT:
        return matrix_mul_int_int(dest, left, right);
    case ELEMENT_TYPE_FLOAT:
        return matrix_mul_float_float(dest, left, right);
    case ELEMENT_TYPE_DOUBLE:
        return matrix_mul_double_double(dest, left, right);
    case ELEMENT_TYPE_COMPLEX_FLOAT:
        return matrix_mul_cfloat_cfloat(dest, left, right);
    case ELEMENT_TYPE_COMPLEX_DOUBLE:
        return matrix_mul_cdouble_cdouble(dest, left, right);
    default:
        return ERROR_UNSUPPORTED_TYPE;
    }
}

All of the above code is pure overhead, with the sole purpose of making it "slightly easier" on the programmer. The GSL developers, for example, didn't find it necessary or useful.


Quite a lot of C code -- including most C libraries' FILE implementation -- does utilize a related approach, however: the data structure itself contains function pointers for each operation the data type supports, in an object-oriented fashion.

For example, you could have

struct matrix {
    long     rows;
    long     cols;
    long     rowstep; /* Number of bytes to next row */
    long     colstep; /* Number of bytes to next element */
    size_t   size;    /* Size of each element */
    int      type;    /* Type of each element */
    char    *data;    /* Logically void*, but allows pointer arithmetic */
    int    (*supports)(int, int);
    int    (*get)(struct matrix *, long, long, int, void *);
    int    (*set)(struct matrix *, long, long, int, const void *);
    int    (*mul)(struct matrix *, long, long, int, const void *);
    int    (*div)(struct matrix *, long, long, int, const void *);
    int    (*add)(struct matrix *, long, long, int, const void *);
    int    (*sub)(struct matrix *, long, long, int, const void *);
};

where the

int supports(int source_type, int target_type);

is used to find out whether the other callbacks support the necessary operations between the two types, and the rest of the member functions,

int get(struct matrix *m, long row, long col, int to_type, void *to);
int set(struct matrix *m, long row, long col, int from_type, const void *from);
int mul(struct matrix *m, long row, long col, int by_type, const void *by);
int div(struct matrix *m, long row, long col, int by_type, const void *by);
int add(struct matrix *m, long row, long col, int by_type, const void *by);
int sub(struct matrix *m, long row, long col, int by_type, const void *by);

operate on a single element of a given matrix. Note how we need to pass a reference to the matrix itself: if we call e.g. some->get(...), the function that the get function pointer points to does not automatically receive a pointer to the structure through which it was called.

Also note how the value read from the matrix (get), or otherwise used in the operation, is provided via a pointer; and the type of the data specified by the pointer is separately provided. This is needed, if you want a function that say initializes a matrix to identity to work, without the user implementing every single matrix operation function for their custom type themselves.

Because access to an element involves an indirect call, the overhead of the function pointers is quite significant -- especially if you consider how simple and fast the single-element operations actually are. (For example, a 5 clock cycle indirect call overhead on an operation that itself only takes 10 clock cycles, adds 50% overhead!)
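To illustrate the callback layout described above, here is a cut-down sketch with only the get and set members, a float-backed implementation, and a type-agnostic identity initializer that goes through the callbacks. The type tags, `element_at`, `float_get`, `float_set`, `matrix_init_float`, and `matrix_set_identity` are all names invented for this example.

```c
#include <stddef.h>
#include <string.h>

enum { TYPE_FLOAT = 1, TYPE_DOUBLE = 2 };   /* hypothetical type tags */

struct matrix {
    long   rows, cols;
    long   rowstep;   /* Number of bytes to next row */
    long   colstep;   /* Number of bytes to next element */
    char  *data;
    int  (*get)(struct matrix *, long, long, int, void *);
    int  (*set)(struct matrix *, long, long, int, const void *);
};

/* Address of element (row, col), or NULL if out of bounds. */
static char *element_at(struct matrix *m, long row, long col)
{
    if (row < 0 || row >= m->rows || col < 0 || col >= m->cols)
        return NULL;
    return m->data + row * m->rowstep + col * m->colstep;
}

/* get callback for float storage: converts the element to to_type. */
static int float_get(struct matrix *m, long row, long col,
                     int to_type, void *to)
{
    const char *p = element_at(m, row, col);
    float value;

    if (!p)
        return -1;
    memcpy(&value, p, sizeof value);
    switch (to_type) {
    case TYPE_FLOAT:  *(float *)to = value;  return 0;
    case TYPE_DOUBLE: *(double *)to = value; return 0;
    default:          return -1;
    }
}

/* set callback for float storage: converts from from_type to float. */
static int float_set(struct matrix *m, long row, long col,
                     int from_type, const void *from)
{
    char *p = element_at(m, row, col);
    float value;

    if (!p)
        return -1;
    switch (from_type) {
    case TYPE_FLOAT:  value = *(const float *)from;          break;
    case TYPE_DOUBLE: value = (float)*(const double *)from;  break;
    default:          return -1;
    }
    memcpy(p, &value, sizeof value);
    return 0;
}

/* Wrap caller-owned, row-major float storage in a struct matrix. */
static void matrix_init_float(struct matrix *m, float *storage,
                              long rows, long cols)
{
    m->rows = rows;
    m->cols = cols;
    m->colstep = (long)sizeof(float);
    m->rowstep = cols * (long)sizeof(float);
    m->data = (char *)storage;
    m->get = float_get;
    m->set = float_set;
}

/* Knows nothing about the element type; relies on set() converting
 * from double. Returns 0 on success, -1 if any set() call fails. */
static int matrix_set_identity(struct matrix *m)
{
    const double zero = 0.0, one = 1.0;
    long r, c;

    for (r = 0; r < m->rows; r++)
        for (c = 0; c < m->cols; c++)
            if (m->set(m, r, c, TYPE_DOUBLE, r == c ? &one : &zero))
                return -1;
    return 0;
}
```

`matrix_set_identity` makes one indirect call per element, which is exactly the overhead discussed above; a type-specific loop over the float storage would avoid it.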

StoryTeller - Unslander Monica
Nominal Animal
0

It depends what your functions do. If they do not actually use the data at all then void* could be correct, whereas if they do need to know anything about the data then specifying the types is the correct way to go.

For example, your dynamic array library probably does not need separate functions to add, remove, sort (etc) int and float data items to the array. In this instance the functions do not need to know anything about the type of the object being stored, just its location; in which case passing a void* is correct.
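A minimal sketch of such a type-agnostic dynamic array follows. It stores elements as raw bytes and only needs to know the element size; the struct and function names (`dynarray`, `dynarray_push`, and so on) are hypothetical, and the growth policy (doubling, starting at 8) is just one reasonable choice.

```c
#include <stdlib.h>
#include <string.h>

struct dynarray {
    char   *data;       /* raw storage, count * elem_size bytes in use */
    size_t  elem_size;  /* size of one element in bytes */
    size_t  count;      /* number of elements stored */
    size_t  capacity;   /* number of elements allocated */
};

static void dynarray_init(struct dynarray *a, size_t elem_size)
{
    a->data = NULL;
    a->elem_size = elem_size;
    a->count = 0;
    a->capacity = 0;
}

/* Append a copy of *elem. Returns 0 on success, -1 if out of memory. */
static int dynarray_push(struct dynarray *a, const void *elem)
{
    if (a->count == a->capacity) {
        size_t new_cap = a->capacity ? a->capacity * 2 : 8;
        char *p = realloc(a->data, new_cap * a->elem_size);
        if (!p)
            return -1;
        a->data = p;
        a->capacity = new_cap;
    }
    memcpy(a->data + a->count * a->elem_size, elem, a->elem_size);
    a->count++;
    return 0;
}

/* Pointer to element index, or NULL if index is out of range. */
static void *dynarray_at(struct dynarray *a, size_t index)
{
    return index < a->count ? a->data + index * a->elem_size : NULL;
}

static void dynarray_free(struct dynarray *a)
{
    free(a->data);
    dynarray_init(a, a->elem_size);
}
```

The same functions work for int, double, or a user-defined struct, because none of them ever interprets the bytes. An operation that must compare or combine elements would need extra help from the caller, the way qsort() takes a comparison callback.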

On the other hand, the matrix library may need different prototypes for the different data types, because int and float (etc) data use different instructions to manipulate them.

Evil Dog Pie