I'm trying to create a function whose parameters are passed as void pointers, together with a parameter that sets the data type the void pointers will be cast to, so that the function can be used on different types. Something like the following, which does not work:

void test_function(int use_type, void * value, void * array) {
    // Set types to the parameters based on 'use_type'
    if (use_type == 0) { // Int type
        int * valueT = (int *) value;
        int * arrayT = (int *) array;
    } else if (use_type == 1) { // Double type
        double * valueT = (double *) value;
        double * arrayT = (double *) array;
    }
    // Main code of the program, setting an array item, regardless of type
    arrayT[0] = *valueT;
}

There are two problems with the above code: the properly typed valueT and arrayT are scoped in the conditional blocks and not visible to the main part of the code. Moving their declarations out of the blocks isn't viable in the given structure of the code though, as they would then need different names for int and double versions, defeating the whole idea of what I'm trying to achieve. The other problem is that valueT and arrayT are local to the function. What I really want is to set the parameter array: array[0] = *value.

It appears that what I'm trying to do isn't possible in C... Is there a way that this could be done?

EDIT:

The assignment to array line is there to demonstrate what I want to do, there is a lot more code in that part. There will also be a number of other types besides int and double. Moving the assignment line into the blocks would mean too much code duplication.

Theo d'Or
  • 3
    Why not assign to the array within the block where the value to assign is defined? – Scott Hunter Jan 21 '20 at 19:49
  • 5
    What exactly are you trying to do? Depending on a usecase you can have different solutions. One idiomatic way is to use `union` of different types wrapped in `struct` containing the "type". – Eugene Sh. Jan 21 '20 at 19:50
  • @ScottHunter Please see the edit to the question, where I make clear that moving the main part of the code into the blocks would mean too much code duplication. – Theo d'Or Jan 21 '20 at 19:53
  • @EugeneSh. I want to be able to manipulate arrays in a 'type-neutral' manner. I thought it might be possible with void pointer parameters. Using unions and structs isn't viable for this as that would mean 'wrapping' the original values. – Theo d'Or Jan 21 '20 at 19:55
  • Could you use an array of void pointers instead? – Schwern Jan 21 '20 at 19:59
  • @Schwern Unfortunately not, the arrays will contain values directly. – Theo d'Or Jan 21 '20 at 20:01
  • 2
    If you insist on using the function and you want to avoid code duplication, ditch the `int use_type` for a `size_t type_size` (or `int type_size`) and then just have `{ memcpy(array, value, type_size); }` as the function body. – Petr Skocik Jan 21 '20 at 20:06
  • 1
    @Theod'Or Is your intention to have a void array which contains just integers or just doubles? Or are you trying to mix them? It seems like a bad idea to circumvent the type system like this. [Maybe you can tell us what the problem is which led you to attempt this solution](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem)? – Schwern Jan 21 '20 at 20:06
  • When you call `test_function()`, are you expecting the `use_type` parameter to typically be known at compile-time, or only at runtime? – Patrick Roberts Jan 21 '20 at 20:06
  • @Schwern The arrays would have unique types, one array ints, another doubles, etc., so not mixing them. – Theo d'Or Jan 21 '20 at 20:08
  • @PatrickRoberts `use_type` would be known at runtime. – Theo d'Or Jan 21 '20 at 20:09
  • @Schwern "The problem that led me to attempt a solution" is that I have array-manipulation functionality to code that works the same on arrays of different types. I'm trying to create a function that can perform the work on various array types, without code duplication, without a function for each type. – Theo d'Or Jan 21 '20 at 20:12
  • 2
    It's difficult to fully answer this question without knowing everything you intend to do in the "main code of the program". Obviously your question has already been answered for the trivial case of copying the value into the array type-agnostically, but we can't give you a real answer until you clarify what specific operations are needed. Is the `memcpy()` approach that @PSkocik suggested not sufficient for your use-case? – Patrick Roberts Jan 21 '20 at 20:14
  • @Theod'Or Can you change them to take a `union` type? – Schwern Jan 21 '20 at 20:17
  • @PatrickRoberts As it happens, `memmove` will indeed be used in the main block of code, but that's just one small part of it. I can't post the whole code, it's too much. – Theo d'Or Jan 21 '20 at 20:17
  • @Theod'Or You can mess with an array completely type-agnostically with `memcpy`/`memmove`--you just need the size of the type. It will only be efficient if the compiler can, due to having inlined the function, see the size arguments (and, on some architectures, if it can infer alignments), but it will be generic even if it can't. – Petr Skocik Jan 21 '20 at 20:18
  • @Schwern Can't use `union` or `struct` wrapping. – Theo d'Or Jan 21 '20 at 20:18
  • 3
    If you can't post the code that is at least relevant to the constraints of any solution, then you can hardly expect an answer that solves your problem(s). We can only solve the problem you have asked about - and that is done. Regarding a `union`, "can't", won't or don't know how? If there are constraints on any solution, then that should be in the question. Preferably with a justification for the constraint, since it may be your lack of knowledge that makes you think you can't, or it may be a genuine technical issue that we can accept. – Clifford Jan 21 '20 at 20:42

4 Answers

4

You're trying to implement polymorphism in C. Down this path lies madness, unmaintainable code, and new programming languages.

Instead, I strongly recommend refactoring your code to use a better method of working with mixed data. union or struct or pointers or any of the solutions here. This will be less work in the long run and result in faster and more maintainable code.

Or you can switch to C++ and use templates.

Or you can use somebody else's implementation, like GLib's GArray. This is a system of clever macros and functions that allows easy access to any type of data in an array. It's open source, so you can examine its implementation. It has many features, like automatic resizing and reference counting. And it is very mature and well tested.

A GArray remembers its type, so it isn't necessary to keep telling it.

    GArray *ints = g_array_new(FALSE, FALSE, sizeof(int));
    GArray *doubles = g_array_new(FALSE, FALSE, sizeof(double));

    int val1 = 23;
    double val2 = 42.23;

    g_array_append_val(ints, val1);
    g_array_append_val(doubles, val2);

The underlying plain C array can be accessed as the data field of the GArray struct. It's typed gchar * so it must be recast.

    double *doubles_array = (double *)doubles->data;
    printf("%f", doubles_array[0]);

If we continue down your path, the uncertainty about the type infects every "generic" function and you wind up writing parallel implementations anyway.

For example, let's write a function that adds the elements at two indexes together. Something which should be simple.

First, let's do it conventionally.

#include <stdio.h>

// One function per element type; each is trivially type-safe.
int add_int(int *array, size_t idx1, size_t idx2) {
    return array[idx1] + array[idx2];
}

double add_double(double *array, size_t idx1, size_t idx2) {
    return array[idx1] + array[idx2];
}

int main() {
    int ints[] = {5, 10, 15, 20};

    int value = add_int(ints, 1, 2);

    printf("%d\n", value);
}

Taking advantage of token concatenation, we can put a clever macro in front of that to choose the correct function for us.

#define add(a, t, i1, i2) (add_ ## t(a, i1, i2))

int main() {
    int ints[] = {5, 10, 15, 20};

    int value = add(ints, int, 1, 2);

    printf("%d\n", value);
}

The macro is clever, but probably not worth the extra complexity. So long as you're consistent about the naming, the programmer can choose between the _int and _double forms themselves. But it's there if you like.


Now let's see it with "one" function.

#include <stdio.h>
#include <stdlib.h>

// Using an enum gives us some type safety and code clarity.
enum Types { _int, _double };

void *add(void * array, enum Types type, size_t idx1, size_t idx2) {
    // Using an enum on a switch, with -Wswitch, will warn us if we miss a type.
    switch(type) {
        case _int : {
            int *sum = malloc(sizeof(int));
            if (sum == NULL) return NULL;
            *sum = (int *){array}[idx1] + (int *){array}[idx2];
            return sum;
        }
        case _double : {
            double *sum = malloc(sizeof(double));
            if (sum == NULL) return NULL;
            *sum = (double *){array}[idx1] + (double *){array}[idx2];
            return sum;
        }
    }
    // Not reached for a valid type, but every path must return a value.
    return NULL;
}

int main() {
    int ints[] = {5, 10, 15, 20};

    // Keep the pointer so the allocation can be freed after reading it.
    int *sum = (int *)add((void *)ints, _int, 1, 2);
    if (sum == NULL) return 1;
    int value = *sum;
    free(sum);

    printf("%d\n", value);
}

Here we see the infection. We need a return value, but we don't know the type, so we have to return a void pointer. That means we need to allocate memory of the correct type. And we need to access the array with the correct type, more redundancy, more typecasting. And then the caller has to mess with a bunch of typecasting.

What a mess.

We can clean up some of the redundancy with macros.

#define get_idx(a,t,i) ((t *){a}[i])
#define make_var(t) ((t *)malloc(sizeof(t)))

void *add(void * array, enum Types type, size_t idx1, size_t idx2) {
    switch(type) {
        case _int : {
            int *sum = make_var(int);
            *sum = get_idx(array, int, idx1) + get_idx(array, int, idx2);
            return sum;
        }
        case _double : {
            double *sum = make_var(double);
            *sum = get_idx(array, double, idx1) + get_idx(array, double, idx2);
            return sum;
        }
    }
    // Not reached for a valid type, but every path must return a value.
    return NULL;
}

You can probably reduce the redundancy with even more macros, like Patrick's answer, but boy is this rapidly turning into macro hell. At a certain point you're no longer coding in C so much as in a rapidly expanding custom language implemented with stacks of macros.

Clifford's very clever idea of using sizes rather than types will not work here. In order to actually do anything with the values we need to know their types.


Once again, I cannot express strongly enough how big of a tar pit polymorphism in C is.

Schwern
  • Thank you for the extended analysis, it is very useful. I've read up on C++ templates, and they too result in code duplication, although at compiled object level if not source. Given the choice between C++ templates and C macros, I choose macros. – Theo d'Or Jan 22 '20 at 08:31
  • 1
    @Theod'Or Macros are literally the compiler cutting and pasting code for you; run `cc -E` to see the result of macro expansion. If you want code that works with multiple types, you're going to wind up with duplication at some level. Templates have the advantage of being standardized. They're well tested. Compilers will optimize them better. Other programmers will understand them better. They'll result in easier to maintain and smaller code. – Schwern Jan 22 '20 at 17:21
3

Instead of passing a type identifier, it is sufficient and simpler to pass the size of the object:

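#include <stddef.h>
#include <string.h>

void test_function( size_t sizeof_type, void* value, void* array ) 
{
    size_t element_index = 0 ; // for example

    // Byte offset of the element, then a raw copy of sizeof_type bytes
    memcpy( (char*)array + element_index * sizeof_type, value, sizeof_type ) ; 
}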
Clifford
  • This is how [GArray](https://developer.gnome.org/glib/2.62/glib-Arrays.html) works: it stores the size, not the type. However, passing in the size doesn't let you use the resulting values for anything but assigning them back to the array. – Schwern Jan 21 '20 at 21:45
  • @Schwern Sure, but that is all that is required by the pseudocode in the question. As I have already commented in the question, it is clearly not possible to solve problems not presented, and if that is required the question needs clarification in the question itself, not in the comments. – Clifford Jan 21 '20 at 22:16
1

The direct answer to your question is to do the dereferencing and assignment inside the block in which the typed pointers are in scope:

void test_function(int use_type, void * value, void * array) {
    // Set types to the parameters based on 'use_type'
    if (use_type == 0) { // Int type
        int * valueT = value, *arrayT = array; //the casts in C are unnecessary
        arrayT[0] = *valueT;
    } else if (use_type == 1) { // Double type
        double * valueT = value, *arrayT = array;
        arrayT[0] = *valueT;
    }
}

but you should probably be doing this inline, without any type<->int translation:

(type*){array}[0] = *(type*){value} //could make it DRY with a macro
Petr Skocik
  • Thanks, but it looks like you missed my edit to the question - there will be too much code and too many types to make duplicating of the main part into the conditional blocks viable. – Theo d'Or Jan 21 '20 at 20:06
  • 2
    Lines like `arrayT[0] = *valueT;` will result in completely different code generation depending on the type of `arrayT` and `valueT`. There is no opportunity for code consolidation as written. You could try to abstract it out by doing manual pointer arithmetic, as others have noted. – Raymond Chen Jan 21 '20 at 22:00
1

In order to remain type-agnostic and maintain the flexibility of usage you appear to want, you'll need to move your "main code" into a macro and call it for each case:

typedef enum {
    USE_TYPE_INT = 0,
    USE_TYPE_DOUBLE = 1,
    // ...
} USE_TYPE;

void test_function(USE_TYPE use_type, void * value, void * array) {

#define TEST_FUNCTION_T(type) do { \
    type * valueT = value;         \
    type * arrayT = array;         \
    /* Main code of the program */ \
    arrayT[0] = *valueT;           \
    /* ... */                      \
} while(0)

    // Set types to the parameters based on 'use_type'
    switch (use_type) {
        case USE_TYPE_INT:
            TEST_FUNCTION_T(int);
            break;
        case USE_TYPE_DOUBLE:
            TEST_FUNCTION_T(double);
            break;
        // ...
    }

#undef TEST_FUNCTION_T

}

Note that, while you only define the TEST_FUNCTION_T macro once, each usage will result in a duplicate code block differing only by the type pasted into the macro call when the program is compiled.

Patrick Roberts
  • I was hoping there was a more elegant solution, but there isn't, as code duplication is unavoidable. Turning the main code block into a macro isn't so bad really, it just means some extra care and an additional line ending of ` \`. Thank you! – Theo d'Or Jan 22 '20 at 08:28