6

I am getting back into using C, but I've been spoiled by generics in other languages. I have made it to the following piece of code in my implementation of a resizable array:

typdef struct {
  void** array;
  int length;
  int capacity;
  size_t type_size;
} Vector;

void vector_add(Vector* v, void* entry) {
  // ... code for adding to the array and resizing
}

int main() {
  Vector* vector = vector_create(5, sizeof(int));
  vector_add(vector, 4); // This is erroneous...
  // ...
}

In my attempt to make this generic, I'm now unable to add an integer to the vector without storing it in memory somewhere else.

Is there any way to make this work (either as is, or possibly a better approach to generics)?

sdasdadas
  • 23,917
  • 20
  • 63
  • 148
  • 1
    @Jason: AFAIK, C is type-safe, except for casts and unions. He can achieve this through _evil_ use of macros. – SLaks Aug 08 '13 at 20:07
  • @Jason I've done it using macros (as SLaks alluded to), but given the choice would much rather just use C++ if I need it that badly – Dennis Meng Aug 08 '13 at 20:10
  • 1
    Your `type_size` field seems redundant, when all you can store are `void*`-sized data. You certainly can go for storing memory blocks of an arbitrary fixed and consistent size, and not store pointers. – Ben Voigt Aug 08 '13 at 20:14
  • @BenVoigt Thank you, I hadn't realized. – sdasdadas Aug 08 '13 at 20:19
  • @SLaks: Sorry, I was *way* too terse in my comment. What I meant was with the use of `void*`. – jason Aug 08 '13 at 20:24
  • @Stu Questions like these are my way of learning about the boundaries of a language. Your comment is unhelpful (in a few ways). – sdasdadas Aug 21 '13 at 19:35
  • If you're only doing this as an academic exercise, I apologize. I just had a really, really bad day debugging a DLL written by someone learning C++ along the way (and failing miserably). My first reaction was therefore "if I see this in production code I will find you and hurt you". – Stu Aug 22 '13 at 02:31

9 Answers9

7

For my answer I am assuming that you are not familiar with the sections of memory (ie the use of the memory pool).

In my attempt to make this generic, I'm now unable to add an integer to the vector without storing it in memory somewhere else.

If you want to create a generic structure (as you did) then you will need to use void pointers. Consequently, from the use of void pointers you will need to store the values for each field on the memory pool, or uncommonly on the stack. Note, the structure is composed of void pointers and hence only memory addresses are contained within the structure, pointing to other locations in memory where the values are.

Be careful if you declare them on the stack as once your stack frame is popped from the call stack those memory addresses are not considered to be valid and hence may be used by another stack frame (overwriting your existing values within that collection of memory addresses).

Aside: If you migrate to C++ then you can consider the use of C++ templates.

Jacob Pollack
  • 3,703
  • 1
  • 17
  • 39
  • I can't point to a single sentence that's wrong in isolation, but the conclusion you infer is wrong. Values need to be stored somewhere, but that doesn't have to be "somewhere else". – Ben Voigt Aug 08 '13 at 20:29
  • @BenVoigt, my response is assuming that he does not understand the sections of memory. I see your confusion though, I made it more clear. – Jacob Pollack Aug 08 '13 at 20:47
2

Yes; you can embrace Greenspun's Tenth Rule and develop a full blown dynamic language in C, and in the process, develop a relatively clean C run time that can be used from within C.

In this project I did just that, as have others before me.

In the C run time of this project, a generic number would be created from a C number like this:

val n = num(42);

because of the way val is represented, it takes up only a machine word. A few bits of type tag are used to distinguish a number from a pointer, from a character, etc.

There is also this:

val n = num_fast(42);

which is much faster (a bit manipulation macro) because it doesn't do any special checks that the number 42 fits into the "fixnum" range; it's used for small integers.

A function that adds its argument to every element of a vector could be written (very inefficiently) like this:

val vector_add(val vec, val delta)
{
   val iter;
   for (iter = zero; lt(iter, length(vec)); iter = plus(iter, one)) {
      val *pelem = vecref_l(vec, iter);
      *pelem = plus(*pelem, delta);
   }
   return nil;
}

Since plus is generic, this will work with fixnums, bignums and reals, as well as with characters, since it is possible to add integer displacements to characters via plus.

Type mismatch errors will be caught by the lower level functions and turned into exceptions. For instance if vec isn't something to which length can be applied, length will throw.

Functions with a _l suffix return a location. Wherease vecref(v, i) returns the value at offset i in vector v, vecref_l(v, i) returns a pointer to the val typed location in the vector which stores that value.

It's all C, just with the ISO C rules bent a little bit: you can't make a type like val efficiently in strictly conforming C, but you can do it quite portably to architectures and compilers you care about supporting.

Our vector_add isn't generic enough. It's possible to do better:

val sequence_add(val vec, val delta)
{
   val iter;
   for (iter = zero; lt(iter, length(vec)); iter = plus(iter, one)) {
      val elem = ref(vec, iter);
      refset(vec, iter, plus(elem, delta));
   }
   return nil;
}

By using the generic ref and refset, this now works with lists and strings also, not only vectors. We can do something like:

val str = string(L"abcd");
sequence_add(str, num(2));

The contents of str will change to cdef since a displacement of 2 is added to each character, in place.

Kaz
  • 55,781
  • 9
  • 100
  • 149
1

Your idea can be done:

int *new_int = (int*)malloc(sizeof(int));
*new_int = 4;
vector_add(vector, new_int);

Naturally, it would be a good idea to do a int *create_int(int x) function or something similar:

int *create_int(int x)
{
    int *n = (int*)malloc(sizeof(int));
    *n = 4;
    return n;
}
//...
vector_add(vector, create_int(4));

If your environment allows it you may consider using a well tested, widely used library that already manages all that, such as Glib. Or even C++.

rodrigo
  • 94,151
  • 12
  • 143
  • 190
  • 1
    [Don't cast the return value of malloc](http://stackoverflow.com/a/605858/900873). – Kevin Aug 08 '13 at 20:20
  • Thanks, but this is too much overhead (for the programmer) for storing a number. This was just a learning exercise, but it's good to butt up against the limits of a language sometimes. – sdasdadas Aug 08 '13 at 20:20
1

You can avoid having many many small allocations by storing the data instead of pointers to it, like

typedef struct {
  char* array;
  int length;
  int capacity;
  size_t type_size;
} Vector;

bool vector_add(Vector* v, void* entry)
{
    if (v->length < v->capacity || vector_expand(v)) {
        char* location = v->array + (v->length++)*(v->type_size);
        memcpy(location, entry, v->type_size);
        return 1;
    }
    return 0; // didn't fit
}

int main()
{
    Vector* vector = vector_create(5, sizeof(int));
    int value = 4;
    vector_add(vector, &value); // pointer to local is ok because the pointer isn't stored, only used for memcpy
}
Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
  • Note that in C we are used to the asterisk cuddling with the object, not the type: `bool vector_add(Vector *v, void *entry)`. – Jens Aug 08 '13 at 20:27
  • @Jens: I used the same convention as the question. – Ben Voigt Aug 08 '13 at 20:29
  • @BenVoigt This probably wouldn't be great for storing large objects, though? – sdasdadas Aug 08 '13 at 20:43
  • @sdasdadas: That really depends on how often the vector will have to grow. If you can estimate the size in advance, or you fill it once and then read it a lot, storing objects sequentially can have better performance than pointers, because cache is used more effectively. – Ben Voigt Aug 08 '13 at 21:00
1

Yes, here's an implementation of mine (similar to yours) that may help. It uses macros that can be wrapped with function calls for immediate values.

#ifndef VECTOR_H
# define VECTOR_H

# include <stddef.h>
# include <string.h>

# define VECTOR_HEADROOM 4

/* A simple library for dynamic
 * string/array manipulation
 *
 * Written by: Taylor Holberton
 * During: July 2013
 */

struct vector {
    void * data;
    size_t  size, len;
    size_t  headroom;
};

int vector_init (struct vector *);

size_t vector_addc  (struct vector *, int index, char c);
size_t vector_subc  (struct vector *, int index);

// these ones are just for strings (I haven't yet generalized them)
size_t vector_adds (struct vector *, int index, int iend, const char * c);
size_t vector_subs (struct vector *, int ibegin, int iend);

size_t vector_addi (struct vector *, int index, int i);
size_t vector_subi (struct vector *, int index);

# define vector_addm(v, index, datatype, element)                        \
do {                                                                    \
    if (!v) return 0;                                               \
                                                                    \
    if (!v->size){                                                  \
            v->data = calloc (v->headroom, sizeof (datatype));      \
            v->size = v->headroom;                                  \
    }                                                               \
                                                                    \
    datatype * p = v->data;                                         \
                                                                    \
    if (v->len >= (v->size - 2)){                                   \
            v->data = realloc (v->data,                             \
                    (v->size + v->headroom) * sizeof (datatype));   \
            p = v->data;                                            \
            memset (&p[v->size], 0, v->headroom * sizeof(datatype));\
            v->size += v->headroom;                                 \
    }                                                               \
                                                                    \
    if ((index < 0) || (index > v->len)){                           \
            index = v->len;                                         \
    }                                                               \
                                                                    \
    for (int i = v->len; i >= index; i--){                          \
            p[i + 1] = p[i];                                        \
    }                                                               \
                                                                    \
    p[index] = element;                                             \
                                                                    \
    v->len++;                                                       \
                                                                    \
} while (0)


# define vector_subm(v, index, datatype)                                 \
do {                                                                    \
    if (!v || !v->len){                                             \
            return 0;                                               \
    }                                                               \
                                                                    \
    if ((index < 0) || (index > (v->len - 1))){                     \
            index = v->len - 1;                                     \
    }                                                               \
                                                                    \
    datatype * p = v->data;                                         \
                                                                    \
    for (int i = index; i < v->len; i++){                           \
            p[i] = p[i + 1];                                        \
    }                                                               \
                                                                    \
    v->len--;                                                       \
                                                                    \
    if ((v->size - v->len) > v->headroom){                          \
            v->data = realloc (v->data, ((v->size - v->headroom) + 1) * sizeof (datatype));\
            v->size -= v->headroom;                                 \
    }                                                               \
                                                                    \
} while (0)

#endif

And I usually wrap them like:

size_t vector_addi (struct vector * v, int index, int i){
    vector_addm (v, index, int, i);
    return v->len;
}

I haven't had this code-reviewed, but I've been using it in a large program I'm writing and I haven't had any memory errors from them (using valgrind).

The only thing that is really missing (I've been meaning to add) the ability to add and subtract arrays from arrays.

Edit: I believe you can also do this same sort of thing with stdarg.h, but I've never tried it.

tay10r
  • 4,234
  • 2
  • 24
  • 44
  • I might be missing the point but it seems you still have different functions for adding characters versus integers. – sdasdadas Aug 08 '13 at 20:42
  • 1
    @sdasdadas the function names are different, but they call use the same macro. You can alternatively use the macro instead of the function call, but it will likely take up a little bit more memory. – tay10r Aug 08 '13 at 20:46
  • 1
    @sdasdadas you might also want to read up on `stdarg.h` because I think you can do it with that as well. It allows you to use an arbitrary number of parameters with no specific type. – tay10r Aug 08 '13 at 21:00
1

You asked for a better approach? Here ist is: https://github.com/m-e-leypold/glitzersachen-demos/tree/master/generix/v0-2011 (Disclosure: This is my code).

Let me explain very shortly:

  • I wanted type safe generic containers (which in other languages would be provided by proper generics (Ada) or parametric polymorphism (OCaml). This is the the feature that is most missing in C.

  • Macros just cannot do it (I'm not going to explain that in detail. Suffice to say: The result of a template expansion or generic instantiation should be a module in it's own right: In C this means, there are pre processor symbols exported respectively can be used for module configuration (like -DUSE_PROCESS_QUEUE_DEBUGCODE) you couldn't do that if you used C macros to generate instances.

  • I'm abstracting over element type by moving element size and all relevant operation into a descriptive structure. This will be passed to every invocation of the generic code. Note that the descriptor describes the element type, so a descriptor instance will be needed once per generic instance.

  • I'm using a template processor to create a thin type safe frontend module to the generic code.

Example:

This is the prototype for the generic code to retrieve an element:

void fifo_get ( fifo_DESCRIPTOR* inst, fifo* , void* var );

This is the descriptor type:

typedef struct fifo_DESCRIPTOR {
  size_t maxindex;
  size_t element_size;
} fifo_DESCRIPTOR;

This is the template code in the type safe wrapper template:

<<eT>>  <<>>get  ( <<T>>* f ) { 
   <<eT>> e; fifo_get( &DESCRIPTOR, (fifo*) f, (void*) &e ); return e; 
}

And this is what the template expander (instantiating an generic) produces from the template:

float   floatq_get  ( floatq* f ) { 
    float e; fifo_get( &DESCRIPTOR, (fifo*) f, (void*) &e ); return e; 
}

All this has a nice make integration, but hardly any type safety in instantiation. Every error only crops up when compiling with cc.

I cannot justify at the moment, why to stick with source text templates in C instead of migrating to C++. For me, it was just an experiment.

Regards.

M.E.L.
  • 613
  • 3
  • 8
1

This approach will probably horrify you, but it can be made to work if you don't need any type-specialized logic:

// vector.h
#ifndef VECTOR_H
#define VECTOR_H

#define VECTOR_IMP(itemType) \
   typedef struct {          \
      itemType * array;      \
      int length;            \
      int capacity;          \
   } itemType##_Vector;      \
                             \
   static inline void itemType##_vector_add(itemType##_Vector* v, itemType v) { \
      // implementation of adding an itemType object to the array goes here     \
   }                                                                            \
                                                                                \
   [... other static-inline generic vector methods would go here ...]           \

// Now we can "instantiate" versions of the Vector struct and methods for
// whatever types we want to use.
VECTOR_IMP(int);
VECTOR_IMP(float);
VECTOR_IMP(char);

#endif

... and some example calling code:

#include "vector.h"

int main(int argc, char ** argv)
{
   float_Vector fv = {0};
   int_Vector iv = {0};
   char_Vector cv = {0};

   int_vector_add(&iv, 5);
   float_vector_add(&fv, 3.14f);
   char_vector_add(&cv, 'A');

   return 0;
}
Jeremy Friesner
  • 70,199
  • 15
  • 131
  • 234
1

Using some non-standard GNU C extensions, it is possible to define generic functions with inferred parameter types. This macro defines a nested function in a statement expression and infers the parameter type using typeof:

#include <stdio.h>

#define fib(n1) ({\
        typeof(n1) func(typeof(n1) n){\
            if (n <= 1)\
              return n;\
            return func(n-1) + func(n-2);\
        }\
        func(n1);\
    })

int main()
{
    printf("%d\n",fib(3));
    printf("%f\n",fib(3.0));
    return 0;
}
Anderson Green
  • 30,230
  • 67
  • 195
  • 328
0

Instead of having the vector class store the added object, you could just return a pointer to the location where the caller can store it:

typdef struct {
    char *buffer;
    size_t length;
    size_t capacity;
    size_t type_size;
} Vector;

void *vector_add(Vector* v)
{
    if (v->length == v->capacity) {
        // ... increase capacity by at least one
        // ... realloc buffer to capacity * type_size
    }
    return v->buffer + v->type_size * v->length++;
}

// in main:
*(int*)vector_add(v) = 4;
dpi
  • 1,919
  • 17
  • 17