When you use dynamically allocated memory and adjust its size, it is important to keep track of exactly how many elements you have allocated memory for.
I personally like to keep the number of elements in use in a variable named used, and the number of elements I have allocated memory for in size. For example, I might create a structure for describing one-dimensional arrays of doubles:
#include <stdio.h>   /* fprintf() */
#include <stdlib.h>  /* realloc(), free(), exit(), size_t, NULL */

typedef struct {
    size_t  size;   /* Number of doubles allocated for */
    size_t  used;   /* Number of doubles in use */
    double *data;   /* Dynamically allocated array */
} double_array;
#define DOUBLE_ARRAY_INIT { 0, 0, NULL }
I like to explicitly initialize my dynamically allocated memory pointers to NULL, and their respective sizes to zero, so that I only need to use realloc(). This works because realloc(NULL, size) is exactly equivalent to malloc(size). I also often utilize the fact that free(NULL) is safe, and does nothing.
I would probably write a couple of helper functions. Perhaps a function that ensures there is room for at_least entries in the array:
void double_array_resize(double_array *ref, size_t at_least)
{
    if (ref->size < at_least) {
        void *temp;
        temp = realloc(ref->data, at_least * sizeof ref->data[0]);
        if (!temp) {
            fprintf(stderr, "double_array_resize(): Out of memory (%zu doubles).\n", at_least);
            exit(EXIT_FAILURE);
        }
        ref->data = temp;
        ref->size = at_least;
    }
    /* We could also shrink the array if
       at_least < ref->size, but usually
       this is not needed/useful/desirable. */
}
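If you did want to shrink the array, a minimal sketch might look like the following. The name double_array_shrink() is made up for illustration and is not part of the helpers above. The structure is repeated only so that the snippet compiles on its own, and the sketch assumes the used field is kept up to date.

```c
#include <stdlib.h>

typedef struct {
    size_t  size;   /* Number of doubles allocated for */
    size_t  used;   /* Number of doubles in use */
    double *data;   /* Dynamically allocated array */
} double_array;

/* Hypothetical helper: give back the allocated but unused tail. */
static void double_array_shrink(double_array *ref)
{
    /* Nothing to do if there is no array or no unused room. */
    if (!ref || ref->used >= ref->size)
        return;

    if (ref->used == 0) {
        /* realloc(ptr, 0) behaves differently across
           implementations, so free the buffer instead. */
        free(ref->data);
        ref->data = NULL;
        ref->size = 0;
        return;
    }

    void *temp = realloc(ref->data, ref->used * sizeof ref->data[0]);
    if (temp) {
        ref->data = temp;
        ref->size = ref->used;
    }
    /* If realloc() fails, the old (larger) buffer is still valid,
       so we simply keep it; shrinking is only an optimization. */
}
```

Unlike growing, a failed shrink is not an error worth exiting over, which is why this sketch silently keeps the old buffer.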
I would definitely write a helper function that not only frees the memory used, but also updates the fields to reflect that, so that it is completely safe to call double_array_resize() after freeing:
void double_array_free(double_array *ref)
{
    if (ref) {
        free(ref->data);
        ref->size = 0;
        ref->used = 0;
        ref->data = NULL;
    }
}
Here is how a program might use the above.
int main(void)
{
    double_array stuff = DOUBLE_ARRAY_INIT;
    /* ... Code and variables omitted ... */
    if (some_condition) {
        double_array_resize(&stuff, 321);
        /* stuff.data[0] through stuff.data[320]
           are now accessible (dynamically allocated) */
    }
    /* ... Code and variables omitted ... */
    if (weird_condition) {
        /* For some reason, we want to discard the
           possibly dynamically allocated buffer */
        double_array_free(&stuff);
    }
    /* ... Code and variables omitted ... */
    if (other_condition) {
        double_array_resize(&stuff, 48361242);
        /* stuff.data[0] through stuff.data[48361241]
           are now accessible. */
    }
    double_array_free(&stuff);
    return EXIT_SUCCESS;
}
If I wanted to use the double_array as a stack, I might do:
void double_array_clear(double_array *ref)
{
    if (ref)
        ref->used = 0;
}
void double_array_push(double_array *ref, const double val)
{
    if (ref->used >= ref->size) {
        /* Allocate, say, room for 100 more! */
        double_array_resize(ref, ref->used + 100);
    }
    ref->data[ref->used++] = val;
}
double double_array_pop(double_array *ref, const double errorval)
{
    if (ref->used > 0)
        return ref->data[--ref->used];
    else
        return errorval; /* Stack was empty! */
}
The above double_array_push() reallocates room for 100 more doubles whenever the array runs out of room. However, if you pushed millions of doubles, this would mean tens of thousands of realloc() calls, which is usually considered wasteful. Instead, we usually apply a reallocation policy that grows the size proportionally to the existing size.
My preferred policy is something like this (pseudocode):
If (elements in use) < LIMIT_1 Then
    Resize to LIMIT_1
Else If (elements in use) < LIMIT_2 Then
    Resize to (elements in use) * FACTOR
Else
    Resize to (elements in use) + LIMIT_2
End If
LIMIT_1 is typically a small number, the minimum size ever allocated. LIMIT_2 is typically a large number, something like 2^20 (one million plus change), so that at most LIMIT_2 unused elements are ever allocated. FACTOR is between 1 and 2; many suggest 2, but I prefer 3/2.
The goal of the policy is to keep the number of realloc() calls at an acceptable (unnoticeable) level, while keeping the amount of allocated but unused memory low.
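The policy could be sketched in C roughly as follows. The concrete values for LIMIT_1 (16) and LIMIT_2 (2^20), and the helper names next_size() and count_reallocs(), are assumptions picked for illustration only; FACTOR is 3/2, computed in integer arithmetic.

```c
#include <stddef.h>

#define LIMIT_1  ((size_t)16)        /* Assumed minimum allocation, in elements */
#define LIMIT_2  ((size_t)1 << 20)   /* Assumed cap on unused elements (2^20) */

/* Hypothetical helper: given the number of elements in use,
   return the number of elements to reallocate room for.
   The result is always larger than 'used'. */
static size_t next_size(size_t used)
{
    if (used < LIMIT_1)
        return LIMIT_1;
    else if (used < LIMIT_2)
        return used + used / 2;      /* FACTOR = 3/2 */
    else
        return used + LIMIT_2;
}

/* Hypothetical helper: simulate pushing 'count' elements one at
   a time, and return how many reallocations the policy triggers.
   No memory is actually allocated here. */
static size_t count_reallocs(size_t count)
{
    size_t size = 0, used = 0, calls = 0;

    while (used < count) {
        if (used >= size) {
            size = next_size(used);
            calls++;
        }
        used++;
    }
    return calls;
}
```

In a push function, you would call something like double_array_resize(ref, next_size(ref->used)) instead of adding a fixed 100. With these assumed values, count_reallocs(1000000) comes out at roughly thirty, whereas growing by a fixed 100 elements needs ten thousand reallocations for the same million pushes.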
The final note is that you should only try to keep a dynamically allocated buffer around if you reuse it for the same (or a very similar) purpose. If you need an array of a different type, and don't need the earlier one, just free() the earlier one and malloc() a new one (or let realloc() in the helpers do it). The C library will try to reuse the same memory anyway.
On current desktop machines, something like a hundred or a thousand malloc() or realloc() calls is probably unnoticeable compared to the start-up time of the program. So, it is not that important to minimize the number of those calls. What you want to do is keep your code easy to maintain and adapt, so logical reuse and good variable and type names are important.
The most typical case where I reuse a buffer is when I read text input line by line. I use the POSIX.1 getline() function to do so:
char *line = NULL;
size_t size = 0;
ssize_t len; /* Not 'used' in this particular case! :) */
while (1) {
    len = getline(&line, &size, stdin);
    if (len < 1)
        break;
    /* Have 'len' chars in 'line'; may contain '\0'! */
}
if (ferror(stdin)) {
    fprintf(stderr, "Error reading standard input!\n");
    exit(EXIT_FAILURE);
}
/* Since the line buffer is no longer needed, free it. */
free(line);
line = NULL;
size = 0;
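To show the same buffer reuse inside a function, here is a minimal sketch that finds the length of the longest line in a stream. The function name longest_line() is hypothetical, and the feature test macro at the top assumes a POSIX.1-2008 system that provides getline().

```c
#define _POSIX_C_SOURCE 200809L  /* Request getline(); assumes POSIX.1-2008 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: return the length of the longest line in
   'in', not counting the trailing newline, or -1 on a read error. */
static long longest_line(FILE *in)
{
    char    *line = NULL;
    size_t   size = 0;
    ssize_t  len;
    long     longest = 0;

    /* The same 'line' buffer is reused for every line;
       getline() grows it with realloc() only when needed. */
    while ((len = getline(&line, &size, in)) > 0) {
        if (line[len - 1] == '\n')
            len--;               /* Do not count the newline */
        if ((long)len > longest)
            longest = (long)len;
    }

    /* getline() may have allocated a buffer even if it
       failed, so the buffer is always freed here. */
    free(line);

    return ferror(in) ? -1 : longest;
}
```

The buffer is allocated at most a handful of times no matter how many lines the input has, because getline() only reallocates when a line does not fit in the existing buffer.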