Let's do the right thing, and use a structure to describe a dynamically allocated, grow-as-needed string:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
struct mystring {
    char   *ptr; /* The actual string */
    size_t  len; /* The length of the string, not counting the terminating '\0' */
    size_t  max; /* Number of chars allocated for ptr */
};
#define MYSTRING_INIT { NULL, 0, 0 }
If we want to append something to a struct mystring, we define a function that takes a pointer to the structure, so that the function can modify it. (If it only needed a char pointer instead of a structure, it'd take a char **, a pointer to a char pointer.)
void mystring_append(struct mystring *ms, const char *s)
{
    const size_t slen = (s) ? strlen(s) : 0;

    /* Make sure ms points to a struct mystring; is not NULL */
    if (!ms) {
        fprintf(stderr, "mystring_append(): No struct mystring specified; ms == NULL!\n");
        exit(EXIT_FAILURE);
    }

    /* Make sure we have enough memory allocated for the data */
    if (ms->len + slen >= ms->max) {
        const size_t max = ms->len + slen + 1;
        char *ptr;

        ptr = realloc(ms->ptr, max);
        if (!ptr) {
            fprintf(stderr, "mystring_append(): Out of memory!\n");
            exit(EXIT_FAILURE);
        }
        ms->max = max;
        ms->ptr = ptr;
    }

    /* Append. */
    if (slen > 0) {
        memmove(ms->ptr + ms->len, s, slen);
        ms->len += slen;
    }

    /* We allocated one char extra for the
       string-terminating nul byte, '\0'. */
    ms->ptr[ms->len] = '\0';

    /* Done! */
}
The expression (s) ? strlen(s) : 0 uses the ?: conditional operator. Essentially, if s is non-NULL, the expression evaluates to strlen(s); otherwise it evaluates to 0. You could use
    size_t slen;
    if (s != NULL)
        slen = strlen(s);
    else
        slen = 0;
instead; I just like the concise const size_t slen = (s) ? strlen(s) : 0; form better. (The const tells the compiler that the slen variable is not going to be modified. While it might help the compiler generate better code, it is mostly a hint to other programmers that slen will have this particular value all through this function, so they do not need to check if it might be modified somewhere. It helps code maintenance in the long term, so it is a very good habit to get into.)
Normally, functions return a value that indicates success or error. For ease of use, mystring_append() does not return anything. If there is an error, it prints an error message to standard error, and stops the program.
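If you would rather let the caller decide what to do on failure, the same logic can report errors through its return value. The following is only a sketch of that idea, not part of the code above: the name mystring_append_checked and the 0 / -1 return convention are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

struct mystring {
    char   *ptr;
    size_t  len;
    size_t  max;
};
#define MYSTRING_INIT { NULL, 0, 0 }

/* Hypothetical variant of mystring_append():
   returns 0 on success, -1 on failure.
   On failure, the structure is left unchanged. */
int mystring_append_checked(struct mystring *ms, const char *s)
{
    const size_t slen = (s) ? strlen(s) : 0;

    /* No structure to append to. */
    if (!ms)
        return -1;

    /* Make sure there is room for the data and the terminating '\0'. */
    if (ms->len + slen >= ms->max) {
        const size_t max = ms->len + slen + 1;
        char *ptr = realloc(ms->ptr, max);
        if (!ptr)
            return -1; /* Out of memory; ms->ptr is still valid. */
        ms->max = max;
        ms->ptr = ptr;
    }

    if (slen > 0) {
        memmove(ms->ptr + ms->len, s, slen);
        ms->len += slen;
    }
    ms->ptr[ms->len] = '\0';

    return 0;
}
```

A caller would then check the result, for example by testing whether mystring_append_checked(&message, "Hello, ") returned nonzero, and print its own error message or recover in whatever way suits the program.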
It is a good practice to create a function that releases any dynamic memory used by such a structure. For example,
void mystring_free(struct mystring *ms)
{
    if (ms) {
        free(ms->ptr);
        ms->ptr = NULL;
        ms->len = 0;
        ms->max = 0;
    }
}
Often, you see initialization functions as well, like
void mystring_init(struct mystring *ms)
{
    ms->ptr = NULL;
    ms->len = 0;
    ms->max = 0;
}
but I prefer initialization macros like MYSTRING_INIT, defined earlier.
You can use the above in a program like this:
int main(void)
{
    struct mystring message = MYSTRING_INIT;

    mystring_append(&message, "Hello, ");
    mystring_append(&message, "world!");
    printf("message = '%s'.\n", message.ptr);
    mystring_free(&message);

    return EXIT_SUCCESS;
}
Notes:
When we declare a variable of the structure type (and not as a pointer to the structure, i.e. no *), we use a dot (.) between the variable name and the field name. In main(), we have struct mystring message; so we use message.ptr to refer to the char pointer in the message structure.
When we declare a variable as a pointer to a structure type (as in the functions, with * before the variable name), we use an arrow (->) between the variable name and the field name. For example, in mystring_append() we have struct mystring *ms, so we use ms->ptr to refer to the char pointer in the structure pointed to by the ms variable.
Dynamic memory management is not difficult. realloc(NULL, size) is equivalent to malloc(size), and free(NULL) is safe (does nothing).
In the above function, we just need to keep track of both the current length and the number of chars allocated for the dynamic buffer pointed to by the field ptr, and remember that a string needs that terminating nul byte, '\0', which is not counted in its length.
The above function reallocates only just enough memory for the additional string. In practice, extra memory is often allocated, so that the number of reallocations needed is kept to a minimum. (This is because memory allocation/reallocation functions are considered expensive, or slow, compared to other operations.) That is a topic for another occasion, though.
If we want a function to be able to modify a variable (of any type, even a structure) in the caller's scope (struct mystring message; in main() in the above example), the function needs to take a pointer to a variable of that type, and modify the value via the pointer.
The address-of operator, &, takes the address of some variable. In particular, &message in the above example evaluates to a pointer to a struct mystring.
If we write struct mystring *ref = &message; with struct mystring message; then message is a variable of struct mystring type, and ref is a pointer to message; ref being of struct mystring * type.
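Returning to the note about allocating extra memory to reduce the number of reallocations: one common approach is to grow the buffer geometrically. The following is a rough sketch under my own assumptions, not a definitive implementation. The helper names mystring_reserve and mystring_append_grow, the initial size of 16 chars, and the doubling factor are all choices made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct mystring {
    char   *ptr;
    size_t  len;
    size_t  max;
};
#define MYSTRING_INIT { NULL, 0, 0 }

/* Hypothetical helper: make sure there is room for at least
   'need' chars plus the terminating '\0'.
   Returns 0 on success, -1 on failure. */
int mystring_reserve(struct mystring *ms, size_t need)
{
    size_t max;
    char  *ptr;

    if (!ms || need == SIZE_MAX)
        return -1;
    if (need < ms->max)
        return 0; /* Already enough room. */

    /* Start at 16 chars (an arbitrary choice), then keep doubling. */
    max = (ms->max > 0) ? ms->max : 16;
    while (max <= need) {
        if (max > SIZE_MAX / 2) {
            max = need + 1; /* Doubling would overflow; allocate exactly. */
            break;
        }
        max *= 2;
    }

    ptr = realloc(ms->ptr, max);
    if (!ptr)
        return -1;
    ms->ptr = ptr;
    ms->max = max;
    return 0;
}

/* Hypothetical append that uses the growth policy above. */
int mystring_append_grow(struct mystring *ms, const char *s)
{
    const size_t slen = (s) ? strlen(s) : 0;

    if (!ms || slen > SIZE_MAX - ms->len)
        return -1;
    if (mystring_reserve(ms, ms->len + slen) != 0)
        return -1;

    if (slen > 0) {
        memmove(ms->ptr + ms->len, s, slen);
        ms->len += slen;
    }
    ms->ptr[ms->len] = '\0';
    return 0;
}
```

With doubling, appending many short strings triggers a reallocation only when the buffer is full, so the number of realloc() calls grows roughly with the logarithm of the final length rather than with the number of appends. The cost is that up to about half of the buffer may be unused at any given time.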