An approach to split a string of unknown number of words and make them available in return from a function would require a function that returns a pointer-to-pointer-to-char. This allows a true dynamic approach where you allocate some initial number of pointers (say 2, 4, 8
, etc..) make a single pass through your string using strtok
keeping track of the number of pointers used, allocating storage fro each token (word) as you go and when the number of pointers used equals the number allocated, you simply realloc
storage for additional pointers and keep going.
A short example implementing the function splitstring()
that does that could look similar to the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NPTR 8 /* initial number of pointers to allocate */
#define MAXD 32 /* maximum no chars for delimiter */
#define MAXC 1024 /* maximum no chars for user input */
char **splitstring (const char *str, const char *delim, size_t *nwords)
{
size_t nptr = NPTR, /* initial pointers */
slen = strlen (str); /* length of str */
char **strings = malloc (nptr * sizeof *strings), /* alloc pointers */
*cpy = malloc (slen + 1), /* alloc for copy of str */
*p = cpy; /* pointer to cpy */
*nwords = 0; /* zero nwords */
if (!strings) { /* validate allocation of strings */
perror ("malloc-strings");
free (cpy);
return NULL;
}
if (!cpy) { /* validate allocation of cpy */
perror ("malloc-cpy");
free (strings);
return NULL;
}
memcpy (cpy, str, slen + 1); /* copy str to cpy */
/* split cpy into tokens */
for (p = strtok (p, delim); p; p = strtok (NULL, delim)) {
size_t len; /* length of token */
if (*nwords == nptr) { /* all pointers used/realloc needed? */
void *tmp = realloc (strings, 2 * nptr * sizeof *strings);
if (!tmp) { /* validate reallocation */
perror ("realloc-strings");
if (*nwords) /* if words stored, return strings */
return strings;
else { /* no words, free pointers, return NULL */
free (strings);
return NULL;
}
}
strings = tmp; /* assign new block to strings */
nptr *= 2; /* update number of allocate pointers */
}
len = strlen (p); /* get token length */
strings[*nwords] = malloc (len + 1); /* allocate storage */
if (!strings[*nwords]) { /* validate allocation */
perror ("malloc-strings[*nwords]");
break;
}
memcpy (strings[(*nwords)++], p, len + 1); /* copy to strings */
}
free (cpy); /* free storage of cpy of str */
if (*nwords) /* if words found */
return strings;
free (strings); /* no strings found, free pointers */
return NULL;
}
int main (void) {
char **strings = NULL,
string[MAXC],
delim[MAXD];
size_t nwords = 0;
fputs ("enter string : ", stdout);
if (!fgets (string, MAXC, stdin)) {
fputs ("(user canceled input)\n", stderr);
return 1;
}
fputs ("enter delimiters: ", stdout);
if (!fgets (delim, MAXD, stdin)) {
fputs ("(user canceled input)\n", stderr);
return 1;
}
if ((strings = splitstring (string, delim, &nwords))) {
for (size_t i = 0; i < nwords; i++) {
printf (" word[%2zu]: %s\n", i, strings[i]);
free (strings[i]);
}
free (strings);
}
else
fputs ("error: no delimiter found\n", stderr);
}
(note: the word count nwords
is passed as a pointer to the splitstring()
function to allow the number of words to be updated within the function and made available back in the calling function, while returning a pointer-to-pointer-to-char from the function itself)
Example Use/Output
$ ./bin/stringsplitdelim
enter string : my dog has fleas and my cat has none and snakes don't have fleas
enter delimiters:
word[ 0]: my
word[ 1]: dog
word[ 2]: has
word[ 3]: fleas
word[ 4]: and
word[ 5]: my
word[ 6]: cat
word[ 7]: has
word[ 8]: none
word[ 9]: and
word[10]: snakes
word[11]: don't
word[12]: have
word[13]: fleas
(note: a ' '
(space) was entered as the delimiter above resulting in delim
containing " \n"
(exactly what you want) by virtue of having used the line-oriented input function fgets
for user input)
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind
is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/stringsplitdelim
==12635== Memcheck, a memory error detector
==12635== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==12635== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==12635== Command: ./bin/stringsplitdelim
==12635==
enter string : my dog has fleas and my cat has none and snakes don't have fleas
enter delimiters:
word[ 0]: my
word[ 1]: dog
word[ 2]: has
word[ 3]: fleas
word[ 4]: and
word[ 5]: my
word[ 6]: cat
word[ 7]: has
word[ 8]: none
word[ 9]: and
word[10]: snakes
word[11]: don't
word[12]: have
word[13]: fleas
==12635==
==12635== HEAP SUMMARY:
==12635== in use at exit: 0 bytes in 0 blocks
==12635== total heap usage: 17 allocs, 17 frees, 323 bytes allocated
==12635==
==12635== All heap blocks were freed -- no leaks are possible
==12635==
==12635== For counts of detected and suppressed errors, rerun with: -v
==12635== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.