This part is confusing me a bit. When I run the program and use "one two three" as input, the first print of the buffer is fine. Once I use strtok and try to print the buffer again, it only prints "one" and not "one two three". Have I changed the content of the buffer without knowing? If so, is there a way to prevent that? Thank you for your time!

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *buffer;
    size_t buffer_size = 64;
    buffer = (char *) malloc(64 * sizeof(char));
    getline(&buffer, &buffer_size, stdin);
    char *anotherbuffer;
    printf("%s\n", buffer);
    anotherbuffer = (char *) malloc(64 * sizeof(char));
    anotherbuffer = strtok(buffer, " ");
    anotherbuffer = strtok(NULL, " ");
    printf("The buffer \"buffer\" contains %s\n", buffer);
    printf("The buffer \"anotherbuffer\" contains %s\n", anotherbuffer);
    return 0;
}
```

  • OT: After the 2nd `malloc()` has returned a pointer to a 2nd buffer, this code immediately overwrites that heap address (memory leak)... – Fe2O3 Oct 19 '22 at 21:27

1 Answer

If you look at the description of strtok (for example on cppreference.com), you will find that it writes a '\0' byte over the separator that ends each token it returns: the first separator on the first call, and the next one on each subsequent call. That is why, when you print your buffer after calling strtok, the C string in it appears shorter. The remaining characters are still there, but as far as printf's %s (and other string functions such as strlen) are concerned, the string now ends where the first separator used to be, because that position holds a '\0'.

To avoid modifying your buffer, you need to use different functions. For example, you could use strpbrk to find the address of the first separator after a given starting address, or strcspn to get the length up to that separator. From there you can move forward by using the new address as the starting point for the next call. To extract a token into its own C string, you would have to copy it out of the original buffer into a new one and terminate it with a '\0' there.

JayK
  • Ohh I see! Thank you very much for the quick answer. To your knowledge, is there a better way to cut a string into tidy little user-defined pieces without having to go through the \0 hassle? – user3745168 Oct 19 '22 at 20:35
  • I have added a paragraph with one (or two) possibilities for navigating the original string. Basically, without a dedicated library function that tokenizes a string for you without modifying the original, some of the work will be left in your hands, unfortunately. – JayK Oct 19 '22 at 20:39