I was puzzled about why strtok() modifies the original string. Then someone told me that strtok_r() doesn't do that. But when I tested it, I found no difference: strtok_r() also modifies the original string. After the calls to strtok_r(), the original string has changed.

Also, what is the purpose of the variable char* rest = str?

Here is my code.

#include <stdio.h>
#include <string.h>

int main(void) {

        char str[] = "Tom Jerry Hary Potter";
        char* token;
        char* rest = str;

        printf("\n\nOriginal String before while loop: %s", str);
        printf("\n\nOriginal String length before while loop: %zu\n\n", strlen(str));

        while ((token = strtok_r(rest, " ", &rest)))
                printf(" %s :", token);

        printf("\n\nOriginal String after while loop: %s", str);
        printf("\n\nOriginal String length after while loop: %zu\n\n", strlen(str));

        return (0);
}

Output

Original String before while loop: Tom Jerry Hary Potter

Original String length before while loop: 21

 Tom : Jerry : Hary : Potter :

Original String after while loop: Tom

Original String length after while loop: 3
Aniket Tiratkar
  • It seems that someone is wrong. Read this: https://linux.die.net/man/3/strtok_r – Jabberwocky Nov 13 '20 at 12:46
  • https://stackoverflow.com/questions/15961253/c-correct-usage-of-strtok-r – Retired Ninja Nov 13 '20 at 12:48
  • Its *purpose* is to modify the original string. It differs by being thread-safe, i.e. the caller stores the current state. – Weather Vane Nov 13 '20 at 12:50
  • ... and `char* rest` is used to store the state between successive calls. It is unnecessary to initialise it to `str` because *"On the first call to `strtok_r()`, `str` should point to the string to be parsed, and **the value of `saveptr` is ignored**. "* – Weather Vane Nov 13 '20 at 13:03
  • Does this answer your question? [What's the difference between strtok and strtok\_r in C?](https://stackoverflow.com/questions/22210546/whats-the-difference-between-strtok-and-strtok-r-in-c) – anastaciu Nov 13 '20 at 13:06
  • @anastaciu - It does, and what Weather Vane said also makes sense. Just to understand: why does the original string contain only the first word after successive calls to strtok_r()? Is it because strtok_r() inserts '\0' at each occurrence of the delimiter? The original string is of no use after the calls. Can we restore the original string? – Niranjan Das Nov 13 '20 at 13:19
  • @NiranjanDas the difference is reentrancy and reusability: strtok_r is safe to use in multithreaded code and can tokenize different strings at the same time, unlike strtok. It is therefore safer and more usable, but nothing in its specification stops it from changing the original string. You can solve this problem by copying the string and tokenizing the copy. – anastaciu Nov 13 '20 at 14:02

1 Answer

I was stuck on strtok() why it modifies the original string... But then someone told strtok_r() won't do that... While I test it out I find no change. strtok_r() also modifies the original string. At the end of the call to strtok_r() original string has changed.

What you were told is not true. strtok_r(3) does modify the original string, just as strtok(3) does. What it does differently is accept a pointer to a char * from you, in which it stores the information it needs between calls (the position where it stopped after the last token). Because that state is kept by the caller rather than in static storage, the function is reentrant and can safely be used from different threads.

strtok() is an old function that dates back to the early days of C. It modifies the original string in place so that it does not need dynamically allocated memory, which makes it faster. If you want to preserve the original string, you can call strdup(3) on it, operate on the copy, and release the memory with free(3) when you are finished.

Even if you use strtok(3) on a locally allocated buffer, it is still non-reentrant, because it uses static data to remember its position in the string it is parsing, so that the next call can continue where the previous one stopped. If you want to use strtok(3) in nested loops, or call it from a recursive procedure, you have to switch to the _r suffixed version; otherwise the strtok() calls in the inner loop will overwrite the position saved for the outer loop. For example:

char buffer[] = "root:0:0:x:System Administrator,Administrators Dept.,5742,+300 40 86 26:/root:/bin/sh";
char *p, *q;
/* look for password fields */
for (p = strtok(buffer, ":"); p; p = strtok(NULL, ":")) {
    /* separate items with commas */
    for (q = strtok(p, ","); q; q = strtok(NULL, ",")) {
        .... /* the inner strtok() calls replaced the position that the
              * outer loop's strtok(NULL, ":") needs to continue from,
              * so the outer loop cannot resume correctly. */
    }
}

You need to change it to:

char *p1;
for (p = strtok_r(buffer, ":", &p1); p; p = strtok_r(NULL, ":", &p1)) {
    char *p2;
    for (q = strtok_r(p, ",", &p2); q; q = strtok_r(NULL, ",", &p2)) {
        .... /* nothing breaks, as long as p1 and p2 are different variables. */
    }
}

An example using strdup() could be:

char *p1, *working_string = strdup(original_string);
for (p = strtok_r(working_string, ":", &p1); p; p = strtok_r(NULL, ":", &p1)) {
    char *p2;
    for (q = strtok_r(p, ",", &p2); q; q = strtok_r(NULL, ",", &p2)) {
        .... /* nothing breaks, if p1 and p2 are different variables. */
    }
}
...
free(working_string); /* or whenever you feel it can be freed */
Luis Colorado