I was writing code to reinforce my knowledge and got a segmentation fault, so I clearly have a gap in my knowledge to fill. The problem involves strtok(). When I run the first program there is no problem, but the second one gives a segmentation fault. What is my "imperfect knowledge"? Thank you for your answers.

First code

#include <stdio.h>
#include <string.h>

int main() {
    char str[] = "team_name=fenerbahce";
    char *token;

    token = strtok(str, "=");
    while (token != NULL)
    {
        printf("%s\n", token);
        token = strtok(NULL, "=");
    }
    return 0;
}

Second code

#include <stdio.h>
#include <string.h>

int main() {
    char *str = "team_name=fenerbahce";
    char *token;

    token = strtok(str, "=");
    while (token != NULL)
    {
        printf("%s\n", token);
        token = strtok(NULL, "=");
    }
    return 0;
}

3 Answers

3

From the strtok documentation:

This function is destructive: it writes the '\0' characters in the elements of the string str. In particular, a string literal cannot be used as the first argument of strtok.

In the second case, str points to a string literal, which typically resides in read-only memory. Any attempt to modify a string literal leads to undefined behavior.
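
One way to tokenise text that starts out as a string literal is to copy it into a writable buffer first and let strtok modify the copy. The following is only a sketch: the helper name split_copy and its parameters are invented for this illustration and are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name and signature are assumptions for this example).
 * Copies src into the caller's writable buffer, then tokenises the copy.
 * Returns the number of tokens stored, or 0 if src does not fit in buf. */
size_t split_copy(const char *src, const char *delims,
                  char *buf, size_t bufsize,
                  char *tokens[], size_t max_tokens)
{
    assert(src != NULL && delims != NULL && buf != NULL && tokens != NULL);

    size_t len = strlen(src);
    if (len + 1 > bufsize)
        return 0;                  /* no room for the text plus '\0' */
    memcpy(buf, src, len + 1);     /* the literal itself is never written to */

    size_t count = 0;
    for (char *tok = strtok(buf, delims);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delims))
        tokens[count++] = tok;     /* each token points into buf */

    return count;
}
```

Called with "team_name=fenerbahce" and "=", this stores two tokens, and the '=' inside buf has been overwritten with '\0'. That overwrite is exactly the write that must not happen on the literal itself.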

Mahesh
  • Yes, this only compiles due to a (deprecated) conversion to `char*`. You should always use `char const*` instead, as that would have protected you from calling `strtok()` on that string. – Ulrich Eckhardt Dec 27 '14 at 17:52
  • `str = "some string";` doesn't create a `char const *`, but rather a `const char *` – Elias Van Ootegem Dec 27 '14 at 18:02
  • @UlrichEckhardt: That conversion is not deprecated in C. – Keith Thompson Dec 28 '14 at 00:01
  • @Elias Van Ootegem As if there is any difference between those two. – AnArrayOfFunctions Dec 28 '14 at 00:17
  • I didn't actually research whether the conversion is formally deprecated by the C standard, @KeithThompson; I take your word for it that it isn't. It is an ugly wart and a never-ending source of errors for programmers though, which was the main thing I wanted to express. – Ulrich Eckhardt Dec 28 '14 at 10:50
  • Both `const char*` and `char const*` are the same, both are pointers to const char, @Mahesh. A constant pointer to char would be a `char* const`. The general rule is that the const applies to the thing to its left, unless it is the leftmost token, in which case it applies to the thing to its right. Since I don't like the latter exception, I prefer `char const*` for a pointer to const. – Ulrich Eckhardt Dec 28 '14 at 10:53
  • Yeah. I should have paid more attention to the clockwise spiral rule. Thanks. – Mahesh Dec 28 '14 at 15:23
  • @UlrichEckhardt: In C, string literals are of type `char[N]`, which decays to `char*`. There is no conversion (strictly speaking there's a trivial conversion from `char*` to `char*`), so there's nothing to deprecate. In C++, string literals are `const`, so `char *s = "hello";` converts from `const char*` to `char*`; that conversion is deprecated. – Keith Thompson Dec 28 '14 at 20:42
  • String literals do not *necessarily* reside in read-only memory. – Keith Thompson Dec 28 '14 at 20:43
  • String literals are of type `char const[N]`, which in this very special case decays to `char*`. This conversion is dangerous and IMHO should be deprecated. – Ulrich Eckhardt Dec 28 '14 at 23:44
  • @UlrichEckhardt: The question is tagged C, not C++. In C, string literals are of type `char[N]`, *not* `const char[N]` (for historic reasons). I agree that assigning the `char*` that results from the conversion of a string literal to a `char*` object is dangerous. And if you want me to see a comment remember to include my name preceded by an `@` sign. – Keith Thompson Dec 29 '14 at 00:47
1
char *str= "team_name=fenerbahce";
char str[]= "team_name=fenerbahce";

The "imperfect knowledge" is the difference between arrays and pointers, and specifically which memory you are allowed to modify. A string literal in your source is stored somewhere in the program's memory, holding the characters of the string. Below I call this "the memory allocated at the start".

When you create the string as an array, a new array is created and the literal's characters are copied into it, so additional memory is allocated.

When you create the string through a pointer, the pointer simply holds the address of the literal itself (the memory allocated at the start).

You must assume that the memory allocated at the start is not writable. Writing to it is undefined behavior, which in practice often means a segmentation fault, so don't do it. The array's memory, on the other hand, is writable. That is why a function that modifies its argument, such as strtok, works only in the array case.
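
The difference described above can be checked with a small sketch. The function name array_is_private_copy is made up for this illustration; it only shows that the array is a separate, writable copy while the pointer still refers to the unmodified literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical demo function (not from the original answer).
 * Returns 1 if modifying the array leaves the literal untouched. */
int array_is_private_copy(void)
{
    const char *ptr = "team_name=fenerbahce"; /* address of the literal itself */
    char arr[] = "team_name=fenerbahce";      /* writable copy of the literal */

    /* The array holds every character plus the terminating '\0'. */
    assert(sizeof arr == strlen(ptr) + 1);

    arr[9] = '\0';                            /* legal: arr is our own memory */

    return strcmp(arr, "team_name") == 0
        && strcmp(ptr, "team_name=fenerbahce") == 0;
}
```

Declaring ptr as a pointer to const is a deliberate choice here: an accidental write through it, or passing it to strtok, is then diagnosed by the compiler rather than failing at run time.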

Mark
1

String literals are the strings you write between double quotes. For every such literal, no matter where it is used, static ("global") storage is automatically allocated to hold it. When you initialize an array with it, its contents are copied into new memory, that of the array. Otherwise you just store a pointer to its static storage.

So this:

int main()
{
    const char *str= "team_name=fenerbahce";
}

is roughly equivalent to:

const char __unnamed_string[] = { 't', 'e', /* ... */ '\0' };

int main()
{
   const char *str= __unnamed_string;
}

And when initializing an array with the string, like this:

int main()
{
    char str[] = "team_name=fenerbahce";
}

is roughly equivalent to this:

#include <stddef.h>  /* size_t */

const char __unnamed_string[] = { 't', 'e', /* ... */ '\0' };

int main()
{
    char str[sizeof(__unnamed_string)];

    /* copy every character, including the terminating '\0' */
    for (size_t i = 0; i < sizeof(__unnamed_string); ++i)
        str[i] = __unnamed_string[i];
}

As you can see, there is a difference: in the first case you only store a pointer, while in the second you copy the whole string into a local array.

Note: String literals must not be modified, so you should store their address in a pointer to const.

N4296 (a C++ working draft), § 2.13.5, paragraph 8 states:

Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration

The likely reason for this decision is that such arrays can then be placed in read-only segments, which gives the implementation room to optimize the program. For more info about this decision see.

Note1:

N4296, § 2.13.5, paragraph 16 states:

Evaluating a string-literal results in a string literal object with static storage duration, initialized from the given characters as specified above.

Which means exactly what I said: for every string literal, an unnamed static object is created holding its contents.
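
If the address of the literal is kept in a pointer to const, as the note above recommends, the text can still be split without writing to it. The following is a sketch under that assumption; split_key_value and its parameters are hypothetical names chosen for this example, and it uses strchr instead of strtok so the input is only read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. Splits "key<sep>value" without writing to s.
 * Copies the key into key_buf and points *value at the text after sep.
 * Returns 0 on success, -1 if sep is missing or the key does not fit. */
int split_key_value(const char *s, char sep,
                    char *key_buf, size_t key_size,
                    const char **value)
{
    assert(s != NULL && key_buf != NULL && value != NULL);

    const char *p = strchr(s, sep);    /* read-only search for the separator */
    if (p == NULL)
        return -1;

    size_t key_len = (size_t)(p - s);
    if (key_len + 1 > key_size)
        return -1;                     /* key plus '\0' would overflow key_buf */

    memcpy(key_buf, s, key_len);
    key_buf[key_len] = '\0';
    *value = p + 1;                    /* value still lives inside s */
    return 0;
}
```

With "team_name=fenerbahce" and '=', key_buf receives "team_name" and *value points at "fenerbahce" inside the original string, which is left unchanged.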

AnArrayOfFunctions