I modified the code from here, replacing char[50] with char*, as shown in the following code:

#include <stdio.h>
#include <string.h>
int main ()
{
  // change made in following line from char string[50]
  char *string ="Test,string1,Test,string2:Test:string3"; 
  char *p;
  printf ("String  \"%s\" is split into tokens:\n",string);
  p = strtok (string,",:");
  while (p!= NULL)
  {
    printf ("%s\n",p);
    p = strtok (NULL, ",:");
  }
  return 0;
}

However, I get a segmentation fault with the above code.

How can I use the pointer version in the above code?

Also, can a segmentation fault cause damage to data on disk?

Vlad from Moscow
rnso

2 Answers

In this declaration

char *string ="Test,string1,Test,string2:Test:string3"; 

a pointer is defined that points to the first character of the string literal.

You are then trying to use that pointer to change the string literal.

Keep in mind that the standard function strtok modifies the string passed to it: it overwrites the delimiter that ends each token with a terminating null character.

You may not change string literals in C (or C++). Any attempt to change a string literal results in undefined behavior, and a segmentation fault is one common outcome.

Instead of the function strtok, you could use the functions strspn and strcspn to extract tokens. In that case you can process a string literal, because these functions do not modify the strings passed to them.
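
As a minimal sketch of the strspn/strcspn approach (the helper names next_token and print_tokens are made up for this illustration and are not standard functions), the tokens can be located by pointer and length, so the literal is never written to:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the next token in s without modifying s.
   Returns a pointer to the start of the token and stores its length in *len,
   or returns NULL when no tokens remain. */
static const char *next_token(const char *s, const char *delims, size_t *len)
{
    s += strspn(s, delims);        /* skip any leading delimiters */
    if (*s == '\0')
        return NULL;               /* nothing but delimiters was left */
    *len = strcspn(s, delims);     /* length up to the next delimiter or the end */
    return s;
}

/* Hypothetical helper: prints each token on its own line and
   returns the number of tokens found. */
static int print_tokens(const char *s, const char *delims)
{
    int count = 0;
    size_t len;
    const char *tok = s;

    while ((tok = next_token(tok, delims, &len)) != NULL) {
        printf("%.*s\n", (int)len, tok);   /* print exactly len characters */
        tok += len;                        /* continue after this token */
        count++;
    }
    return count;
}
```

Calling print_tokens("Test,string1,Test,string2:Test:string3", ",:") from main works even though the argument is a string literal, because the string is only read. The tokens are not null-terminated here, which is why they are printed with "%.*s" and an explicit length.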

Paul Ogilvie
Vlad from Moscow

To answer your questions, this is why you get a segfault:

There is a fundamental difference between a pointer to a string literal (as declared with char *string="Test,string1,Test,string2:Test:string3";) and a character array, which is what the referenced version declares with char[50].

To provide a different perspective on what is going wrong, here is what happens during compilation.

In both scenarios, the constant string "Test,string1,Test,string2:Test:string3" is typically stored in the read-only data section of the binary. When you use char *string, you are assigning the address of that constant string (in .rodata) to a pointer variable on the stack. When you use char string[50], you are declaring an array of chars as storage on the stack, not a char pointer. The compiler performs this initialization differently from what you might expect: in many cases it emits a call such as memcpy to copy the data into the character array. Something like this:

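char string[50];
/* Conceptual only: the source is a 50-byte (0x32) initializer image, i.e. the literal padded with zeros. */
memcpy(string,"Test,string1,Test,string2:Test:string3",0x32);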

This has the advantage of creating a local stack variable that can be modified by functions such as strtok. However, you cannot use the same functions to modify the original string in the read-only section of the binary. That is the fundamental difference.
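
To connect this back to the question about using a pointer, here is a minimal sketch, assuming it is acceptable to keep a writable array behind the pointer (the function names split_in_place and demo are invented for this illustration). The pointer is fine as long as it points at writable storage rather than at the literal itself:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: tokenizes a writable buffer in place with strtok,
   prints each token, and returns the number of tokens found. */
static int split_in_place(char *buf, const char *delims)
{
    int count = 0;

    for (char *p = strtok(buf, delims); p != NULL; p = strtok(NULL, delims)) {
        printf("%s\n", p);
        count++;
    }
    return count;
}

/* Hypothetical demo: the array is initialized with a copy of the literal,
   so the char * below refers to memory that strtok is allowed to modify. */
static int demo(void)
{
    char storage[] = "Test,string1,Test,string2:Test:string3"; /* writable copy */
    char *string = storage;                                    /* pointer version */

    return split_in_place(string, ",:");
}
```

Calling demo() from main prints the six tokens without a fault. If the string has to outlive the function, copying the literal into memory obtained from malloc would serve the same purpose, at the cost of having to free it.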

Everything else mentioned by @Vlad from Moscow is also relevant.

Next question: can a segmentation fault cause damage to data on disk?

A segmentation fault occurs when an operation (read, write, execute) is attempted in a memory segment that does not allow that operation. Most often, this comes from attempting to read from or write to the location referenced by an invalid pointer. This is entirely a runtime concept. The fault is contained within the virtual memory of the process. Generally speaking, it will not harm secondary storage. There may be edge cases (a segfault that occurs after data has been only partially written to a file) where a file in secondary storage ends up corrupted, but the example you have shown is not such a case. To summarize, your disk should be fine unless a segfault occurs in the middle of a write to disk.

h0r53
  • Thanks for a well explained answer. It did happen to me once many years back and disk data got severely corrupted (that program had disk write also). I did not touch C for all these years! – rnso Jun 19 '19 at 14:20