4

I need to convert a string to a char * for use in strtok_s and have been unable to figure it out. c_str() converts to a const char *, which is incompatible.

Also, if someone could explain to me why the second strtok_s call (inside the loop) is necessary, it'd be a great help. Why do I need to advance the token explicitly, when the while loop it sits in fetches each line of the file consecutively without any explicit step?

while( getline(myFile, line) ) { // Only one line anyway. . . is there a better way?
    char * con = line.c_str();
    token = strtok_s( con, "#", &next_token);
    while ((token != NULL))
    {
        printf( " %s\n", token );
        token = strtok_s( NULL, "#", &next_token);
    }
}

related question.

Nona Urbiz
  • Why are you using strtok() on C++ strings? C++ has better facilities for that sort of thing. – Avdi Oct 07 '09 at 15:33
  • 1
    because i know no better. what do you suggest? – Nona Urbiz Oct 07 '09 at 15:36
  • 1
    See http://stackoverflow.com/questions/53849/how-do-i-tokenize-a-string-in-c/55680#55680 for example code using Boost. – Bill Oct 07 '09 at 15:38
  • but why is that better? won't a library incur overhead? – Nona Urbiz Oct 07 '09 at 15:40
  • Pretty much everything in Boost is implemented as templates, so only the code that you actually use is included. – Martin B Oct 07 '09 at 15:45
  • 3
    By the way, C++ strings can have NUL characters in the middle of them, since C++ defines strings in terms of some bytes and a length, rather than C's "sequence of bytes terminated with NUL". So if all you know about the input is that it's a C++ string, C functions like `strtok` actually don't work, because they might falsely detect what they think is the end of the string, before the actual end. – Steve Jessop Oct 07 '09 at 18:40

10 Answers

8

Use strdup() to copy the const char * returned by c_str() into a char * (remember to free() it afterwards).

Note that strdup() and free() are C, not C++, functions and you'd be better off using methods of std::string instead.

The second strtok_s() is needed because otherwise your loop won't terminate (token's value won't change).
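A minimal sketch of the copy-then-tokenize approach described above. It assumes a POSIX-like platform where strdup() is available, and it uses the standard std::strtok in place of strtok_s so that it is not tied to one compiler. The names tokenize_copy and FreeDeleter are hypothetical, chosen for this example.

```cpp
#include <cassert>
#include <cstdlib>   // std::free
#include <cstring>   // std::strtok; strdup() is POSIX, not standard C++
#include <memory>    // std::unique_ptr
#include <string>
#include <vector>

// Hypothetical deleter: strdup() allocates with malloc(), so release with free().
struct FreeDeleter
{
    void operator()(char* p) const { std::free(p); }
};

// Hypothetical helper: splits `line` on any character found in `delims`.
std::vector<std::string> tokenize_copy(const std::string& line, const char* delims)
{
    assert(delims != nullptr);
    std::vector<std::string> tokens;

    // Writable copy of the read-only data returned by c_str().
    std::unique_ptr<char, FreeDeleter> copy(strdup(line.c_str()));
    if (!copy)
        return tokens;   // allocation failed

    char* token = std::strtok(copy.get(), delims);   // first call: pass the buffer
    while (token != nullptr)
    {
        tokens.push_back(token);
        token = std::strtok(nullptr, delims);        // without this, token never changes
    }
    return tokens;
}
```

Calling tokenize_copy(line, "#") inside the getline loop would then give one std::string per token. Note that std::strtok keeps hidden state between calls, so this sketch is not suitable for multithreaded use.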

Wernsey
  • but why do i need to explicitly advance the token rather than for example, the while loop it is in, which fetches each line of a file consecutively, implicitly? – Nona Urbiz Oct 07 '09 at 15:53
  • Look at the code again: The first invocation of strtok() gets the first token from the line from the file. Then the while()'s condition checks if token is NULL. If it's not, the printf() is executed, and the next token is extracted. What confuses you is probably the fact that the variable next_token does not in fact store the next token, but rather the remainder of the line. That's just the way strtok_s() works. – Wernsey Oct 07 '09 at 16:12
  • `strdup` isn't standard C. It is simply a common extension, but by no means guaranteed to be present. – Evan Teran Oct 07 '09 at 22:58
5

As Daniel said, you could go with

strdup(line.c_str());

This is better than the strcpy I originally proposed, since it allocates the necessary space.

Gab Royer
5

You can't convert to a char * because that would allow you to write to std::string's internal buffer. To avoid making std::string's implementation visible, this isn't allowed.

Instead of strtok, try a more "C++-like" way of tokenizing strings. See this question:

How do I tokenize a string in C++?
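One such C++-style approach, sketched here with only the standard library: std::getline accepts a delimiter character, so a std::istringstream over the line can be read one '#'-separated field at a time. The name split_fields is hypothetical. Unlike strtok, this version keeps empty fields between adjacent delimiters.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` at each occurrence of `delimiter`.
// The input is only read, never modified, so no char* is needed.
std::vector<std::string> split_fields(const std::string& line, char delimiter)
{
    std::vector<std::string> fields;
    std::istringstream stream(line);
    std::string field;

    // Each getline call extracts characters up to the next delimiter
    // (or the end of the input) and discards the delimiter itself.
    while (std::getline(stream, field, delimiter))
        fields.push_back(field);

    return fields;
}
```

In the question's loop this would replace both strtok_s calls with a single call such as split_fields(line, '#').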

Martin B
  • it seems hard to believe that this cast is simply impossible. – Nona Urbiz Oct 07 '09 at 15:34
  • 1
    The cast itself is possible using const_cast, but not at all suggested. – MP24 Oct 07 '09 at 15:36
  • It *is* impossible. The reason is that in object-oriented programming, objects don't like external clients accessing their internal representation directly. See http://en.wikipedia.org/wiki/Information_hiding – Martin B Oct 07 '09 at 15:37
  • 4
    You'll just have to believe it. The cast from const to non-const pointer is possible, but attempting to modify the data pointed to has undefined behavior. `c_str()` is not required to return the string's internal buffer - it could copy the string out to a new location and show you that. Obviously modifying any such clone of the original string would not work. In C++0x, the implementation of string is more tightly controlled, and IIRC you will be able to use `&line[0]` as a `char*` pointing to the string data. Although that might not be NUL-terminated. – Steve Jessop Oct 07 '09 at 15:45
2

strtok() is a badly designed function to begin with. Check your documentation to see if you have a better one. BTW, never use strtok() in any sort of threaded environment unless your docs specifically say it's safe, since it stores state in between calls and modifies the string it's called on. I assume strtok_s() is a safer version, but it's not going to be a really safe one.

To convert a std::string into the char *, you can do:

char * temp_line = new char[line.size() + 1];  // +1 char for '\0' terminator
strcpy(temp_line, line.c_str());

and use temp_line. Your installation may have a strdup() function, which will duplicate the above.

The reason you need two calls to strtok_s() is that they do different things. The first one tells strtok_s() what string it needs to work on, and the second one continues with the same string. That's the reason for the NULL argument; it tells strtok_s() to keep going with the original string.

Therefore, you need one call to get the first token, and then one for each subsequent token. They could be combined with something like

char * temp_string_pointer = temp_line;
while ((token = strtok_s( temp_string_pointer, "#", &next_token)) != NULL)
{
   temp_string_pointer = NULL;

and so on, since that would call strtok_s() once with the string pointer and after that with NULL. Don't use temp_line for this, since you want to delete[] temp_line; after processing.

You may think this is a lot of fiddling around, but that's what strtok() and relatives usually entail.
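Put together, the single-call loop might look like the sketch below. It substitutes the standard std::strtok for strtok_s (so there is no next_token context argument) and holds the copy in a std::unique_ptr<char[]> instead of a raw new[]/delete[] pair. Both substitutions, and the name collect_tokens, are choices made for this example rather than part of the answer above.

```cpp
#include <cassert>
#include <cstring>   // std::memcpy, std::strtok
#include <memory>    // std::unique_ptr
#include <string>
#include <vector>

// Hypothetical helper: returns the tokens of `line`, split on any char in `delims`.
std::vector<std::string> collect_tokens(const std::string& line, const char* delims)
{
    assert(delims != nullptr);

    // Modifiable copy of the line; +1 char for the '\0' terminator.
    std::unique_ptr<char[]> temp_line(new char[line.size() + 1]);
    std::memcpy(temp_line.get(), line.c_str(), line.size() + 1);

    std::vector<std::string> tokens;
    char* temp_string_pointer = temp_line.get();   // non-null for the first call only
    char* token = nullptr;
    while ((token = std::strtok(temp_string_pointer, delims)) != nullptr)
    {
        temp_string_pointer = nullptr;   // later calls continue in the same buffer
        tokens.push_back(token);
    }
    return tokens;   // temp_line is released automatically here
}
```

Because the owning pointer and the pointer passed to std::strtok are separate variables, setting the latter to null does not lose track of the allocation.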

David Thornley
  • I'd up-vote you for saying "`strtok()` is a badly designed function to begin with", but then you go and suggest using naked character buffers instead of some resource-managing object. `:(` – sbi Oct 09 '09 at 10:07
1

strtok works like this:

The first call returns the string from the beginning up to the delimiter, or the whole string if no delimiter is found:

token = strtok_s(con, "#", &next_token);

Subsequent calls with NULL let you continue parsing the same string to find the next delimiter:

token = strtok_s(NULL, "#", &next_token);

Once you reach the end of the string, the next call returns NULL.
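The reason strtok and its variants insist on a non-const char * can be seen by inspecting the buffer afterwards: each delimiter that ends a token is overwritten with '\0', and the returned tokens are pointers into that same buffer. A small sketch, using the standard std::strtok for portability (strtok_s takes an extra context argument) and a hypothetical function name:

```cpp
#include <cassert>
#include <cstring>   // std::strtok
#include <string>
#include <vector>

// Hypothetical helper: runs strtok over a copy of `line` and returns the raw
// bytes of that working buffer as they look once tokenizing has finished.
std::string buffer_after_tokenizing(const std::string& line, const char* delims)
{
    assert(delims != nullptr);

    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');   // strtok needs a NUL-terminated buffer

    for (char* token = std::strtok(buffer.data(), delims);
         token != nullptr;
         token = std::strtok(nullptr, delims))
    {
        // token points into `buffer`; nothing is copied
    }

    // Construct from pointer + length so embedded '\0' bytes are preserved.
    return std::string(buffer.data(), line.size());
}
```

For an input such as "a#b#c" with "#" as the delimiter, the returned bytes have '\0' where the two '#' characters used to be.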

Patrice Bernassola
  • but why do i need to explicitly advance the token rather than for example, the while loop it is in, which fetches each line of a file consecutively, implicitly. – Nona Urbiz Oct 07 '09 at 15:37
  • What exactly are you asking here? You have to call strtok repeatedly until you have consumed all the tokens from the data you give it, in this case one line of the file. The while loop checks the result from strtok to ensure this takes place. – Kylotan Oct 07 '09 at 16:09
1

Whenever you have a std::string and what you need is a (modifiable) character array, then std::vector<char> is what you need:

void f(char* buffer, std::size_t buffer_size);

void g(std::string& str)
{
  std::vector<char> buffer(str.begin(),str.end());
  // buffer.push_back('\0');    // use this if you need a zero-terminated string
  f(&buffer[0], buffer.size()); // if you added zero-termination, consider it for the size
  str.assign(buffer.begin(), buffer.end());
}
sbi
0

The 2nd strtok call is inside the loop. It advances your token pointer so that you print the tokens one by one; once all of them have been printed, the pointer becomes null and you exit the loop.

To answer the 1st part of your question, as others have suggested, c_str() only gives you the internal buffer pointer. You can't modify that, which is why it's const. If you want to modify it, you need to allocate your own buffer and copy the string's contents into it.

azheglov
0

If you really need to access the string's internal buffer here is how: &*string.begin(). Direct access to string's buffer is useful in some cases, here you can see such a case.
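Since C++11 the characters of a std::string are guaranteed to be stored contiguously, and &s[0] points at a NUL-terminated array, so the string's own buffer can be handed to a function that writes within [0, size()). A hedged sketch (hypothetical helper name, standard std::strtok standing in for strtok_s) that tokenizes a by-value copy so the caller's string is left untouched:

```cpp
#include <cassert>
#include <cstring>   // std::strtok
#include <string>
#include <vector>

// `line` is taken by value: strtok overwrites delimiters in this private copy.
std::vector<std::string> tokenize_in_place(std::string line, const char* delims)
{
    assert(delims != nullptr);
    std::vector<std::string> tokens;

    // &line[0] is a char* into the string's own storage
    // (contiguous and NUL-terminated since C++11).
    for (char* token = std::strtok(&line[0], delims);
         token != nullptr;
         token = std::strtok(nullptr, delims))
    {
        tokens.push_back(token);
    }
    return tokens;
}
```

Under a pre-C++11 compiler this relies on implementation behaviour rather than a guarantee, and a string containing embedded NUL characters would be cut short at the first one.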

Cristian Adam
  • 1
    I wouldn't want to do that. Fiddling with the internals of data structures is dangerous in general. – David Thornley Oct 07 '09 at 16:10
  • In theory, it is in this case, since `std::string` doesn't even guarantee that it stores its characters in a contiguous piece of memory (as, for example, `std::vector` does since C++03). In practice, nobody has seen an implementation of the class that does _not_ store its characters contiguously. As onebyone says, in C++1x this will be guaranteed. – sbi Oct 09 '09 at 10:09
  • Do these comments above mine conclude that this method is safe to use? It seems to easily solve my problem.. but I don't want to setup for failure... – bazz Aug 09 '14 at 20:02
0

You can easily write a conversion routine that will tokenize a string and return a vector of sub-strings:

std::vector<std::string> parse(const std::string& str, const char delimiter)
{
    std::vector<std::string> r;

    if(str.empty())
        return r;

    size_t prev = 0, curr = 0;

    do
    {
        if(std::string::npos == (curr = str.find(delimiter, prev)))
            curr = str.length();

        r.push_back(str.substr(prev, curr - prev));
        prev = curr + 1;
    }
    while(prev < str.length());
    return r;
}
Chad
-1

I think you can first convert string to const char*, then copy the const char* to a char* buffer for further use.

Benny