Strtok and its input

Question

I was using `strtok` outside of a while loop to split my input into three strings, e.g.:
input="Command path 'you are beautiful'" to be split into:

tok1="Command" tok2="path" tok3="'you are beautiful'"

I can't use `strtok` three times in a row because tok3 would just be "'you".
My question is: what happens to the initial variable input when I use `strtok`?
After the first call of `strtok` I would like input to be "path 'you are beautiful'", and then after the second one just "'you are beautiful'", so progressively reducing my initial string as I run `strtok`.
Is it possible? If not, how can I do it?

Yes, `strtok` is changing the input string. If you want to keep an original - copy it somewhere. — Eugene Sh., Jul 28 '17 at 16:17
Yes, @EugeneSh., but `strtok()` does not modify the original string in the way the OP is hoping to achieve. — John Bollinger, Jul 28 '17 at 16:19
@iPhra, you seem not to appreciate `strtok()`'s mode of operation. It does not create copies of substrings of the input, but rather overwrites delimiters in the original string with string terminators, and returns a pointer to the starting character of each resulting segment. You could achieve what you describe with some work, but unless you also make copies of your tokens, it will not have the result you expect. — John Bollinger, Jul 28 '17 at 16:23
`strtok` is a pretty evil function.. I am wondering why it is not officially deprecated yet. — Eugene Sh., Jul 28 '17 at 16:27
Deleted and retyped my previous comment: you can change the delimiter for the third token to `"'"`. It does not need to be the same as for the previous tokens. (It contained typos, sorry). It will strip the `'` but you can add them back when printing the tokens. — Weather Vane, Jul 28 '17 at 16:37
@WeatherVane I tried that but it also removed the final ', why is that? — iPhra, Jul 28 '17 at 16:50
[Similar question](https://stackoverflow.com/q/21896644/971127) — BLUEPIXY, Jul 28 '17 at 17:05

score 0 · Answer 1 · answered Jul 28 '17 at 18:01

The behaviour of `strtok` is defined in [the standard](http://pubs.opengroup.org/onlinepubs/009695399/functions/strtok.html) as follows:

A sequence of calls to strtok() breaks the string pointed to by s1 into a sequence of tokens, each of which is delimited by a byte from the string pointed to by s2. The first call in the sequence has s1 as its first argument, and is followed by calls with a null pointer as their first argument. The separator string pointed to by s2 may be different from call to call.

The first call in the sequence searches the string pointed to by s1 for the first byte that is not contained in the current separator string pointed to by s2. If no such byte is found, then there are no tokens in the string pointed to by s1 and strtok() shall return a null pointer. If such a byte is found, it is the start of the first token.

The strtok() function then searches from there for a byte that is contained in the current separator string. If no such byte is found, the current token extends to the end of the string pointed to by s1, and subsequent searches for a token shall return a null pointer. If such a byte is found, it is overwritten by a null byte, which terminates the current token. The strtok() function saves a pointer to the following byte, from which the next search for a token shall start.

Each subsequent call, with a null pointer as the value of the first argument, starts searching from the saved pointer and behaves as described above.
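To make the "overwritten by a null byte" part concrete, here is a minimal sketch that runs `strtok` twice on the question's input and then renders the whole buffer with the embedded terminators made visible. The helper names `show_nuls` and `tokenize_twice_and_show` are made up for this illustration; they are not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the C library): copies buf[0..len) into
   out, writing each embedded '\0' as the two visible characters '\' and '0'.
   out must have room for at least 2 * len + 1 bytes. */
static void show_nuls(const char *buf, size_t len, char *out)
{
    assert(buf != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            *out++ = '\\';
            *out++ = '0';
        } else {
            *out++ = buf[i];
        }
    }
    *out = '\0';
}

/* Calls strtok twice with " " as the separator, then renders the full
   original extent of the buffer into out. input must be a modifiable
   array, never a string literal, because strtok writes into it. */
static void tokenize_twice_and_show(char *input, char *out)
{
    assert(input != NULL && out != NULL);
    size_t len = strlen(input);   /* measure before strtok shortens it */
    strtok(input, " ");           /* first token: separator after it becomes '\0' */
    strtok(NULL, " ");            /* second token: same again */
    show_nuls(input, len, out);
}
```

Called with `char input[] = "Command path 'you are beautiful'";` and an output array of `2 * sizeof input` bytes, the rendered buffer is `Command\0path\0'you are beautiful'`, while `input` read as an ordinary C string is now just `Command`. So the original string is modified, but it is not "reduced" in the way the question hopes: the buffer still starts at the first token.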

This means that you could just call `strtok` twice: the third part you want then starts just past the `\0` that `strtok` wrote at the end of the second token.
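A rough sketch of that idea follows. The function name `split3` and its return convention are assumptions made for this example. Note the guard for the case where the second token runs to the end of the string: there `strtok` wrote no new `\0`, so stepping one byte further would leave the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line in place into two space-delimited tokens
   plus the untouched remainder. Returns how many parts were found (0 to 3);
   parts that are missing are set to NULL. */
static int split3(char *line, char **tok1, char **tok2, char **rest)
{
    assert(line != NULL && tok1 != NULL && tok2 != NULL && rest != NULL);

    char *end = line + strlen(line);   /* original terminator, noted before strtok edits */
    *tok1 = *tok2 = *rest = NULL;

    *tok1 = strtok(line, " ");
    if (*tok1 == NULL)
        return 0;

    *tok2 = strtok(NULL, " ");
    if (*tok2 == NULL)
        return 1;

    char *after = *tok2 + strlen(*tok2);   /* the '\0' that ends tok2 */
    if (after == end)
        return 2;                          /* tok2 reached the end: nothing follows */

    after++;                               /* step past the '\0' strtok wrote */
    after += strspn(after, " ");           /* skip any additional spaces */
    if (*after == '\0')
        return 2;                          /* only trailing spaces were left */

    *rest = after;
    return 3;
}
```

With `char line[] = "Command path 'you are beautiful'";`, the call returns 3 and `rest` points at `'you are beautiful'` with both quotes intact, because no third `strtok` call ever touched that part of the buffer.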

However, this is not a very robust approach. It is inflexible, both in dealing with errors (such as an empty or missing third substring) and with potential future extensions. Furthermore, because `strtok` keeps its saved pointer in hidden internal state between calls, it is not thread-safe.

It is probably a better idea to hand-code a small lexer/parser that does what you want, or use a tool specifically designed for building lexers (and parsers if needed). I personally have had good experiences with flex for this purpose, but there are other options.
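As a rough idea of what a hand-coded lexer could look like, here is a minimal sketch that leaves the input untouched and returns each token as a pointer plus a length. The `struct token` type and the `next_token` function are hypothetical names for this example, and the handling of an unterminated quote (take the rest of the line) is an assumption; a real parser might report an error instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token type: a view into the input; nothing is copied or modified. */
struct token {
    const char *start;
    size_t len;
};

/* Reads the next token at *cursor and advances *cursor past it.
   A token is either a run of non-whitespace characters or a
   single-quoted string (quotes included, spaces allowed inside).
   Returns 1 if a token was found, 0 at the end of the input. */
static int next_token(const char **cursor, struct token *tok)
{
    assert(cursor != NULL && *cursor != NULL && tok != NULL);
    const char *p = *cursor;

    while (isspace((unsigned char)*p))     /* skip leading whitespace */
        p++;
    if (*p == '\0') {
        *cursor = p;
        return 0;
    }

    const char *start = p;
    if (*p == '\'') {
        const char *close = strchr(p + 1, '\'');
        /* Assumption: an unterminated quote swallows the rest of the line. */
        p = close ? close + 1 : p + strlen(p);
    } else {
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
    }

    tok->start = start;
    tok->len = (size_t)(p - start);
    *cursor = p;
    return 1;
}
```

A caller would set `const char *cur = input;` and loop with `while (next_token(&cur, &t))`, printing each token with `printf("%.*s\n", (int)t.len, t.start)`. Because the state lives in the caller's `cur` rather than inside the function, this avoids the thread-safety problem, and the input string can even be a string literal.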
