I have this string that I'm trying to delimit with strtok():

"+ 2 3\\nln 9\\ln 10"

I then pass "\\n" as the delimiter:

    token = strtok(argv, "\\n");

    while (token)
    {
        args[index] = token;
        token = strtok(NULL, "\\n");
        index++;
    }

The first element in my args table is "+ 2 3", which is great. However, the second one is "l". Does anyone understand why? If so, how do I get "ln 9" into my args table?

Avocado
  • The last argument is a list of characters that can act as delimiters. So `"\\n"` means you want to split it any time you find a backslash or an n – Jerry Jeremiah Apr 16 '20 at 09:08
  • Sorry, I edited the string; it came out differently than expected. I want to split at every occurrence of \\n, but it actually does what you just said – Avocado Apr 16 '20 at 09:13
  • You are separating your string at each point it finds a backslash `'\'` or when it finds an `'n'` character. If you want it to delimit your string at a newline character, use `"\n"` as the string literal to pass to `strtok(3)`. – Luis Colorado Apr 18 '20 at 18:23

2 Answers


From the strtok() man page:

The delim argument specifies a set of bytes that delimit the tokens in the parsed string. The caller may specify different strings in delim in successive calls that parse the same string.

So, in your code, "\\n" is not treated as one multi-character delimiter. You are telling strtok that the delimiter is either '\' (written with a double backslash because of escaping) or 'n'.

The tokens of your string, "+ 2 3\\nln 9\\ln 10" will be:

  1. "+ 2 3"
  2. empty string between \ and \ (strtok does not return it)
  3. empty string between \ and n (strtok does not return it)
  4. "l"
  5. " 9"
  6. empty string between \ and \ (strtok does not return it)
  7. "l"
  8. " 10"
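
The list above can be checked with a small helper. This is only a sketch: `collect_tokens` is a hypothetical name, and the input is assumed to be the literal from the question taken as C source, so at run time it holds one backslash before each `n` or `l`. The visible tokens are the same with doubled backslashes, since strtok skips runs of delimiters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split s in place with strtok() and store up to
 * max token pointers in out. Returns the number of tokens stored. */
size_t collect_tokens(char *s, const char *delims, char *out[], size_t max)
{
    size_t count = 0;
    char *token = strtok(s, delims);

    while (token != NULL && count < max) {
        out[count++] = token;          /* points into s, no copy is made */
        token = strtok(NULL, delims);
    }
    return count;
}
```

Called on a writable buffer such as `char buf[] = "+ 2 3\\nln 9\\ln 10";` with the delimiter string `"\\n"`, it stores five tokens: `"+ 2 3"`, `"l"`, `" 9"`, `"l"` and `" 10"`, because every backslash and every `n` ends a token on its own.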

For what you are trying to do, strtok is not the best choice. I would probably write my own parsing function that works by:

  1. Finding each occurrence of "\\n" in the original string using strstr()
  2. Either copying the substring that precedes it to some output string, or null-terminating it in place
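
As a rough sketch of those two steps, here is a variant that leaves the input untouched and reports each piece as a pointer plus a length, so the caller can copy it or print it. The name `next_piece` and its signature are made up for illustration; only strstr() and strlen() come from the standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the next piece of *cursor, ending at the
 * multi-character separator sep. The input is not modified; the piece is
 * reported as a start pointer plus its length in *len. Returns NULL once
 * the input is exhausted or if sep is NULL or empty. */
const char *next_piece(const char **cursor, const char *sep, size_t *len)
{
    const char *start = *cursor;
    const char *hit;

    if (start == NULL || sep == NULL || *sep == '\0')
        return NULL;

    hit = strstr(start, sep);
    if (hit != NULL) {
        *len = (size_t)(hit - start);
        *cursor = hit + strlen(sep);   /* resume after the separator */
    } else {
        *len = strlen(start);          /* last piece runs to the end */
        *cursor = NULL;
    }
    return start;
}
```

With `const char *cur = "+ 2 3\\nln 9\\nln 10";` (note the second separator is written `\\n` here; the `\\ln` in the question would not match), repeated calls with separator `"\\n"` yield `+ 2 3`, `ln 9` and `ln 10`. Each piece can be printed with `printf("%.*s\n", (int)len, piece)` or copied with memcpy().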
Roberto Caboni
  • Sorry, my string is "+ 2 3\\nln 9\\ln 10". I have edited my post – Avocado Apr 16 '20 at 09:14
  • @Avocado This doesn't change the analysis. You will have multiple empty strings instead of one. – Roberto Caboni Apr 16 '20 at 09:19
  • How do I tell `strtok` that the delimiter is `\\n` then? – Avocado Apr 16 '20 at 09:20
  • @Avocado You can't. It accepts only a set of **single-character** delimiters. – Roberto Caboni Apr 16 '20 at 09:21
  • @Avocado you don't. If you read the linked manpage again, there is no way to provide a delimiter containing more than 1 character. Each character in the string is handled separately. That is why the answer mentions `strstr` as a solution – Gerhardh Apr 16 '20 at 09:21
  • What alternatives do I have in order to get my result? – Avocado Apr 16 '20 at 09:22
  • @Avocado I wrote a little suggestion in the end of my answer – Roberto Caboni Apr 16 '20 at 09:23
  • Try this https://stackoverflow.com/questions/29788983/split-char-string-with-multi-character-delimiter-in-c And there is this one: https://stackoverflow.com/questions/34942523/split-string-by-a-substring but ignore the accepted answer which is completely wrong - the other answers may work for you though. – Jerry Jeremiah Apr 16 '20 at 09:28

I completely agree with the answer above and with the suggestions by Roberto and Gerhardh.

If you are fine with a custom implementation of strtok that takes a multi-character delimiter, you can use the working solution below.

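    #include <string.h>

    /* Like strtok(), but delim is one multi-character separator rather
     * than a set of single-character delimiters. Not reentrant. */
    char *strtokm(char *str, const char *delim)
    {
        static char *next;   /* where the next call resumes */
        char *tok;
        char *m;

        /* An empty delim would match everywhere and never advance. */
        if (delim == NULL || *delim == '\0') return NULL;

        tok = (str) ? str : next;
        if (tok == NULL) return NULL;

        m = strstr(tok, delim);

        if (m) {
            next = m + strlen(delim);   /* skip past the separator */
            *m = '\0';                  /* terminate the token in place */
        } else {
            next = NULL;                /* this was the last token */
        }

        return tok;
    }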
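
Because the function above keeps its position in a static variable, two interleaved tokenizations would interfere with each other. A possible reentrant variant, modelled on the strtok_r() calling convention, is sketched below; the name `strtokm_r` is made up for illustration and is not a standard function.

```c
#include <string.h>

/* Hypothetical reentrant variant: the position is kept in *saveptr,
 * which the caller owns, instead of in a static variable. */
char *strtokm_r(char *str, const char *delim, char **saveptr)
{
    char *tok;
    char *m;

    if (delim == NULL || *delim == '\0' || saveptr == NULL)
        return NULL;

    tok = (str != NULL) ? str : *saveptr;
    if (tok == NULL)
        return NULL;

    m = strstr(tok, delim);
    if (m != NULL) {
        *m = '\0';                      /* terminate the token in place */
        *saveptr = m + strlen(delim);   /* resume after the separator */
    } else {
        *saveptr = NULL;                /* no separator left: last token */
    }
    return tok;
}
```

The call pattern mirrors strtok_r(): pass the buffer on the first call and NULL afterwards, with the same `char *save` each time. Unlike strtok, this returns an empty token when two separators are adjacent, which may or may not be what you want.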
Shahid Hussain