trying to break a line into multiple tokens

Question

My problem is this: I have the string `RGM 3 13 GName 0005 32 funny 0000 44 teste 0000\n` and I want to split it like this:

13 GName
32 funny 
44 teste

so I can save the numbers and names in an array. The problem is that, for some reason, declaring a `""` like I did is invalid in C++, and the line is not being split at all.

Program:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#define line "RGM 3 13 GName 0005 32 funny 0000 44 teste 0000\n"

int main()
{
    char s[] = "";
    char* token;
    strtok(line,s);
    strtok(line,s);
    while( token != NULL )
    {
        printf( " %s\n", token );
        token = strtok(NULL,s);
    }
   
   return(0);
}

`strtok` is kind of deprecated in the C++ world. Also, it's not thread-safe. — digito_evo, Dec 20 '21 at 21:17
`char s[] = "";` what is the length of `s` here? (remember that primitive arrays have static size in C++) — scohe001, Dec 20 '21 at 21:19
Does this answer your question? [How do I iterate over the words of a string?](https://stackoverflow.com/questions/236129/how-do-i-iterate-over-the-words-of-a-string) — BoP, Dec 20 '21 at 21:28
You are mixing C++ (`#include <iostream>`) and C (the rest of the code). That is a pretty bad idea. From what I see, you are using C, so maybe you should retag your question. — kebs, Dec 20 '21 at 21:28
For a C++ solution to these problems a good approach is the accepted answer in [this question](https://stackoverflow.com/a/61564761/1102805). — darcamo, Dec 20 '21 at 23:37

rturrado · Accepted Answer · 2021-12-21T13:11:16.667

If your lines are always going to have the format `A B n1 str1 code1 ... nn strn coden`, where you seem to discard `A B`, the simple C++ code below will suffice:

#include <iostream>  // cout
#include <sstream>  // istringstream
#include <string>

int main()
{
    const std::string line{"RGM 3 13 GName 0005 32 funny 0000 44 teste 0000"};

    std::istringstream iss{line};

    std::string token1{};
    std::string token2{};
    iss >> token1 >> token2; // get rid of header (RGM 3)

    std::string token3{};
    while (iss >> token1 >> token2 >> token3)
    {
        std::cout << token1 << " " << token2 << "\n";
    }
}

Notice this is doing almost no checks at all. Should you need more control over your input, something more advanced could be implemented.

For example, the code below, using regular expressions, would try to match each line header against `RGM m` (the text `RGM` and a 1-digit `m`); then, it would search for groups of the form `n str code` (2-digit `n`, alphabetic `str`, 4-digit `code`):

#include <iostream>  // cout
#include <regex>  // regex_search, smatch
#include <string>

int main()
{
    std::string line{"RGM 3 13 GName 0005 32 funny 0000 44 teste 0000"};

    std::regex header_pattern{R"(RGM\s+\d)"};
    std::regex group_pattern{R"(\s+(\d{2})\s+([a-zA-Z]+)\s+\d{4})"};
    std::smatch matches{};
    if (std::regex_search(line, matches, header_pattern))
    {
        line = matches.suffix();
        while (std::regex_search(line, matches, group_pattern))
        {
            std::cout << matches[1] << " " << matches[2] << "\n";
            line = matches.suffix();
        }
    }
}
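
Since the question mentions saving the numbers and names in an array, the same regular expressions can be used to collect the pairs instead of printing them. The following is only a sketch: the function name `parse_entries` and the choice of `std::vector<std::pair<int, std::string>>` as the container are assumptions made for illustration, not part of the original answer. It also uses `std::sregex_iterator` in place of the repeated `regex_search` calls.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (name and return type are assumptions): collects every
// (number, name) pair from a line of the form "RGM m n1 str1 code1 ...".
std::vector<std::pair<int, std::string>> parse_entries(const std::string& line)
{
    std::vector<std::pair<int, std::string>> entries{};

    // Same patterns as in the answer above.
    static const std::regex header_pattern{R"(RGM\s+\d)"};
    static const std::regex group_pattern{R"(\s+(\d{2})\s+([a-zA-Z]+)\s+\d{4})"};

    std::smatch header{};
    if (!std::regex_search(line, header, header_pattern))
    {
        return entries;  // no header found: return an empty result
    }

    // Walk over every "n str code" group that follows the header.
    std::sregex_iterator it(header.suffix().first, line.cend(), group_pattern);
    const std::sregex_iterator end{};
    for (; it != end; ++it)
    {
        // Capture 1 is the 2-digit number, capture 2 is the name.
        entries.emplace_back(std::stoi((*it)[1].str()), (*it)[2].str());
    }
    return entries;
}
```

Called with the line from the question, this should return the three pairs (13, GName), (32, funny) and (44, teste); a line without the `RGM m` header gives an empty vector.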

score 1 · Answer 2 · answered Dec 20 '21 at 23:28

`strtok` modifies the string passed to it, so passing a string literal is undefined behavior (and in C++ a string literal does not even convert to `char*`).

So instead declare a writable array:

char line[] = "RGM 3 13 GName 0005 32 funny 0000 44 teste 0000\n";

The prototype gives a hint about that:

char* strtok( char* str, const char* delim );

The first argument is not `const`, which signals that the function may write to it.
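
To show how the pieces fit together, here is a sketch of a `strtok`-based split that works on a writable copy of the input. Several things in it are assumptions about what the question intended rather than something stated in this answer: the delimiter string `" \n"` (the question used an empty string, which never splits anything), storing each return value of `strtok` (the question discarded the first two), and the helper name `split_with_strtok`. As noted in the comments, `strtok` is not thread-safe.

```cpp
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (name is an assumption): tokenizes a copy of `input`
// with strtok and returns the (number, name) pairs, dropping the header
// and the 4-digit codes.
std::vector<std::pair<std::string, std::string>> split_with_strtok(const char* input)
{
    std::vector<std::pair<std::string, std::string>> entries{};

    // strtok writes '\0' into its argument, so tokenize a writable copy.
    std::vector<char> buffer(input, input + std::strlen(input) + 1);

    const char* delimiters = " \n";  // assumed: split on spaces and newline

    // Skip the two header tokens ("RGM" and "3").
    char* token = std::strtok(buffer.data(), delimiters);
    if (token != nullptr) token = std::strtok(nullptr, delimiters);
    if (token == nullptr) return entries;

    // Read groups of three tokens: number, name, code (code is discarded).
    while (true)
    {
        char* number = std::strtok(nullptr, delimiters);
        char* name = (number != nullptr) ? std::strtok(nullptr, delimiters) : nullptr;
        char* code = (name != nullptr) ? std::strtok(nullptr, delimiters) : nullptr;
        if (code == nullptr) break;  // incomplete group or end of input
        entries.emplace_back(number, name);
    }
    return entries;
}
```

Because the function copies its argument first, it can be called directly with the string literal from the question; the result should hold the pairs ("13", "GName"), ("32", "funny") and ("44", "teste").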
