
I am trying to split a string on commas and populate a vector. The current code works for the first element, but on the next iteration it ignores the next comma and only picks up the one after that. Can anyone tell me why that is?

    getline(file,last_line);
    string Last_l = string(last_line);
    cout<< "String Lastline worked "<< Last_l <<endl;
    int end = 0;
    int start = 0;
    vector<string> linetest{};

    for(char &ii : Last_l){
        if( ii != ','){
            end++;
        }
        else{
            linetest.push_back(Last_l.substr(start,end));
//            Disp(linetest);
            cout<< Last_l.substr(start,end) <<endl;
            end++;
            start = end;
        }

    }
bolov
  • Please check the following [link](https://stackoverflow.com/questions/5607589/right-way-to-split-an-stdstring-into-a-vectorstring) for a concise solution. – sanoj subran May 14 '20 at 02:06
  • Does this answer your question? [Splitting a string by a character](https://stackoverflow.com/questions/10058606/splitting-a-string-by-a-character) – bhristov May 14 '20 at 02:07

2 Answers


Based on your code, I think you are misunderstanding the parameters passed to substr. Note that the second parameter is the number of characters to take, starting at the position given by the first parameter. It is not the end index of the substring.
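To make the position-and-count semantics concrete, here is a minimal sketch. The sample input "aaa,bbb,ccc" and the helper name substr_is_pos_and_count are assumptions for illustration only. With that input, the question's loop reaches the second comma with start = 4 and end = 7, so it ends up calling substr(4, 7).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, for illustration only: demonstrates that the second
// argument of std::string::substr is a character count, not an end index.
inline void substr_is_pos_and_count() {
    const std::string line{"aaa,bbb,ccc"};  // assumed sample input

    // Position 4, count 3: exactly the second field.
    assert(line.substr(4, 3) == "bbb");

    // Passing the end index 7 as the count takes 7 characters from
    // position 4, which runs straight past the second comma.
    assert(line.substr(4, 7) == "bbb,ccc");

    // Omitting the count takes everything up to the end of the string.
    assert(line.substr(8) == "ccc");
}
```

The second assertion matches the symptom described in the question: the comma after "bbb" appears to be ignored because it is swallowed into an over-long substring.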

With that in mind, in the else condition, instead of:

    end++;        // increment end index
    start = end;  // reset start index

you would need to do something like:

    start += end + 1;  // move start past this field and the comma
    end = 0;           // reset count of chars

Also, don't forget to add the extra string that's left over after the last comma, after the loop ends:

    linetest.push_back(Last_l.substr(start));  // all the remaining chars

Here's the full snippet:

    for(char &ii : Last_l){
        if( ii != ','){
            end++;
        }
        else{
            linetest.push_back(Last_l.substr(start,end));
            start += end + 1;
            end = 0;
        }
    }

    linetest.push_back(Last_l.substr(start));

and a working demo.

If you rename end to count, it becomes clearer why this works.
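As a sketch of that renaming, the loop could be wrapped in a small function like the one below. The function name split_on_comma and the use of std::size_t are assumptions, not part of the original code. The sketch also assumes that start must advance by the length of the field plus one for the comma, so that it always holds an absolute index into the line.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on commas, keeping empty fields.
std::vector<std::string> split_on_comma(const std::string& line) {
    std::vector<std::string> fields;
    std::size_t start = 0;  // index of the first character of the current field
    std::size_t count = 0;  // number of characters seen in the current field

    for (char ch : line) {
        if (ch != ',') {
            ++count;
        } else {
            fields.push_back(line.substr(start, count));
            start += count + 1;  // skip the field and the comma after it
            count = 0;
        }
    }

    // Whatever follows the last comma (possibly an empty string).
    fields.push_back(line.substr(start));
    return fields;
}
```

Calling split_on_comma("aaa,bb,c") yields the three fields "aaa", "bb" and "c". Fields of different lengths are a useful test here, because a start index that is not accumulated correctly can go unnoticed on short inputs.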

Also, please avoid using namespace std;, as it is considered bad practice.

cigien

A good answer has already been given.

I want to show some alternative solutions.

Splitting a string into tokens is a very old task, and many solutions are available. They all have different properties: some are hard to understand, some are hard to develop, and they differ in complexity, speed, and flexibility.

Alternatives

  1. Handcrafted, like yours. This may be hard to develop and is error-prone. See your question.
  2. The old-style std::strtok function. It may be unsafe, because it modifies the buffer it tokenizes and keeps internal state between calls, and arguably it should no longer be used.
  3. std::getline. This is the most commonly used implementation, but it is actually a "misuse" and not very flexible.
  4. A dedicated modern facility developed for this purpose (std::sregex_token_iterator). It is the most flexible and fits well into the STL environment and algorithm landscape, but it is slower.
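To make the safety concern in item 2 concrete: std::strtok overwrites each delimiter it finds with a null terminator and remembers its position in hidden static state, so it changes its input and is not reentrant. One way to limit the damage, sketched below, is to tokenize a private copy. The function name split_with_strtok is an assumption for illustration.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: runs std::strtok on a private, writable copy so the
// caller's string is left untouched.
std::vector<std::string> split_with_strtok(const std::string& line) {
    // std::strtok needs a null-terminated buffer that it is allowed to modify.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* token = std::strtok(buffer.data(), ","); token != nullptr;
         token = std::strtok(nullptr, ",")) {
        tokens.emplace_back(token);  // copy the token out of the buffer
    }
    return tokens;
}
```

Note that std::strtok treats a run of consecutive delimiters as a single one, so an input such as "a,,b" produces only "a" and "b". If empty fields matter, one of the other approaches is a better fit.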

Please see all four approaches in one piece of code.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <regex>
#include <algorithm>
#include <iterator>
#include <cstring>
#include <forward_list>
#include <deque>
#include <vector>

using Container = std::vector<std::string>;
std::regex delimiter{ "," };


int main() {

    // Helper lambda to print the contents of an STL container
    auto print = [](const auto& container) -> void { std::copy(container.begin(), container.end(),
        std::ostream_iterator<typename std::decay<decltype(*container.begin())>::type>(std::cout, " ")); std::cout << '\n'; };

    // Example 1:   Handcrafted -------------------------------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Search for comma, then take the part and add to the result
        for (size_t i{ 0U }, startpos{ 0U }; i <= stringToSplit.size(); ++i) {

            // So, if there is a comma or the end of the string
            if ((stringToSplit[i] == ',') || (i == (stringToSplit.size()))) {

                // Copy substring
                c.push_back(stringToSplit.substr(startpos, i - startpos));
                startpos = i + 1;
            }
        }
        print(c);
    }

    // Example 2:   Using very old strtok function ----------------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Split string into parts in a simple for loop
#pragma warning(suppress : 4996)
        for (char* token = std::strtok(const_cast<char*>(stringToSplit.data()), ","); token != nullptr; token = std::strtok(nullptr, ",")) {
            c.push_back(token);
        }

        print(c);
    }

    // Example 3:   Very often used std::getline with additional istringstream ------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Put string in an std::istringstream
        std::istringstream iss{ stringToSplit };

        // Extract string parts in simple for loop
        for (std::string part{}; std::getline(iss, part, ','); c.push_back(part))
            ;

        print(c);
    }

    // Example 4:   Most flexible iterator solution  ------------------------------------------------

    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };


        Container c(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
        //
        // Everything done already with range constructor. No additional code needed.
        //

        print(c);


        // Works also with other containers in the same way
        std::forward_list<std::string> c2(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});

        print(c2);

        // And works with algorithms
        std::deque<std::string> c3{};
        std::copy(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {}, std::back_inserter(c3));

        print(c3);
    }
    return 0;
}
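The sample input in the examples has no empty fields, which is exactly where the approaches start to differ. As a hedged sketch, here is the std::getline loop from Example 3 wrapped in a function so that its behavior on edge cases can be checked. The name split_with_getline is an assumption for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: the std::getline loop from Example 3 as a function.
std::vector<std::string> split_with_getline(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream stream{line};

    // Each successful call extracts one field and discards the delimiter.
    for (std::string part; std::getline(stream, part, ',');) {
        fields.push_back(part);
    }
    return fields;
}
```

With this loop, "a,,b" gives three fields ("a", an empty string, and "b"), so empty fields in the middle are kept. A trailing comma does not produce a trailing empty field: "a,b," gives only "a" and "b", and an empty input gives no fields at all. Whether that is the desired behavior depends on the data being parsed.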
A M