
I am learning C++ for competitive programming. I recently came across a problem that requires splitting a string into a vector. I come from a Python and JavaScript background, where a built-in function handles string splitting.

Is there something similar in C++, an easy way that saves time? I would appreciate your input.

Thank you!

Nicol Bolas
Hatem Saadallah
  • 2
    You have a canonical answer here -> https://stackoverflow.com/questions/236129/how-do-i-iterate-over-the-words-of-a-string/236803#236803 – Captain Giraffe May 19 '20 at 00:13
  • Split a string based on a delimiter? You can put the string into a `std::stringstream` and use `std::getline()` with a custom delimiter, adding each component to a vector. – dreamlax May 19 '20 at 00:13
  • 1
    If you are splitting on spaces, then `std::copy(std::istream_iterator<std::string>(std::cin), std::istream_iterator<std::string>(), std::back_inserter(words));` See https://onlinegdb.com/H1mqg6lo8 – Jerry Jeremiah May 19 '20 at 02:12
  • 1
    _I am learning C++ for Competitive Programming._ ... _An easy way that saves time._ A hint: You don't gain the required speed by clever data input. You gain it by choosing the right algorithm, one with significantly lower complexity than the naive or brute-force approach. (That's my impression from what I read here and there. I must admit I've never tried _Competitive Programming_ myself.) – Scheff's Cat May 19 '20 at 05:13
  • That's insightful for sure. Thank you – Hatem Saadallah May 19 '20 at 21:57
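
The whitespace case mentioned in the comments above can be sketched as follows. The helper name `splitWhitespace` is made up for illustration and does not come from the thread; it reads from a string rather than `std::cin` so that it can be tried in isolation.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits on runs of whitespace,
// roughly like Python's str.split() called without arguments.
std::vector<std::string> splitWhitespace(const std::string& text) {
    std::istringstream iss{ text };

    // std::istream_iterator<std::string> uses operator>>, which skips
    // leading whitespace and extracts one word per step.
    return std::vector<std::string>(std::istream_iterator<std::string>{ iss },
                                    std::istream_iterator<std::string>{});
}
```

Calling `splitWhitespace("  aaa bbb\tccc\n")` should yield the three words `aaa`, `bbb` and `ccc`. Because `operator>>` skips whitespace, this approach never produces empty tokens, which may or may not be what a given problem needs.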

1 Answer


This is difficult to answer, since competitive programming has little to do with what C++ is really intended for.

Anyway.

Splitting a string into tokens is a very old task, and many solutions exist. They differ in how easy they are to understand and write, how fast they run, and how flexible they are.

Alternatives

  1. Handcrafted. Many variants exist, using pointers or iterators. They can be hard to develop and are error-prone.
  2. The old C-style std::strtok function. It modifies the input string and keeps internal state between calls, so it is not thread-safe and is best avoided in new code.
  3. std::getline with a delimiter. This is the most commonly used approach, but it is arguably a "misuse" of a line-reading function and not very flexible, because the delimiter is a single character.
  4. std::sregex_token_iterator, a modern facility designed for this purpose. It is the most flexible option and fits well with STL containers and algorithms, but it is slower.

The code below shows all four approaches in one program.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <regex>
#include <algorithm>
#include <iterator>
#include <cstring>
#include <forward_list>
#include <deque>
#include <vector>
#include <type_traits>

using Container = std::vector<std::string>;
std::regex delimiter{ "," };


int main() {

    // Generic lambda that prints the elements of any STL container, separated by spaces
    auto print = [](const auto& container) -> void {
        // Element type of the container, without reference or const qualifiers
        using Value = std::decay_t<decltype(*container.begin())>;
        std::copy(container.begin(), container.end(), std::ostream_iterator<Value>(std::cout, " "));
        std::cout << '\n';
    };

    // Example 1:   Handcrafted -------------------------------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Search for comma, then take the part and add to the result
        for (size_t i{ 0U }, startpos{ 0U }; i <= stringToSplit.size(); ++i) {

            // So, if we reached the end of the string or found a comma
            // (the end is checked first, so we never inspect the character past the last one)
            if ((i == stringToSplit.size()) || (stringToSplit[i] == ',')) {

                // Copy substring
                c.push_back(stringToSplit.substr(startpos, i - startpos));
                startpos = i + 1;
            }
        }
        print(c);
    }

    // Example 2:   Using very old strtok function ----------------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

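        // Split string into parts in a simple for loop.
        // Note: std::strtok overwrites each delimiter in stringToSplit with '\0',
        // so it needs a writable buffer. &stringToSplit[0] provides one without a const_cast.
        // The pragma is MSVC-specific and silences the deprecation warning for strtok.
#pragma warning(suppress : 4996)
        for (char* token = std::strtok(&stringToSplit[0], ","); token != nullptr; token = std::strtok(nullptr, ",")) {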
            c.push_back(token);
        }

        print(c);
    }

    // Example 3:   Very often used std::getline with additional istringstream ------------------------------------------------
    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };
        Container c{};

        // Put string in an std::istringstream
        std::istringstream iss{ stringToSplit };

        // Extract string parts in simple for loop
        for (std::string part{}; std::getline(iss, part, ','); c.push_back(part))
            ;

        print(c);
    }

    // Example 4:   Most flexible iterator solution  ------------------------------------------------

    {
        // Our string that we want to split
        std::string stringToSplit{ "aaa,bbb,ccc,ddd" };


        Container c(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});
        //
        // Everything done already with range constructor. No additional code needed.
        //

        print(c);


        // Works also with other containers in the same way
        std::forward_list<std::string> c2(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {});

        print(c2);

        // And works with algorithms
        std::deque<std::string> c3{};
        std::copy(std::sregex_token_iterator(stringToSplit.begin(), stringToSplit.end(), delimiter, -1), {}, std::back_inserter(c3));

        print(c3);
    }
    return 0;
}
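
For competitive programming it may be handy to wrap the handcrafted idea in a small reusable function that behaves like Python's `str.split(",")`. The following is only a sketch: the name `split` and its signature are made up for illustration, and it uses `std::string::find` instead of the character-by-character loop from Example 1.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every occurrence of `delimiter`.
// Empty fields are kept, so "a,,b" yields {"a", "", "b"}.
std::vector<std::string> split(const std::string& text, char delimiter) {
    std::vector<std::string> parts;

    std::string::size_type start = 0;
    std::string::size_type end = text.find(delimiter);

    // Copy out each field that is followed by a delimiter
    while (end != std::string::npos) {
        parts.push_back(text.substr(start, end - start));
        start = end + 1;
        end = text.find(delimiter, start);
    }

    // The remainder after the last delimiter (possibly empty)
    parts.push_back(text.substr(start));
    return parts;
}
```

With this sketch, `split("aaa,bbb,ccc,ddd", ',')` should give the same four parts as the examples above. One difference worth knowing: for input with a trailing delimiter such as `"a,b,"`, this helper keeps the final empty field, whereas the std::getline loop from Example 3 stops after `"b"`.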
A M