2
myStr = input("Enter something - ")

# say I enter "Hi there"

arrayStr = myStr.split()
print(arrayStr)

# Output: ['Hi', 'there']

What is the exact C++ equivalent of this code? (My aim is to further iterate over the array and perform comparisons with other arrays).

HarshDarji
  • 156
  • 8
  • 1
    Does this answer your question? [How do I iterate over the words of a string?](https://stackoverflow.com/questions/236129/how-do-i-iterate-over-the-words-of-a-string) – kiner_shah Jan 11 '22 at 11:13
  • There must be several thousand examples on how to split strings on (arbitrary) separators all over the Internet, if you just search a little. Including a few here on Stack Overflow. – Some programmer dude Jan 11 '22 at 11:14
  • I did look up. I did not find any nice, simple, readable and nooby solution. So I thought I should ask. – HarshDarji Jan 11 '22 at 11:17
  • 1
    Then do you know that the stream "input" operator `>>` separates on space? Do you know that there's an input stream class for strings ([`std::istringstream`](https://en.cppreference.com/w/cpp/io/basic_istringstream))? And do you know that there's an iterator class for input streams ([`std::istream_iterator`](https://en.cppreference.com/w/cpp/iterator/istream_iterator)) which uses the `>>` operator? And that there's a [`std::vector` constructor overload](https://en.cppreference.com/w/cpp/container/vector/vector) that takes a pair of iterators? – Some programmer dude Jan 11 '22 at 11:24
  • [Continued] Well now you know! ;) And with that information it's possible to do what you want (separate a string on space and put each sub-string as elements into a vector) with just a couple of rather simple statements. – Some programmer dude Jan 11 '22 at 11:26
  • 1
    All the answers and all the linked duplicates ignore that C++ has had dedicated, specialized functionality for this purpose for many years. Seeing the complex answers given here, I wonder why. The most versatile and easy solution, which works directly on a string, does not need an istringstream, and fits well into the C++ world of iterators and algorithms, is the `std::sregex_token_iterator`. With an ultra simple regex re="\w+", you can simply write `std::vector arrayStr(std::sregex_token_iterator(myStr.begin(), myStr.end(), re),{});` That's it. Forget about all the other stuff... – A M Jan 11 '22 at 15:02
  • @ArminMontigny For simple space delimited strings, regular expressions are way overkill. In fact, regular expressions are overkill in many situations where they are still used. Remember the "joke" about the programmer who has one problem, decides to solve it with regular expressions, and now has *two* problems. ;) – Some programmer dude Jan 12 '22 at 06:46
  • Indeed. The `std::istream_iterator` would be sufficient for space delimited strings. Still, we would need a `std::istringstream`. For reading huge amounts of data the regex solution would be too slow. For simple input it does not matter. I hope that the compile time regex library will make it into the standard soon... – A M Jan 12 '22 at 10:28

4 Answers

2

One way of doing this would be using std::vector and std::istringstream as shown below:

#include <iostream>
#include <string>
#include <sstream>
#include <vector>
int main()
{
    std::string input, temp;
    //take input from user 
    std::getline(std::cin, input);
    
    //create a vector that will hold the individual words
    std::vector<std::string> vectorOfString;
    
    std::istringstream ss(input);
    
    //go word by word 
    while(ss >> temp)
    {
        vectorOfString.emplace_back(temp);
    }
    
    //iterate over all elements of the vector and print them out  
    for(const std::string& element: vectorOfString)
    {
        std::cout<<element<<std::endl;
    }
    return 0;
}
Jason
  • 36,170
  • 5
  • 26
  • 60
  • Can I instead accept the string as a vector and iterate over it? – HarshDarji Jan 11 '22 at 11:37
  • @HarshDarji Unfortunately no. – Jason Jan 11 '22 at 12:11
  • Why so complicated? Why not use the `std::sregex_token_iterator` to immediately split the string without the istringstream, or at least the `std::istream_iterator`? Why not use CTAD? You could write `std::vector vectorOfString(std::istream_iterator<std::string>(ss), {});`. That's it. And no, using iterators in C++ is not complicated... – A M Jan 11 '22 at 15:10
1

You can use std::string_view to avoid copying the input string, which saves memory: each element is literally a view onto a word inside the original string, so the views are only valid for as long as that string stays alive. For example:

#include <iostream>
#include <string_view>
#include <vector>

inline bool is_delimiter(const char c)
{
    // order by frequency in your input for optimal performance
    return (c == ' ') || (c == ',') || (c == '.') || (c == '\n') || (c == '!') || (c == '?');
}

auto split_view(const char* line)
{
    const char* word_start_pos = line;
    const char* p = line;
    std::size_t letter_count{ 0 };
    std::vector<std::string_view> words;

    // while parsing hasn't seen the terminating 0
    while(*p != '\0')
    {
        // if it is a character from a word then start counting the letters in the word
        if (!is_delimiter(*p))
        {
            letter_count++;
        }
        else
        {
            //delimiter reached and word detected
            if (letter_count > 0)
            {
                //add another string view to the characters in the input string
                // this will call the constructor of string_view with arguments const char* and size
                words.emplace_back(word_start_pos, letter_count);

                // skip to the next word
                word_start_pos += letter_count;
            }

            // skip delimiters for as long as you encounter them
            word_start_pos++;
            letter_count = 0ul;
        }

        // move on to the next character
        ++p;
    }

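    // the input may end without a trailing delimiter: keep the last word too
    if (letter_count > 0)
    {
        words.emplace_back(word_start_pos, letter_count);
    }

    return words;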
}

int main()
{
    auto words = split_view("the quick brown fox is fast. And the lazy dog is asleep!");
    for (const auto& word : words)
    {
        std::cout << word << "\n";
    }

    return 0;
}
Pepijn Kramer
  • 9,356
  • 2
  • 8
  • 19
0
#include <string>
#include <sstream>
#include <vector>
#include <iterator>

template <typename Out>
void split(const std::string &s, char delim, Out result) {
    std::istringstream iss(s);
    std::string item;
    while (std::getline(iss, item, delim)) {
        *result++ = item;
    }
}

std::vector<std::string> split(const std::string &s, char delim) {
    std::vector<std::string> elems;
    split(s, delim, std::back_inserter(elems));
    return elems;
}

std::vector<std::string> x = split("one:two::three", ':');

Here 'x' is your converted array with 4 elements: "one", "two", "" (the empty field between the two consecutive colons) and "three".

UserOfStackOverFlow
  • 108
  • 1
  • 3
  • 14
0

Basically @AnoopRana's solution, but using STL algorithms and removing punctuation marks from the words:

[Demo]

#include <cctype>  // ispunct
#include <algorithm>  // copy, remove_if, transform
#include <iostream>  // cout
#include <iterator>  // istream_iterator, ostream_iterator
#include <sstream>  // istringstream
#include <string>
#include <vector>

int main() {
    const std::string s{"In the beginning, there was simply the event and its consequences."};
    std::vector<std::string> ws{};
    std::istringstream iss{s};
    std::transform(std::istream_iterator<std::string>{iss}, {},
        std::back_inserter(ws), [](std::string w) {
            w.erase(std::remove_if(std::begin(w), std::end(w),
                        [](unsigned char c) { return std::ispunct(c); }),
                    std::end(w));
            return w;
    });
    std::copy(std::cbegin(ws), std::cend(ws), std::ostream_iterator<std::string>{std::cout, "\n"});
}

// Outputs:
//
//   In
//   the
//   beginning
//   there
//   was
//   simply
//   the
//   event
//   and
//   its
//   consequences
rturrado
  • 7,699
  • 6
  • 42
  • 62