I am writing a function that splits a sentence into words. My plan is to use str.substr, starting at str[0], with str.find locating the index of the first " " character. On each pass I then want to move the starting position passed to str.find up to the index of that " " character, until I reach str.length().

I am using two variables to mark the beginning and end positions of each word, and I update the beginning position with the end position of the previous word. But the positions are not updating as I expect in the loop as currently written, and I cannot figure out why.

#include <iostream>
#include <string>
using namespace std;
void splitInWords(string str);



int main() {
    string testString("This is a test string");
    splitInWords(testString);

    return 0;
}



void splitInWords(string str) {
    int i;
    int beginWord, endWord, tempWord;
    string wordDelim = " ";
    string testWord;

    beginWord = 0;
    for (i = 0; i < str.length(); i += 1) {
        endWord = str.find(wordDelim, beginWord);
        testWord = str.substr(beginWord, endWord);
        beginWord = endWord;

        cout << testWord << " ";
    }
}
ajt45v
  • Do try and kick the `using namespace std` habit. This might seem like a convenience but really it just creates a ton of ambiguity and can lead to problems if you accidentally use one of the reserved functions or structures defined in `std`. – tadman May 26 '20 at 02:32
  • See [Why is “using namespace std;” considered bad practice?](https://stackoverflow.com/q/1452721/14065) – Martin York May 26 '20 at 02:42
  • `and believe the way to do this is to use str.substr, starting at str[0] and then using str.find to find the index`: Sure, that is one way. But you can also use the iostream functionality to read into a string, which automatically splits the input into words at whitespace (using `operator>>`). – Martin York May 26 '20 at 02:44

2 Answers

2

It is easier to use a string stream.

    #include <vector>
    #include <string>
    #include <sstream>
    using namespace std;

    vector<string> split(const string& s, char delimiter)
    {
        vector<string> tokens;
        string token;
        istringstream tokenStream(s);
        while (getline(tokenStream, token, delimiter))
        {
            tokens.push_back(token);
        }
        return tokens;
    }

    int main() {
        string testString("This is a test string");
        vector<string> result = split(testString, ' ');
        return 0;
    }
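
One behaviour of the `getline`-based `split` above is worth knowing: consecutive delimiters produce empty tokens, so splitting "a  b" (two spaces) on ' ' yields "a", "" and "b". If that is not wanted, a variant that drops empty tokens might look like the sketch below. `splitSkipEmpty` is a hypothetical name introduced here, not part of the original answer.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of split(): same getline loop, but empty tokens
// (produced by consecutive delimiters) are discarded.
std::vector<std::string> splitSkipEmpty(const std::string& s, char delimiter)
{
    std::vector<std::string> tokens;
    std::string token;
    std::istringstream tokenStream(s);
    while (std::getline(tokenStream, token, delimiter))
    {
        if (!token.empty())
        {
            tokens.push_back(token);
        }
    }
    return tokens;
}
```

Because the delimiter is a parameter, the same function can be called as `splitSkipEmpty("a,b,,c", ',')` to split on commas.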

You can write it using the existing C++ libraries:

#include <string>
#include <vector>
#include <iterator>
#include <sstream>

int main()
{
    std::string testString("This is a test string");
    std::istringstream wordStream(testString);

    std::vector<std::string> result(std::istream_iterator<std::string>{wordStream},
                                    std::istream_iterator<std::string>{});
}
Martin York
Hikmat Farhat
  • You don't need to write the split function: `std::istringstream wordStream(testString); std::vector<std::string> result(std::istream_iterator<std::string>{wordStream}, std::istream_iterator<std::string>{});` – Martin York May 26 '20 at 21:18
  • Sure you can. I was trying to write code as close as possible to the original. Also, as far as I know, it takes a lot of acrobatics to make that work with a delimiter other than space. – Hikmat Farhat May 27 '20 at 12:19
1

A couple of issues:

  1. The second parameter of substr() is a length (not a position).

    // Here you are using `endWord`, which is a position in the string.
    // This only works when beginWord is 0;
    // for all other values you are passing an incorrect length.
    testWord = str.substr(beginWord, endWord); 
    
  2. The find() method starts searching at the position given by its second parameter.

    // If str[beginWord] contains one of the delimiter characters
    // Then it will return beginWord
    // i.e. you are not moving forward.
    endWord = str.find(wordDelim, beginWord);
    
    // So you end up stuck on the first space.
    
  3. Assuming you got the above fixed, you would be adding a space at the front of each word.

    // You need to actively search and remove the spaces
    // before reading the words.
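
Putting the three points together, one possible corrected loop could look like the sketch below. It uses only `<iostream>` and `<string>`. `nextWord` is a hypothetical helper introduced for illustration, and a single space is assumed to be the only delimiter.

```cpp
#include <iostream>
#include <string>

// Hypothetical helper (not from the question): returns the next word at or
// after `pos` and advances `pos` past it. Returns "" when no words remain.
std::string nextWord(const std::string& str, std::string::size_type& pos)
{
    const char wordDelim = ' ';  // assumption: words are separated by spaces only

    // Issue 3: skip any delimiters in front of the word.
    std::string::size_type beginWord = str.find_first_not_of(wordDelim, pos);
    if (beginWord == std::string::npos) {
        pos = str.length();  // nothing but delimiters was left
        return "";
    }

    // Issue 2: the search now starts on a non-delimiter character,
    // so find() cannot return beginWord itself.
    std::string::size_type endWord = str.find(wordDelim, beginWord);
    if (endWord == std::string::npos) {
        endWord = str.length();  // the last word runs to the end of the string
    }

    pos = endWord;
    // Issue 1: substr() takes a start position and a length.
    return str.substr(beginWord, endWord - beginWord);
}

// Prints each word of `str` on its own line.
void splitInWords(const std::string& str)
{
    std::string::size_type pos = 0;
    while (pos < str.length()) {
        const std::string word = nextWord(str, pos);
        if (!word.empty()) {
            std::cout << word << '\n';
        }
    }
}
```

Calling `splitInWords("This is a test string")` from `main` would then print the five words one per line. Note that the loop is driven by the position in the string rather than by a separate counter `i`.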
    

Nice things you could do:

Here:

void splitInWords(string str) {

You are passing the parameter by value. This means you are making a copy. A better technique would be to pass by const reference (you are not modifying the original or the copy).

void splitInWords(string const& str) {

An Alternative

You can use the stream functionality.

void split(std::istream& stream)
{
    std::string word;
    stream >> word;     // This drops leading whitespace,
                        // then reads characters into `word`
                        // until a whitespace character is
                        // found.
                        // Note: it empties `word` before adding any characters.
}
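
The function above reads only a single word. To split a whole line, the extraction can go in the loop condition, as in this sketch. The name `splitWords`, the `out` parameter, the returned count and the `std::string` overload are all assumptions added for illustration, and the overload needs `<sstream>`.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes each whitespace-separated word read from
// `stream` to `out`, one per line, and returns the number of words.
int splitWords(std::istream& stream, std::ostream& out)
{
    std::string word;
    int count = 0;
    // operator>> skips leading whitespace and stops at the next whitespace;
    // the loop ends when no further word can be extracted.
    while (stream >> word) {
        out << word << '\n';
        ++count;
    }
    return count;
}

// Convenience overload for a string already in memory.
int splitWords(const std::string& text, std::ostream& out)
{
    std::istringstream stream(text);
    return splitWords(stream, out);
}
```

For example, `splitWords(std::cin, std::cout)` would echo every word typed on standard input, and `splitWords("This is a test string", std::cout)` would print the five words of the test string.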
Martin York
  • As you have likely guessed from the nature of my question, I am a student in a class teaching coding with C++. So, the kind of response you provided is really helpful to me - doesn't give me the whole answer, but points out what my code, as written, is doing, and some possible paths forward. Being in a class, I am not allowed to include libraries we haven't covered yet, so in this problem I am limited to , , and . – ajt45v May 26 '20 at 15:20