2

Ok, so I'm working on a homework project in C++ and am running into an issue, and can't seem to find a way around it. The function is supposed to break an input string at user-defined delimiters and store the substrings in a vector to be accessed later. I think I got the basic parser figured out, but it doesn't want to split the last part of the input.

int main() {
    string input =  "comma-delim-delim&delim-delim";
    vector<string> result;
    vector<char> delims;
    delims.push_back('-');
    delims.push_back('&');
    int begin = 0;

    for (int i = begin; i < input.length(); i++ ){
       for(int j = 0; j < delims.size(); j++){
          if(input.at(i) == delims.at(j)){
              // Compares chars in the delim vector to the current char in the
              // string, and creates a substring from begin up to (but not
              // including) the current position, since that char is a delimiter.
              string subString = input.substr(begin, (i - begin));
              result.push_back(subString);
              begin = i + 1;
           }

The above code works fine for splitting the input string up until the last dash. Anything after that never gets saved as a substring and pushed into the result vector, because the loop doesn't run into another delimiter. So in an attempt to rectify the matter, I put together the following:

else if(input.at(i) == input.at(input.length())){
   string subString = input.substr(begin, (input.length() - begin));
   result.push_back(subString);
}

However, I keep getting out of bounds errors with the above portion. It seems to be having an issue with the boundaries for splitting the substring, and I can't figure out how to get around it. Any help?

CLange
  • I would recommend storing the index where the last split occurred and, at the end of your loop, if that value isn't the end of the string, push everything after the last split to your result. What you have now is somewhat dubious because it will consider anything that's the same char as the last char to match (aside from the out of bounds issue). – stephen.vakil Oct 04 '17 at 21:15
  • Have you tried adding a caboose? i.e. Append a delimiter to your string. – 2785528 Oct 04 '17 at 21:24
  • So you want to use two delimiters `-` and `&`? – Raindrop7 Oct 04 '17 at 21:25
  • @DOUGLASO.MOEN is absolutely right. Add something like `input.push_back(delims.at(0));` before your loop and everything should work perfectly (see it [here](https://ideone.com/AcCZ9u)). Conversely, you could also take the substring again after the loop using the last value of begin (but you'd have to be careful to check that you aren't already at the end of the string). – scohe001 Oct 04 '17 at 21:28
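
For reference, the "caboose" idea from the comments above could look roughly like this. It is only a sketch: `split_with_sentinel` is a made-up name, and it assumes you are fine with working on a copy of the input so that the caller's string is not modified.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the question): splits `input` at any char in
// `delims` by first appending a delimiter, so the final token is flushed too.
std::vector<std::string> split_with_sentinel(std::string input,
                                             const std::vector<char>& delims) {
    std::vector<std::string> result;
    if (delims.empty()) {             // nothing to split on: one token
        result.push_back(input);
        return result;
    }
    input.push_back(delims.front());  // the "caboose"; input is a by-value copy
    std::size_t begin = 0;
    for (std::size_t i = 0; i < input.length(); ++i) {
        for (char d : delims) {
            if (input[i] == d) {
                result.push_back(input.substr(begin, i - begin));
                begin = i + 1;
                break;                // one match per character is enough
            }
        }
    }
    return result;
}
```

One thing to be aware of with this approach: an input that already ends in a delimiter produces a trailing empty string, which may or may not be what the assignment wants.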

3 Answers

1

In your code you have to remember that indexing starts at 0, so .size() (or .length()) is always 1 more than your last valid index: an array of size 1 only has index [0]. That means input.at(input.length()) is always one place past the end, which is why at() throws an out-of-range error; input.at(input.length()-1) is the last element. Here is an approach that is working for me: after your loops, just grab the last piece of the string.

if(begin != input.length()){
    string subString = input.substr(begin,(input.length()-begin));
    result.push_back(subString);
}
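
Put together with the loop from the question, that could look something like the sketch below. The function name `split_keep_tail` is hypothetical, and I've switched the indices to `std::size_t` to avoid signed/unsigned comparisons; otherwise it is the same logic.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: the question's nested loop plus the post-loop check.
std::vector<std::string> split_keep_tail(const std::string& input,
                                         const std::vector<char>& delims) {
    std::vector<std::string> result;
    std::size_t begin = 0;
    for (std::size_t i = 0; i < input.length(); ++i) {
        for (char d : delims) {
            if (input.at(i) == d) {   // i < length(), so at() stays in bounds
                result.push_back(input.substr(begin, i - begin));
                begin = i + 1;
                break;
            }
        }
    }
    // Whatever follows the last delimiter never hit the branch above,
    // so push it here. substr(pos) with one argument runs to the end.
    if (begin != input.length()) {
        result.push_back(input.substr(begin));
    }
    return result;
}
```

Note that because of the `begin != input.length()` check, an input that ends in a delimiter does not get a trailing empty string.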
0

Working from the code in the question I've substituted iterators so that we can check for the end() of the input:

#include <string>
#include <vector>
using namespace std;

int main() {
    string input = "comma-delim-delim&delim-delim";
    vector<string> result;
    vector<char> delims;
    delims.push_back('-');
    delims.push_back('&');
    auto begin = input.begin(); // use iterator

    for(auto ii = input.begin(); ii <= input.end(); ii++){
        for(auto j : delims) {
            if(ii == input.end() || *ii == j){
                string subString(begin,ii); // can construct string from iterators, even if ii is at end
                result.push_back(subString);
                if(ii != input.end())
                    begin = ii + 1;
                else
                    goto done;
            }
        }
    }
done:
    return 0;
}
wally
  • Why have you decided to use [goto](https://xkcd.com/292/) instead of a simple `break`? – scohe001 Oct 04 '17 at 21:47
  • @scohe001 The `break` will only step out of the inner `for` loop and the outer loop actually goes into the `end()` which means that `ii++` will not be allowed. – wally Oct 04 '17 at 21:48
  • Ahh my bad, I saw goto and got tunnel vision. Wouldn't a flag be [better C++ practice](https://stackoverflow.com/questions/46586/goto-still-considered-harmful) though? – scohe001 Oct 04 '17 at 21:50
  • @scohe001 This will probably not survive a code review, but the fruit of the `goto` tree is so sweet... – wally Oct 04 '17 at 21:51
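
For anyone curious what the flag version discussed in these comments might look like, here is one possible sketch. It is not the answer's code: `split_with_flag` is a made-up name, and I've assumed a `while` loop so the iterator is only incremented when we are not already at `end()`.

```cpp
#include <string>
#include <vector>

// Hypothetical rewrite of the iterator approach using a flag instead of goto.
std::vector<std::string> split_with_flag(const std::string& input,
                                         const std::vector<char>& delims) {
    std::vector<std::string> result;
    auto begin = input.begin();
    auto ii = input.begin();
    bool done = false;
    while (!done) {
        const bool at_end = (ii == input.end());
        bool is_delim = false;
        if (!at_end) {                       // only dereference a valid iterator
            for (char d : delims) {
                if (*ii == d) { is_delim = true; break; }
            }
        }
        if (at_end || is_delim) {
            result.emplace_back(begin, ii);  // string built from iterator pair
            if (at_end) {
                done = true;                 // last token pushed, stop
            } else {
                begin = ii + 1;              // next token starts after delimiter
            }
        }
        if (!done) {
            ++ii;                            // never increment past end()
        }
    }
    return result;
}
```

Checking for the end before the delimiter loop also means the final token is still pushed when `delims` is empty.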
0

This program uses std::find_first_of to parse the multiple delimiters:

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

int main() {
    string input = "comma-delim-delim&delim-delim";
    vector<string> result;
    vector<char> delims;
    delims.push_back('-');
    delims.push_back('&');
    auto begin = input.begin(); // use iterator

    for(;;) {
        auto next = find_first_of(begin, input.end(), delims.begin(), delims.end());
        string subString(begin, next); // can construct string from iterators
        result.push_back(subString);
        if(next == input.end())
            break;
        begin = next + 1;
    }
}
wally
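
A related variant, in case it is useful: `std::string` also has a `find_first_of` member function that takes the delimiters as a string and works with indices instead of iterators. The sketch below assumes the delimiters are passed as a `std::string` such as `"-&"` rather than a `vector<char>`, and `split_any` is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper using the std::string::find_first_of member function.
std::vector<std::string> split_any(const std::string& input,
                                   const std::string& delims) {
    std::vector<std::string> result;
    std::size_t begin = 0;
    for (;;) {
        // Index of the next delimiter at or after begin, or npos if none.
        const std::size_t next = input.find_first_of(delims, begin);
        if (next == std::string::npos) {
            result.push_back(input.substr(begin));  // final token, may be empty
            break;
        }
        result.push_back(input.substr(begin, next - begin));
        begin = next + 1;
    }
    return result;
}
```

Calling `split_any(input, "-&")` on the string from the question should give the same five tokens as the iterator version.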