4

Given a string in C++ containing ranges and single numbers of the kind:

"2,3,4,7-9"

I want to parse it into a vector of the form:

2,3,4,7,8,9

If the numbers are separated by a - then I want to push all of the numbers in the range. Otherwise I want to push a single number.

I tried using this piece of code:

const char *NumX = "2,3,4-7";
std::vector<int> inputs;
std::istringstream in( NumX );
std::copy( std::istream_iterator<int>( in ), std::istream_iterator<int>(),
           std::back_inserter( inputs ) );

The problem was that it did not work for the ranges. It only took the numbers in the string, not all of the numbers in the range.

Jan Schultke
Efrat.shp
  • 1
    Split the string into the two numbers. Then iterate from start to end adding the number. – Andrew Truckle Aug 17 '20 at 07:23
  • 1
    you can use [find first of](http://www.cplusplus.com/reference/string/string/find_first_of/) to find the range and [iota](https://en.cppreference.com/w/cpp/algorithm/iota) to fill it – yaodav Aug 17 '20 at 07:47
  • 1
    I suggest 2 passes. First separate the string into blocks by searching for commas. Then parse each block for a hyphen – kesarling He-Him Aug 17 '20 at 07:49

4 Answers

6

Your problem consists of two separate problems:

  1. splitting the string into multiple strings at ,
  2. adding either numbers or ranges of numbers to a vector when parsing each string

If you first split the whole string at a comma, you won't have to worry about splitting it at a hyphen at the same time. This is what you would call a Divide-and-Conquer approach.

Splitting at ,

This question should tell you how you can split the string at a comma.
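As a minimal sketch of this first step, assuming std::getline with a custom delimiter is acceptable for your input, the splitting could look roughly like this. The helper name split is hypothetical and not taken from the linked question:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (the name is an assumption for illustration):
// splits `text` at every `delim` and returns the pieces in order.
std::vector<std::string> split(const std::string& text, char delim = ',') {
    std::vector<std::string> parts;
    std::istringstream stream(text);
    std::string part;
    // std::getline reads up to the next delimiter and discards the delimiter.
    while (std::getline(stream, part, delim)) {
        parts.push_back(part);
    }
    return parts;
}
```

Calling split("2,3,4,7-9") yields the four strings "2", "3", "4" and "7-9", each of which can then be parsed on its own.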

Parsing and Adding to std::vector<int>

Once you have split the string at each comma, you just need to turn ranges into individual numbers by calling this function for each string:

#include <cstddef>
#include <string>
#include <vector>

void push_range_or_number(const std::string &str, std::vector<int> &out) {
    std::size_t hyphen_index;
    // std::stoi stores the index of the first unparsed character in hyphen_index.
    int first = std::stoi(str, &hyphen_index);
    out.push_back(first);

    // If hyphen_index is equal to the length of the string,
    // there is no second number.
    // Otherwise, we parse the second number here:
    if (hyphen_index != str.size()) {
        int second = std::stoi(str.substr(hyphen_index + 1));
        for (int i = first + 1; i <= second; ++i) {
            out.push_back(i);
        }
    }
}

Note that splitting at a hyphen is much simpler because we know there can be at most one hyphen in the string. std::string::substr is the easiest way of doing it in this case. Be aware that std::stoi throws std::out_of_range if the integer is too large to fit into an int, and std::invalid_argument if the string does not start with a number.
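To put both steps together and deal with the exceptions from std::stoi, something along these lines might work. This is only a sketch: try_parse_ranges is a hypothetical wrapper, the range logic is repeated in a slightly condensed form so that the snippet compiles on its own, and trailing garbage after a range (such as "3-5abc") is not rejected:

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Condensed variant of the function above: pushes `first` alone,
// or every value from `first` to `last` for a token like "7-9".
void push_range_or_number(const std::string& str, std::vector<int>& out) {
    std::size_t hyphen_index = 0;
    const int first = std::stoi(str, &hyphen_index);
    int last = first;
    if (hyphen_index != str.size()) {
        last = std::stoi(str.substr(hyphen_index + 1));
    }
    for (int i = first; i <= last; ++i) {
        out.push_back(i);
    }
}

// Hypothetical wrapper: returns false and leaves `out` untouched on bad input.
bool try_parse_ranges(const std::string& text, std::vector<int>& out) {
    std::vector<int> result;
    std::istringstream stream(text);
    std::string token;
    try {
        while (std::getline(stream, token, ',')) {
            push_range_or_number(token, result);
        }
    } catch (const std::invalid_argument&) {
        return false;  // a token did not start with a number, e.g. "x" or ""
    } catch (const std::out_of_range&) {
        return false;  // a number did not fit into an int
    }
    out = std::move(result);
    return true;
}
```

Collecting into a temporary vector first means a failure halfway through the input does not leave the caller with a partially filled result.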

Jan Schultke
4

All very nice solutions so far. Using modern C++ and regex, you can do an all-in-one solution with only very few lines of code.

How? First, we define a regex that either matches an integer OR an integer range. It will look like this

((\d+)-(\d+))|(\d+)

Really very simple. First the range: some digits, followed by a hyphen and some more digits. Then the plain integer: some digits. All digits are put in capture groups (parentheses). The hyphen is not in a group of its own.

This is all so easy that no further explanation is needed.
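To make the group numbering concrete, here is a small sketch of how one match could be interpreted. The Bounds struct and the bounds_of helper are hypothetical names used only for illustration; the sketch assumes that group 1 participates in the match only when a range was found, and that group 4 holds the plain integer otherwise:

```cpp
#include <regex>
#include <string>

// Same pattern as above: group 1 = whole range, 2 = start, 3 = end, 4 = plain integer.
const std::regex range_or_number{ R"(((\d+)-(\d+))|(\d+))" };

// Hypothetical result type for illustration.
struct Bounds {
    int first;
    int last;
};

// Hypothetical helper: turns one successful match into inclusive bounds.
Bounds bounds_of(const std::smatch& sm) {
    if (sm[1].matched) {
        // Range alternative matched, so groups 2 and 3 hold the two numbers.
        return { std::stoi(sm[2].str()), std::stoi(sm[3].str()) };
    }
    // Plain integer alternative matched, so only group 4 is set.
    const int value = std::stoi(sm[4].str());
    return { value, value };
}
```

For the input "7-9" this gives the bounds 7 and 9, and for "4" it gives 4 and 4.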

Then we call std::regex_search in a loop, until all matches are found.

For each match, we check whether the range group matched. If it did, we add all values between the two sub-matches (inclusive) to the resulting std::vector.

If we have just a plain integer, then we add only this value.

All this gives a very simple and easy to understand program:

#include <iostream>
#include <string>
#include <vector>
#include <regex>

const std::string test{ "2,3,4,7-9" };

const std::regex re{ R"(((\d+)-(\d+))|(\d+))" };
std::smatch sm{};

int main() {
    // Here we will store the resulting data
    std::vector<int> data{};

    // Search all occurrences of integers OR ranges
    for (std::string s{ test }; std::regex_search(s, sm, re); s = sm.suffix()) {

        // We found something. Was it a range?
        if (sm[1].str().length())

            // Yes, range, add all values within to the vector  
            for (int i{ std::stoi(sm[2]) }; i <= std::stoi(sm[3]); ++i) data.push_back(i);
        else
            // No, no range, just a plain integer value. Add it to the vector
            data.push_back(std::stoi(sm[0]));
    }
    // Show result
    for (const int i : data) std::cout << i << '\n';
}

If you should have more questions, I am happy to answer.


Language: C++ 17 Compiled and tested with MS Visual Studio 19 Community Edition

A M
3

Apart from @J. Schultke's excellent example, I suggest the use of regexes in the following way:

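#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Expands a block such as "5-6," into 5, 6 and appends the values to num_vec.
void process(std::string str, std::vector<int>& num_vec) {
    str.pop_back();  // drop the trailing ',' or '#' delimiter
    const std::size_t hyphen = str.find('-');
    // Parse both bounds with std::stoi so that multi-digit numbers work too.
    const int first = std::stoi(str.substr(0, hyphen));
    const int last = std::stoi(str.substr(hyphen + 1));
    for (int i = first; i <= last; i++) {
        num_vec.push_back(i);
    }
}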

int main() {
    std::string str("1,2,3,5-6,7,8");
    str += "#";
    // A raw string literal avoids the invalid escape sequences \, and \#.
    std::regex vec_of_blocks(R"(.*?,|.*?#)");
    auto blocks_begin = std::sregex_iterator(str.begin(), str.end(), vec_of_blocks);
    auto blocks_end = std::sregex_iterator();
    std::vector<int> vec_of_numbers;
    for (std::sregex_iterator regex_it = blocks_begin; regex_it != blocks_end; regex_it++) {
        std::smatch match = *regex_it;
        std::string block = match.str();
        if (std::find(block.begin(), block.end(), '-') != block.end()) {
            process(block, vec_of_numbers);
        }
        else {
            vec_of_numbers.push_back(std::atoi(block.c_str()));
        }
    }
    return 0;
}

Of course, you still need a bit of validation; however, this will get you started.
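One possible form of that validation, sketched under the assumption that only non-negative integers and a-b ranges separated by single commas should be accepted, is a whole-string check with std::regex_match before any parsing starts. The function name is_valid_range_list is hypothetical:

```cpp
#include <regex>
#include <string>

// Hypothetical validator: accepts input such as "1,2,3,5-6,7,8" and rejects
// empty items, dangling hyphens, and anything that is not a digit, ',' or '-'.
bool is_valid_range_list(const std::string& text) {
    // One item is \d+ optionally followed by -\d+; further items follow a comma.
    static const std::regex pattern{ R"(\d+(-\d+)?(,\d+(-\d+)?)*)" };
    // std::regex_match succeeds only if the entire string matches the pattern.
    return std::regex_match(text, pattern);
}
```

This check does not verify that the numbers fit into an int or that the start of a range is not larger than its end, so those cases would still need separate handling.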

kesarling He-Him
  • Hm, there are always many different solutions possible. The usage of a regex is an excellent idea. However, I would suggest a simpler implementation. Maybe see my answer below. But as said. There are many possibilities. . . . – A M Aug 17 '20 at 20:22
0

Consider pre-processing your number string to split it. In the following code, std::transform() converts each of the delimiters , - and + into a space so that std::istream_iterator<int> can parse the integers successfully.

#include <cstdlib>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>
#include <iostream>
#include <sstream>

int main(void)
{
    std::string nums = "2,3,4-7,9+10";
    const std::string delim_to_convert = ",-+";  // , - and +
    // Replace every delimiter with a space so that operator>> can skip it.
    std::transform(nums.cbegin(), nums.cend(), nums.begin(),
            [&delim_to_convert](char ch) {return (delim_to_convert.find(ch) != std::string::npos) ? ' ' : ch; });

    std::istringstream ss(nums);
    auto inputs = std::vector<int>(std::istream_iterator<int>(ss), {});

    exit(EXIT_SUCCESS);
}

Note that the code above can only split at single-character delimiters. You should refer to @d4rk4ng31's answer if you need more complex and longer delimiters.
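If the values between the two ends of a range such as 4-7 should also end up in the vector, the stream can be read directly instead of replacing the delimiters. The following is only a sketch of that variation, not the code above; parse_with_stream is a hypothetical name, and malformed input simply stops the parsing early:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads "2,3,4-7" style input straight from a stream.
std::vector<int> parse_with_stream(const std::string& text) {
    std::vector<int> inputs;
    std::istringstream in(text);
    int first = 0;
    while (in >> first) {
        int last = first;
        // A '-' directly after a number means a second number follows.
        if (in.peek() == '-') {
            in.ignore();  // skip the hyphen
            if (!(in >> last)) {
                break;  // dangling hyphen: stop parsing
            }
        }
        for (int i = first; i <= last; ++i) {
            inputs.push_back(i);
        }
        // Skip the comma that separates this item from the next one.
        if (in.peek() == ',') {
            in.ignore();
        }
    }
    return inputs;
}
```

The hyphen is consumed explicitly because operator>> would otherwise read "-7" as a negative number.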

John Park
  • You misunderstood the question. This doesn't parse ranges and would just fill the vector with `2 3 4 7 9 10`. Also you forgot a `std::` before `string` and you need to include `<iterator>` for `std::istream_iterator`. – Jan Schultke Aug 17 '20 at 09:34
  • @J.Schultke, Your comment pointed out my mistake. You're right, 4-7 should be inserted into the vector 4, 5, 6, 7. Missing headers and omitting std:: came from pasting my codes. – John Park Aug 17 '20 at 09:39