-5

I am working on a algorithm where I am trying the following output:

Given values/Inputs:

char *Var = "1-5,10,12,15-16,25-35,67,69,99-105"; int size = 29;

Here "1-5" depicts a range value, i.e. it will be understood as "1,2,3,4,5" while the values with just "," are individual values.

I was writing an algorithm where end output should be such that it will give complete range of output as:

int list[]=1,2,3,4,5,10,12,15,16,25,26,27,28,29,30,31,32,33,34,35,67,69,99,100,101,102,103,104,105;

If anyone is familiar with this issue then the help would be really appreciated. Thanks in advance!

My initial code approach was as:

if(NULL != strchr((char *)grp_range, '-'))
{
    int_u8 delims[] = "-";
    result = (int_u8 *)strtok((char *)grp_range, (char *)delims);

    if(NULL != result)
    {
        start_index = strtol((char*)result, (char **)&end_ptr, 10);
        result = (int_u8 *)strtok(NULL, (char *)delims);
    }

    while(NULL != result)
    {
        end_index = strtol((char*)result, (char**)&end_ptr, 10);
        result = (int_u8 *)strtok(NULL, (char *)delims);
    }

    while(start_index <= end_index)
    {
        grp_list[i++] = start_index;
        start_index++;
    }
}

else if(NULL != strchr((char *)grp_range, ','))
{
    int_u8 delims[] = ",";
    result = (unison_u8 *)strtok((char *)grp_range, (char *)delims);

    while(result != NULL)
    {
        grp_list[i++] = strtol((char*)result, (char**)&end_ptr, 10);
        result = (int_u8 *)strtok(NULL, (char *)delims);
    }
}

But it only works if I have either "0-5" or "0,10,15". I am looking forward to make it more versatile.

Amrit
  • 2,295
  • 4
  • 25
  • 42

7 Answers7

3

Here is a C++ solution for you to study.

#include <vector>
#include <string>
#include <sstream>
#include <iostream>
using namespace std;

int ConvertString2Int(const string& str)
{
    stringstream ss(str);
    int x;
    if (! (ss >> x))
    {
        cerr << "Error converting " << str << " to integer" << endl;
        abort();
    }
    return x;
}

vector<string> SplitStringToArray(const string& str, char splitter)
{
    vector<string> tokens;
    stringstream ss(str);
    string temp;
    while (getline(ss, temp, splitter)) // split into new "lines" based on character
    {
        tokens.push_back(temp);
    }
    return tokens;
}

vector<int> ParseData(const string& data)
{
    vector<string> tokens = SplitStringToArray(data, ',');

    vector<int> result;
    for (vector<string>::const_iterator it = tokens.begin(), end_it = tokens.end(); it != end_it; ++it)
    {
        const string& token = *it;
        vector<string> range = SplitStringToArray(token, '-');
        if (range.size() == 1)
        {
            result.push_back(ConvertString2Int(range[0]));
        }
        else if (range.size() == 2)
        {
            int start = ConvertString2Int(range[0]);
            int stop = ConvertString2Int(range[1]);
            for (int i = start; i <= stop; i++)
            {
                result.push_back(i);
            }
        }
        else
        {
            cerr << "Error parsing token " << token << endl;
            abort();
        }
    }

    return result;
}

int main()
{
    vector<int> result = ParseData("1-5,10,12,15-16,25-35,67,69,99-105");
    for (vector<int>::const_iterator it = result.begin(), end_it = result.end(); it != end_it; ++it)
    {
        cout << *it << " ";
    }
    cout << endl;
}

Live example

http://ideone.com/2W99Tt

Neil Kirk
  • 21,327
  • 9
  • 53
  • 91
  • Wow! well done mate. Thanks, let me convert it to C and check. – Amrit Sep 12 '13 at 12:42
  • 2
    @LetsCode Why would you want to convert it to C? Your question was tagged C++ – Neil Kirk Sep 12 '13 at 12:42
  • 1
    @LetsCode Well have fun with that. If you try converting that you’ll see that C and C++ are nothing alike. – Konrad Rudolph Sep 12 '13 at 13:00
  • That's a lot of lines of code for a very simple problem (but the basic approach is good). – James Kanze Sep 12 '13 at 13:14
  • @JamesKanze Some of the code may be reused as helpful utility functions. But how would you make it shorter? – Neil Kirk Sep 12 '13 at 13:35
  • 1
    @NeilKirk By keeping everything in a single `std::istringstream`, and passing a reference to that around, for starters. (Or inversely, by just using `std::string::const_iterator` pairs everywhere, and only converting to `std::istringstream` for the numerical conversion.) By treating the single number and range forms identically. But part of what gave me the impression that there were a lot of lines was the fact that you actually gave a complete program, with includes and everything. It's actually a pretty good solution. – James Kanze Sep 12 '13 at 15:01
  • @NeilKirk And now for the nits: if one of the fields is `"1abc"`, you won't generate an error. In `ConvertStringToInt`, you need to add a `>> std::ws` and verify that after than, `ss.get()` returns end of file. (And you really should be using `std::istringstream`, and not `std::stringstream`.) – James Kanze Sep 12 '13 at 15:03
  • @NeilKirk And an even nittier nit: Your naming convention is inconsistent: `ConvertString2Int`, but `SplitStringToArray`. (Of course, personally, I don't like repeating information that is directly visible in the signature. `convertToInt`, or simply `toInt`, and `split` are fine. The longer forms that you use are a holdover from C, where you needed different names for different types.) – James Kanze Sep 12 '13 at 15:06
  • @KonradRudolph Yes, you're right. I must've mistaken Cpp & C similarity. Both are quite different. – Amrit Sep 13 '13 at 05:34
  • @NeilKirk Thanks for your effort. Appreciate it! – Amrit Sep 13 '13 at 05:35
3

This is my boost approach :

This won't give you array of ints, instead a vector of ints

Algorithm used: (nothing new)

  • Split string using ,

  • Split the individual string using -

  • Make a range low and high

  • Push it into vector with help of this range

Code:-

#include<iostream>
#include<vector>
#include <boost/algorithm/string.hpp>
#include <boost/lexical_cast.hpp>


int main(){

    std::string line("1-5,10,12,15-16,25-35,67,69,99-105");
    std::vector<std::string> strs,r;
    std::vector<int> v;
    int low,high,i;
    boost::split(strs,line,boost::is_any_of(","));

for (auto it:strs)
{
    boost::split(r,it,boost::is_any_of("-"));

    auto x = r.begin();
    low = high =boost::lexical_cast<int>(r[0]);
    x++;
    if(x!=r.end())
        high = boost::lexical_cast<int>(r[1]);
    for(i=low;i<=high;++i)
      v.push_back(i);
}

for(auto x:v)
  std::cout<<x<<" ";

    return 0;

}
P0W
  • 46,614
  • 9
  • 72
  • 119
  • 1
    That's probably the solution I'd actually use. But then, I've had the equivalent of `boost::split` in my tool kit for ages (long before there was Boost). It makes the problem too simple. – James Kanze Sep 12 '13 at 13:19
  • Beautiful job with my issue! Thanks for your effort. Appreciate it! – Amrit Sep 13 '13 at 05:33
2

Don't search. Just go through the text one character at a time. As long as you're seeing digits, accumulate them into a value. If the digits are followed by a - then you're looking at a range, and need to parse the next set of digits to get the upper bound of the range and put all the values into your list. If the value is not followed by a - then you've got a single value; put it into your list.

Pete Becker
  • 74,985
  • 8
  • 76
  • 165
  • Nice thought! That's what I need. I will try it and let you know. – Amrit Sep 12 '13 at 12:33
  • That's actually an interesting solution. Once you've got the first number, if the next character is a `-`, skip it and collect the second. If it's not, make the range using the first number as both start and end point. Either way, the higher level logic only sees a list of ranges. – James Kanze Sep 12 '13 at 12:39
  • Yes. This is a "stack-based" approach, where you only ever have to have one number on the stack -- the last whole number read. The next character defines what happens: `,` = push onto list, `-` = read next, push list, binary 0 = end. – Jongware Sep 12 '13 at 12:39
  • I really liked your idea. It gave me inspiration to look at the problem in a simple way. Thanks for your effort. Appreciate it! – Amrit Sep 13 '13 at 05:35
2

You're issue seems to be misunderstanding how strtok works. Have a look at this.

#include <string.h>
#include <stdio.h>

int main()
{
        int i, j;
        char delims[] = " ,";
        char str[] = "1-5,6,7";
        char *tok;
        char tmp[256];
        int rstart, rend;

        tok = strtok(str, delims);

        while(tok != NULL) {
                for(i = 0; i < strlen(tok); ++i) {
                        //// range
                        if(i != 0 && tok[i] == '-') {
                                strncpy(tmp, tok, i); 
                                rstart = atoi(tmp);
                                strcpy(tmp, tok + i + 1); 
                                rend = atoi(tmp);
                                for(j = rstart; j <= rend; ++j)
                                        printf("%d\n", j); 
                                i = strlen(tok) + 1;
                        }   
                        else if(strchr(tok, '-') == NULL)
                                printf("%s\n", tok);
                }   

                tok = strtok(NULL, delims);
        }   

        return 0;
}
csnate
  • 1,601
  • 4
  • 19
  • 31
  • 1
    This exact example will crash, of course. `strtok`: just say no. – James Kanze Sep 12 '13 at 13:14
  • 1
    Then your compiler is doing something unusual. (I know that Sun CC does, for various historical reasons.) The type of `"1-5,6,7"` is `char const[]`, and most compilers will put it in write protected memory. – James Kanze Sep 12 '13 at 17:25
  • You were right, I had to change it from char *var to char var[]. See my above edit, when I originally tried I didn't seem to get a segfault. Once I wrote the full solution it failed. – csnate Sep 12 '13 at 17:42
2

Stop and think about it: what you actually have is a comma separated list of ranges, where a range can be either a single number, or a pair of numbers separated by a '-'. So you probably want to loop over the ranges, using recursive descent for the parsing. (This sort of thing is best handled by an istream, so that's what I'll use.)

std::vector<int> results;
std::istringstream parser( std::string( var ) );
processRange( results, parser );
while ( isSeparator( parser, ',' ) ) {
    processRange( results, parser );
}

with:

bool
isSeparator( std::istream& source, char separ )
{
    char next;
    source >> next;
    if ( source && next != separ ) {
        source.putback( next );
    }
    return source && next == separ;
}

and

void
processRange( std::vector<int>& results, std::istream& source )
{
    int first = 0;
    source >> first;
    int last = first;
    if ( isSeparator( source, '-' ) ) {
        source >> last;
    }
    if ( last < first ) {
        source.setstate( std::ios_base::failbit );
    } 
    if ( source ) {
        while ( first != last ) {
            results.push_back( first );
            ++ first;
        }
        results.push_back( first );
    }
}

The isSeparator function will, in fact, probably be useful in other projects in the future, and should be kept in your toolbox.

James Kanze
  • 150,581
  • 18
  • 184
  • 329
0

One approach:

You need a parser that identifies 3 kinds of tokens: ',', '-', and numbers. That raises the level of abstraction so that you are operating at a level above characters.

Then you can parse your token stream to create a list of ranges and constants.

Then you can parse that list to convert the ranges into constants.

Some code that does part of the job:

#include <stdio.h>

// Prints a comma after the last digit. You will need to fix that up.
void print(int a, int b) {
  for (int i = a; i <= b; ++i) {
    printf("%d, ", i);
  }
}

int main() {
  enum { DASH, COMMA, NUMBER };
  struct token {
    int type;
    int value;
  };

  // Sample input stream. Notice the sentinel comma at the end.
  // 1-5,10,
  struct token tokStream[] = {
    { NUMBER, 1 },
    { DASH, 0 },
    { NUMBER, 5 },
    { COMMA, 0 },
    { NUMBER, 10 },
    { COMMA, 0 } };

  // This parser assumes well formed input. You have to add all the error
  // checking yourself.
  size_t i = 0;
  while (i < sizeof(tokStream)/sizeof(struct token)) {
    if (tokStream[i+1].type == COMMA) {
      print(tokStream[i].value, tokStream[i].value);
      i += 2;  // skip to next number
    }
    else { // DASH
      print(tokStream[i].value, tokStream[i+2].value);
      i += 4;  // skip to next number
    }
  }

  return 0;
}
Adam Burry
  • 1,904
  • 13
  • 20
  • It's true that one could probably write a solution using lex and yacc in about 5 minutes. It would probably be the simplest to understand and maintain, as well. – James Kanze Sep 12 '13 at 12:36
  • Thanks for your reply. Your solution works but I will have to give input in a struct which is not what i am looking for. Any thanks for your effort. Appreciate it! – Amrit Sep 13 '13 at 05:32
0

First divide whole string into numbers and ranges (using strtok() with "," delimiter), save strings in array, then, search through array looking for "-", if it present than use sscanf() with "%d-%d" format, else use sscanf with single "%d" format.

Function usage is easily googling.

Oleg Olivson
  • 489
  • 2
  • 5
  • Do _not_ use `strtok`. There are better ways, always. In this case, in C, `strchr` would be a good choice. An even better choice, of course, would be to use `std::string`, and the functions in the standard library. – James Kanze Sep 12 '13 at 12:34
  • I understand that strtok() breaks string, but it's a raw solution, without any improvements and good tools. Meanwhile better solution has already written. – Oleg Olivson Sep 12 '13 at 12:43