I am using boost::tokenizer to get ';'-separated fields from a string. I am able to retrieve the fields as shown in the code below, but I have two questions:

  1. Does tokenizer provide a function that returns the number of tokens in a string for a given separator?
  2. Suppose the test string has 3 fields, a;b;c. The following code prints all of them, but I need the empty fields too. E.g. for the string a;;;b;c, the 2nd and 3rd tokens should be empty.
#include <boost/tokenizer.hpp>
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string data="a;;;;b;c";
    // Split on ';' only; by default empty tokens are dropped.
    boost::char_separator<char> obj(";");
    boost::tokenizer<boost::char_separator<char> > tokens(data,obj);
    // tokenizer has no countTokens() member; this is what question 1 asks for:
    // cout<<endl<<tokens.countTokens();
    for(boost::tokenizer<boost::char_separator<char> >::iterator it=tokens.begin();
    it!=tokens.end();
    ++it)
    {
        cout<<*it<<endl;
    }
}
Some programmer dude
anurag86
  • 2
    Your second question is answered here: http://stackoverflow.com/questions/22331648/boosttokenizer-point-seperated-but-also-keeping-empty-fields – m.s. Oct 29 '15 at 12:30

1 Answer

1) You can count the tokens by taking the distance between begin() and end().

const size_t count = std::distance(tokens.begin(), tokens.end());

2) Construct the separator with boost::keep_empty_tokens so that empty fields are kept.

boost::char_separator<char> obj(";", "", boost::keep_empty_tokens);

ForEveR
  • Thanks, it worked. I just read that _space_ is considered a separator by default. So `a;b; c;d` would fetch only a and b because it encounters two spaces after `b;`. Is there any way I can stop the tokenizer from treating spaces as token separators? – anurag86 Oct 29 '15 at 12:53
  • I set the 2nd parameter to " ", but it is still not working. I did it this way: `boost::char_separator<char> obj(";"," ",boost::keep_empty_tokens);` – anurag86 Oct 29 '15 at 13:08
  • @anurag86 look, in my code there is no space, just an empty string. – ForEveR Oct 29 '15 at 13:17
  • I mean I got the answers to the 2 questions I asked. What I am asking now is this: my string contains `;`-separated fields, and if the fields themselves contain spaces, the tokenizer treats _space_ as one of the separators too. I don't want the tokenizer to treat _space_ as a separator. How can I do that? – anurag86 Oct 29 '15 at 13:43
  • @anurag86 if you want to ignore spaces, just construct the separator with "; " as the first argument. – ForEveR Oct 29 '15 at 13:56