2

I'm currently trying to split up a text file into a vector of strings whenever a newline is encountered. Previously I have used boost tokenizer to do this with other delimiter characters but when I use the newline '\n' it throws an exception at runtime:

terminate called after throwing an instance of 'boost::escaped_list_error'
  what():  unknown escape sequence
Aborted

Here's the code:

std::vector<std::string> parse_lines(const std::string& input_str){
    using namespace boost;
    std::vector<std::string> parsed;
    tokenizer<escaped_list_separator<char> > tk(input_str, escaped_list_separator<char>('\n'));
    for (tokenizer<escaped_list_separator<char> >::iterator i(tk.begin());
                i != tk.end(); ++i) 
    {
       parsed.push_back(*i);
    }
    return parsed;
}

Any advice greatly appreciated!

shuttle87
  • @AJG85 this produces the following error: `textPanel.cpp:33: error: invalid conversion from ‘const char*’ to ‘char’ textPanel.cpp:33: error: initializing argument 1 of ‘boost::escaped_list_separator<Char, Traits>::escaped_list_separator(Char, Char, Char) [with Char = char, Traits = std::char_traits<char>]’` – shuttle87 Apr 22 '11 at 15:34
  • Right, it's been a while since I used `boost::tokenizer`. Let me dust off the telecom socket server and see how I did it. – AJG85 Apr 22 '11 at 15:38

3 Answers

4

escaped_list_separator's constructor expects the escape character, then the delimiter character, then the quote character. By passing a newline as the only argument, you make it the escape character, so the tokenizer treats the character that follows each newline as part of an escape sequence and throws when that sequence is not one it recognizes. Try this instead:

escaped_list_separator<char>('\\', '\n')

http://www.boost.org/doc/libs/1_46_1/libs/tokenizer/escaped_list_separator.htm

Null Set
3

Given that the separator you want is already supported directly by the standard library, I think I'd skip the tokenizer for this entirely and use std::getline:

std::vector<std::string> parse_lines(std::string const &input_string) { 
    std::istringstream buffer(input_string);
    std::vector<std::string> ret;
    std::string line;

    while (std::getline(buffer, line))
        ret.push_back(line);
    return ret;
}

Once you treat the string as a stream and read lines from it, you have quite a few options for the details. For example, you might want to use the line proxy and/or LineInputIterator classes that @UncleBens and I posted in response to a previous question.
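For readers who have not seen the line proxy idea, here is a rough sketch of how it is commonly written. It is not necessarily the code from that earlier answer: the class name `line` and the function name `parse_lines_iter` are picked here for illustration. The proxy's `operator>>` reads a whole line, so `std::istream_iterator<line>` walks the stream line by line.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Illustrative proxy type: extracting a `line` from a stream reads one
// whole line via std::getline instead of one whitespace-delimited word.
class line {
    std::string data;
public:
    friend std::istream& operator>>(std::istream& is, line& l) {
        return std::getline(is, l.data);
    }
    // Lets a `line` be used wherever a std::string is expected.
    operator std::string() const { return data; }
};

// Illustrative helper: builds the vector directly from a pair of
// stream iterators, with no explicit loop.
std::vector<std::string> parse_lines_iter(const std::string& input) {
    std::istringstream buffer(input);
    return std::vector<std::string>(std::istream_iterator<line>(buffer),
                                    std::istream_iterator<line>());
}
```

Like the getline loop, this keeps blank lines in the middle of the input and does not produce an extra empty entry for a trailing newline.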

Jerry Coffin
1

This might work better.

boost::char_separator<char> sep("\n");
boost::tokenizer<boost::char_separator<char>> tokens(text, sep);

Edit: Alternatively, you can use std::find and write your own splitter loop.
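A minimal sketch of such a std::find loop, using only the standard library, might look like this. The function name `split_lines_find` is made up for the example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative hand-rolled splitter: walks the string with std::find,
// cutting a new line at every '\n'.
std::vector<std::string> split_lines_find(const std::string& text) {
    std::vector<std::string> lines;
    std::string::const_iterator start = text.begin();

    while (start != text.end()) {
        // Locate the end of the current line (or the end of the text).
        std::string::const_iterator stop =
            std::find(start, text.end(), '\n');
        lines.push_back(std::string(start, stop));

        if (stop == text.end())
            break;          // last line had no trailing newline
        start = stop + 1;   // skip past the '\n'
    }
    return lines;
}
```

This version keeps blank lines and, like std::getline, does not add an empty entry when the text ends with a newline.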

AJG85