1

My input file is like this:

C:\Users\DeadCoder\AppData\Local\CoCreate

I am building a tree and need to extract the directory names from each line of the input file, using \ as the delimiter. In the example above, I need to extract C:, Users, DeadCoder, AppData and so on separately. I hope the question is clear. Here are the options I have looked at so far.

1- istringstream works perfectly fine for whitespace but not for \.

2- strtok() works on char arrays, so I would have to convert my string to a char array, and I would rather not do that.

3- Boost Tokenizer: this one looks interesting, but I have no experience with it beyond a quick search just now. I copied the following code:

#include <boost/foreach.hpp>
#include <boost/tokenizer.hpp>
using namespace boost;

int main(){

    string tempStr;
    ifstream fin;
    fin.open("input.txt");
    int i=0;

    while (!fin.eof()){
        getline(fin,tempStr);
        char_separator<char> sep("\"); // error: missing terminating " character
        tokenizer<char_separator<char>> tokens(tempStr, sep);
        for (const auto& t : tokens) {
            cout << t << "." << endl;
        }
    }
}

Now this gives the error "error: boost/foreach.hpp: No such file or directory". Can someone help me here? And is there a better way to read the input file with \ as the delimiter? Please don't use extensive code like a tokenizer class, as I am still learning C++.

EDIT: I didn't have the Boost library installed, which is why I was getting this error. I would appreciate it if someone could explain a better way to tokenize a string without installing a third-party library.

Best; DeadCoder.

DeadCoder
  • Have you installed the `Boost` library? – Mahesh Feb 22 '13 at 17:06
  • Nope. Isn't it there by default? – DeadCoder Feb 22 '13 at 17:06
  • It's a third party library. Download it from http://www.boost.org/ – Mahesh Feb 22 '13 at 17:07
  • Okay, fine. Is there any other way to do this task, one where I don't have to add a third-party library? – DeadCoder Feb 22 '13 at 17:08
  • There is a way: you can use the c_str() function on the string. But you want to avoid it, I don't know why. – Arpit Feb 22 '13 at 17:11
  • Yes. Read the line from the file. Tokenize the string based on the delimiter. Store the tokens in a std::vector. For reference see this thread: http://stackoverflow.com/questions/53849/how-do-i-tokenize-a-string-in-c – Mahesh Feb 22 '13 at 17:11
  • @Arpit This would change my string to char, right? And then I can tokenize it easily? – DeadCoder Feb 22 '13 at 17:12
  • Yep, it would. http://www.cplusplus.com/reference/string/string/c_str/ – Arpit Feb 22 '13 at 17:13
  • Getting in here because I think I see where this is going. You **cannot** use `strtok()` on the result of `c_str()`. `c_str()` returns a `const char *` (meaning you aren't allowed to change it at all), and `strtok()` manipulates the string in place. – BoBTFish Feb 22 '13 at 17:19
  • @BoBTFish I have not tried it so can't say but \\ delimiter does work. – DeadCoder Feb 22 '13 at 17:27

3 Answers

3

In C++ (and other languages based on C), the \ character in a string or character literal is the escape character: it changes the meaning of the character that follows it in the literal. This is what lets you put, for example, a " inside a string at all. To have a \ inside a string literal, you need to escape the backslash itself by writing two of them: "\\".

You can read more about the valid escape sequences in C++ e.g. in this reference.
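
As a small illustration (not part of the original answer), the sketch below shows that the doubling only exists in the source code. The function names `single_backslash` and `example_path` are made up for this example. Text read from a file such as input.txt already contains single backslashes and needs no escaping.

```cpp
#include <string>

// "\\" in source code is a string holding exactly one backslash character.
std::string single_backslash() {
    return "\\";
}

// Part of the path from the question written as a string literal: every
// backslash is doubled in the source, but the string itself holds single ones.
std::string example_path() {
    return "C:\\Users\\DeadCoder";
}
```

Here `single_backslash().size()` is 1, and `example_path()` holds the 18 characters C:\Users\DeadCoder.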


As for the problem with Boost, you need to tell the compiler where you installed it. This is done in the project properties of your IDE.


If you want to tokenize without using a third-party library such as Boost, there are a couple of ways. One is to use std::istringstream and std::getline with '\\' as the delimiter. Another is to use the find and substr functions of the standard string class.
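
A minimal sketch of the std::istringstream and std::getline approach might look like this. The helper name `split` is hypothetical; the only standard-library feature it relies on is the three-argument form of std::getline, which accepts a delimiter character.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `line` at every occurrence of `delim`.
std::vector<std::string> split(const std::string& line, char delim) {
    std::vector<std::string> parts;
    std::istringstream stream(line);
    std::string part;
    // The third argument of std::getline replaces the default '\n' delimiter.
    while (std::getline(stream, part, delim)) {
        parts.push_back(part);
    }
    return parts;
}
```

Calling `split(line, '\\')` on the line from the question should give the six parts C:, Users, DeadCoder, AppData, Local and CoCreate. With this approach a trailing backslash does not produce an empty last element.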

Some programmer dude
  • Okay it worked but I really need to know why I have to use two \\ . Because you are not making sense to me in the above statement. – DeadCoder Feb 22 '13 at 17:17
  • @DeadCoder You know how to write a literal newline inside a string? You use `"\n"` right? This is the same as using `"\\"` to mean a single backslash. A backslash inside a character or string _literal_ is telling the compiler that the next character is a special character that means something else. – Some programmer dude Feb 22 '13 at 17:21
  • @DeadCoder You have to use two \\ because that's the way the language is defined. – James Kanze Feb 22 '13 at 17:21
  • @JamesKanze and Joachmin ...Thanks a lot. – DeadCoder Feb 22 '13 at 17:23
2

Any sort of generalized tokenizer here would be overkill. Just use std::find( s.begin(), s.end(), '\\' ) to find each separator, and the two-iterator constructor of std::string to copy each field into a separate string. (The backslash is doubled in the character literal because the compiler treats the first \ as an escape character.) Something like:

// s is the std::string holding one line of input; needs <algorithm>, <string>, <vector>
std::vector<std::string> fields;
std::string::const_iterator end = s.end();
std::string::const_iterator current = s.begin();
std::string::const_iterator next
        = std::find( current, end, '\\' );
while ( next != end ) {
    fields.push_back( std::string( current, next ) );
    current = next + 1;   // skip the separator itself
    next = std::find( current, end, '\\' );
}
fields.push_back( std::string( current, next ) );   // last field, after the final separator

should do the trick.
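
For anyone who wants to try this loop directly, here is the same idea wrapped in a function with the headers it needs. This is only a sketch: the name `split_fields` is hypothetical and not part of the original answer.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical wrapper around the std::find loop: split `s` at each backslash.
std::vector<std::string> split_fields(const std::string& s) {
    std::vector<std::string> fields;
    std::string::const_iterator end = s.end();
    std::string::const_iterator current = s.begin();
    std::string::const_iterator next = std::find(current, end, '\\');
    while (next != end) {
        fields.push_back(std::string(current, next));
        current = next + 1;  // step over the separator
        next = std::find(current, end, '\\');
    }
    fields.push_back(std::string(current, next));  // whatever follows the last separator
    return fields;
}
```

Note that this loop keeps empty fields: two consecutive backslashes, or a backslash at the very end of the line, each produce an empty string in the result.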

James Kanze
1
char_separator<char> sep("\") 
                          ^^^ You need to escape the \ . use "\\" 

The \ introduces an escape sequence, so to get a literal backslash you have to escape it with another backslash.

Use this: char_separator<char> sep("\\");

To install the boost lib: Install Boost

Other choice:

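getline(fin,tempStr);
// strtok() modifies its input, so copy the string into a writable buffer first
char *cstr=new char[tempStr.length()+1];
strcpy(cstr,tempStr.c_str());

//... Now you can use strtok() on cstr
delete[] cstr; // release the buffer once you are done with the tokens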
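
To make the strtok() step concrete, here is one possible sketch. It assumes C++11 (which the range-based for loop in the question already requires) and uses a std::vector<char> as the writable buffer so nothing has to be freed by hand. The name `split_with_strtok` is hypothetical.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenize a copy of `line` with strtok().
std::vector<std::string> split_with_strtok(const std::string& line) {
    // strtok() writes '\0' into its input, so work on a mutable,
    // null-terminated copy instead of the result of c_str().
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> parts;
    for (char* token = std::strtok(buffer.data(), "\\");
         token != nullptr;
         token = std::strtok(nullptr, "\\")) {
        parts.push_back(token);  // copies the token into a std::string
    }
    return parts;
}
```

Keep in mind that strtok() treats a run of consecutive delimiters as one, so empty fields are skipped, and that it keeps internal state between calls, so it is not safe to use from several threads at once.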
Arpit