
When using:

string s;
cin >> s;

Which characters can the string contain, and which characters stop the read into the string?

kravemir

2 Answers


Characters classified as std::ctype_base::space act as the delimiter for std::istream: extraction stops as soon as it reaches one of them.

std::ctype_base::space covers whitespace characters such as space, tab, and newline. That means s can contain any character except whitespace when it is read using cin >> s.

If you want to read a complete line, including whitespace, you can use the std::getline() function, which uses newline as the delimiter. An overload also lets you supply your own delimiter. See its documentation for further detail.
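The difference between the two reading styles can be sketched as follows. The helper names read_tokens and read_fields are made up for illustration, and std::istringstream stands in for cin so the input is fixed; the default "C" locale is assumed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collects every token that operator>> extracts: leading whitespace is
// skipped, and each token ends at the next whitespace character.
std::vector<std::string> read_tokens(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string s;
    while (in >> s) tokens.push_back(s);
    return tokens;
}

// Collects fields with std::getline and a caller-supplied delimiter.
// Whitespace inside a field is kept as-is.
std::vector<std::string> read_fields(const std::string& text, char delim) {
    std::istringstream in(text);
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(in, field, delim)) fields.push_back(field);
    return fields;
}
```

For example, read_tokens on "  hello\tworld\nfoo " yields hello, world, foo, while read_fields on "a b,c d,e" with ',' as delimiter yields "a b", "c d", "e".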


You can also imbue std::istream with a customized locale that defines which characters the stream treats as delimiters. You can see one such example here (see my solution):

Right way to split an std::string into a vector<string>
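A minimal sketch of the idea, not necessarily the code from the linked answer: a std::ctype<char> facet whose classification table marks ',' as whitespace in addition to the classic set. The names comma_is_space and split_on_commas_and_spaces are hypothetical. Keep in mind the objection raised in the comments: the imbued locale is also used for other things, such as parsing numbers.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: the classic "C" classification table, plus ','
// marked as whitespace.
struct comma_is_space : std::ctype<char> {
    comma_is_space() : std::ctype<char>(make_table()) {}

    static const mask* make_table() {
        // Copy the classic table once, then add the space bit for ','.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[','] |= space;
        return table.data();
    }
};

// Splits text on whitespace and commas by imbuing the stream with a
// locale that carries the facet above. The locale owns the facet.
std::vector<std::string> split_on_commas_and_spaces(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale(in.getloc(), new comma_is_space));
    std::vector<std::string> tokens;
    std::string s;
    while (in >> s) tokens.push_back(s);
    return tokens;
}
```

Because >> skips runs of whitespace, consecutive commas collapse: "a,b c,,d" yields a, b, c, d, with no empty field between the two commas.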

Nawaz
  • I think I see what you're trying to say, but `std::ctype_base::space` is an `enum`, and is a mask value, which can really only be used as an argument to `std::ctype<>::is`. Which in turn depends on the imbued locale, which can in principle do anything. – James Kanze May 23 '11 at 09:06
  • @James: Hmm.. I understand. My edited version (after the horizontal line) probably explains that better? :-/ – Nawaz May 23 '11 at 09:08
  • 1
    Yes, although I'd consider the solution you're referring to abuse. The important things to remember are 1) white space is the separator, and 2) what is considered white space is locale dependent, on the locale imbued in the stream. – James Kanze May 23 '11 at 10:51

The delimiter is any character ch for which std::isspace( ch, std::cin.getloc() ) returns true. In other words, whatever the imbued locale considers "white space". (Although I would consider it somewhat of an abuse, I've known programmers to create special locales which treat e.g. ',' as white space, and use >> to read a comma separated list.)
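This check can be tried directly. The sketch below uses the two-argument std::isspace from <locale> together with the stream's getloc(); the helper name is_delimiter is made up for illustration, and the stream is assumed to carry the default "C" locale.

```cpp
#include <istream>
#include <locale>
#include <sstream>  // std::istringstream, used when trying the helper out

// True if the locale imbued in `in` classifies ch as white space, i.e.
// if operator>> on this stream would treat ch as a separator.
bool is_delimiter(const std::istream& in, char ch) {
    return std::isspace(ch, in.getloc());
}
```

With a default-constructed std::istringstream, is_delimiter returns true for ' ', '\t' and '\n', and false for ',' and 'a'. Imbuing a different locale in the stream can change those answers.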

James Kanze
  • I think I've abused it here (is it really abuse?) : http://stackoverflow.com/questions/5607589/right-way-to-split-an-stdstring-into-a-vectorstring – Nawaz May 23 '11 at 09:10
  • I think it is. Calling a comma white space is IMHO abuse, and confusing. It's also dangerous: the imbued locale is used for other things, like parsing numbers. (Think of the consequences if you're using a French locale, where `','` is also the decimal separator.) I generally prefer a manipulator for this. The separator should be locale dependent. (Where the comma is the decimal point, semi-colon is usually used as separator.) But we really need a separate entry for it. (It's an issue with `std::complex`, whose `<<` operator is unusable.) – James Kanze May 23 '11 at 10:56
  • @James: How is the imbued locale used to parse numbers? Could you explain that? – Nawaz May 23 '11 at 11:01
  • @Nawaz Numbers are parsed by using the `num_get` facet of the imbued locale. – James Kanze May 23 '11 at 13:00
  • @James: I didn't get that. Let me rephrase my question: if I were to read only numbers from a text file, how would I do that? – Nawaz May 23 '11 at 13:08
  • @Nawaz `file >> number`. It's the implementation of `operator>>` which uses `num_get`, not you directly. And it uses the `num_get` facet from the imbued locale. Which may have `','` as the decimal point, rather than `'.'`. Which could lead to some interesting interactions if the `ctype` facet declares that `','` is white space. – James Kanze May 23 '11 at 14:14
  • @James: Will `file >> number` ignore non-numeric characters, such as simple English sentences? – Nawaz May 23 '11 at 14:33
  • @Nawaz Not necessarily ignore. It will skip spaces (according to the imbued locale), then attempt to collect characters to create a number. It will stop once it encounters a character which can't be part of the number it is trying to read, but it does require that a legal numeric sequence be present. It will not simply skip ahead until it finds a number, skipping all sorts of alpha, punctuation, etc. – James Kanze May 23 '11 at 15:10
  • @James: That means it's not possible to *elegantly* read only numbers from a file which contains both numbers and non-numbers, without abusing the locale. – Nawaz May 23 '11 at 15:14
  • @James: see this: [How to read integers elegantly using C++ stream?](http://stackoverflow.com/questions/4767920/how-to-read-integers-elegantly-using-c-stream). I think it's a really elegant solution! – Nawaz May 23 '11 at 15:16
  • @Nawaz I'm not sure I understand. Any file you're reading will have some sort of defined format. You have to read all of the data to determine whether it conforms to that format or not. There's nothing elegant about reading some random data, and just extracting anything which happens to look like an integer. – James Kanze May 23 '11 at 15:44
  • @Nawaz I certainly don't consider the link you gave elegant; far from it. But if you really want to skip or remap characters, using a filtering streambuf can do the trick, even more elegantly (since the name of the class can be chosen to say exactly what is going on). – James Kanze May 23 '11 at 15:46
  • @James: Using streambuf? How exactly? :-/ – Nawaz May 23 '11 at 15:49
  • @Nawaz Using a filtering streambuf. There's a variant in Boost.Iostreams (`boost::iostreams`); the Boost site also seems to have my original article (http://lists.boost.org/Archives/boost/att-49459/fltrsbf1.htm) on the subject. – James Kanze May 23 '11 at 16:51
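To make the last few comments concrete, here is a rough sketch of a filtering streambuf, written from scratch for illustration. It is not the code from the linked article or from Boost, and the names comma_to_space_buf and read_ints are hypothetical. The filter turns each ',' into ' ' before the stream sees it, so the locale is left untouched.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical filter: pulls characters from another streambuf and
// replaces each ',' with ' '.
class comma_to_space_buf : public std::streambuf {
public:
    explicit comma_to_space_buf(std::streambuf* source) : source_(source) {}

protected:
    int_type underflow() override {
        const int_type c = source_->sbumpc();
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::eof();
        ch_ = traits_type::to_char_type(c);
        if (ch_ == ',') ch_ = ' ';
        setg(&ch_, &ch_, &ch_ + 1);  // expose a one-character get area
        return traits_type::to_int_type(ch_);
    }

private:
    std::streambuf* source_;
    char ch_ = '\0';
};

// Reads ints through the filter until the first extraction fails.
std::vector<int> read_ints(const std::string& text) {
    std::istringstream raw(text);
    comma_to_space_buf filter(raw.rdbuf());
    std::istream in(&filter);
    std::vector<int> values;
    int n;
    while (in >> n) values.push_back(n);
    return values;
}
```

read_ints on "1,2, 3" yields 1, 2, 3. On "4,5,x,6" it yields only 4, 5: as described in the comments, >> does not skip ahead past characters that cannot be part of a number, so the extraction fails at x and the loop ends.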