3

I have an instance of stringstream that I am reading from. At a certain point of getting data out of the stream, I need to read an identifier that may or may not be there. The logic is something like this:

std::string identifier;
sstr >> identifier;
if( identifier == "SomeKeyword" ) {
    // process the rest of the string stream using method 1
} else {
    // back up to before we tried to read "identifier" and process the stream using method 2
}

How can I achieve the above logic?

default

3 Answers

6

Use the stream's tellg() and seekg() methods, e.g.:

std::string identifier;
std::stringstream::pos_type pos = sstr.tellg();
sstr >> identifier;
if (identifier == "SomeKeyword")
{
    //process the rest of the string stream using method 1
}
else
{
    // back up to before we tried to read "identifier"
    sstr.seekg(pos);
    // process the stream using method 2
}
Remy Lebeau
  • 1
    This is almost a verbatim copy of my answer except the nifty link to `seekg`. – Mad Physicist Dec 02 '15 at 19:11
  • 1
    Your answer had not been posted yet when I was still writing and testing my answer. I saw yours after posting mine. In my case, I think using `pos_type` instead of `int` is more accurate, and using `beg` is redundant. – Remy Lebeau Dec 02 '15 at 19:11
  • Fair enough. I fixed my type, although I think beg improves readability. Your links are better too. – Mad Physicist Dec 02 '15 at 19:12
  • 3
    Note: As of C++11, `seekg()` clears the EOF bit before doing its thing, but pre-C++11, it inhibits the seek. So older platforms beware. There's also a potential efficiency problem in the "re-read" approach if the next item is a 10k character block of MIME data. – metal Dec 03 '15 at 22:18
4

You can save the stream's get position before you read the identifier and restore it if the identifier does not match:

std::string identifier;
std::stringstream::pos_type pos = sstr.tellg();
sstr >> identifier;
if (identifier == "SomeKeyword") {
    // process the rest of the string stream using method 1
} else {
    sstr.clear();               // reset eofbit/failbit so the seek is not refused
    sstr.seekg(pos, sstr.beg);
    // process the stream using method 2
}

The page on tellg at cplusplus.com has a very nice example. The purpose of calling clear() is to ensure that seekg works even if the previous read reached end-of-file. Before C++11, a set EOF bit makes seekg fail, so the call is required. From C++11 on, seekg clears the EOF bit itself, so clear() is no longer needed for that case. It remains harmless, though, and it is still required if the extraction failed outright and set failbit (for example, when nothing was left to read). Thanks to @metal for pointing this out.
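To make the save-and-restore idea concrete, here is a minimal sketch wrapped in a helper. The name `consume_keyword` is made up for illustration and is not part of any library. It keeps the `clear()` call on every path, on the assumption that an extra `clear()` costs nothing and also covers the case where the extraction set failbit.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not a standard function): returns true and leaves the
// keyword consumed if it is the next token; otherwise rewinds the stream to
// where it was before the read and returns false.
bool consume_keyword(std::istream& in, const std::string& keyword)
{
    const std::istream::pos_type start = in.tellg();  // remember the get position

    std::string token;
    if (in >> token && token == keyword) {
        return true;   // keyword consumed; caller continues with "method 1"
    }

    in.clear();        // drop eofbit/failbit so the seek is not refused
    in.seekg(start);   // rewind; caller continues with "method 2"
    return false;
}
```

Because the helper takes a `std::istream&`, it works with any seekable input stream, not only `std::stringstream`. It does not try to handle a stream that was already in a failed state before the call.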

Mad Physicist
  • 1
    Just a suggestion: Use `auto`. – Deduplicator Dec 02 '15 at 19:18
  • @Deduplicator `auto` is only available since C++11. There is no guarantee that OP is using it and it really doesn't add much. – Mad Physicist Dec 02 '15 at 19:19
  • 2
    Well, it would have kept you from using a wrong type on first try, at least. That's enough to use it. – Deduplicator Dec 02 '15 at 19:20
  • 2
    Note: As of C++11, `seekg()` clears the EOF bit before doing its thing, but pre-C++11, it inhibits the seek. So older platforms beware. There's also a potential efficiency problem in the "re-read" approach if the next item is a 10k character block of MIME data. – metal Dec 03 '15 at 22:18
  • @metal Duly noted. The workaround would be to manually unset the bit before doing the seek, which I have edited into my answer. The OP is implying that his string will always contain data after the identifier, but of course you are right that precautions should be taken. I still prefer my answer and Remy's over Jonathan's because it applies to all istreams rather than just stringstreams. Generality at the cost of efficiency. – Mad Physicist Dec 04 '15 at 03:50
2

You can directly inspect a stringstream's contents. That may be a clearer approach than extracting and rolling back, because the stream's state after extraction is not guaranteed. For example, if your string contained only one word, extracting it would set the ios_base::eofbit flag.

You could inspect the stringstream's contents like this:

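// identifier holds the keyword to look for, e.g. "SomeKeyword".
// str() returns the whole buffer, so this compares from offset 0,
// not from the current read position.
if(sstr.str().compare(0, identifier.length(), identifier) == 0) {
    sstr.ignore(identifier.length());
    // process the rest of the string stream using method 1
} else {
    // process the stream using method 2
}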

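One risk with this approach: if you were relying on the stringstream's extraction operator to skip leading white-space, you need to discard it yourself before the compare. Note that sstr >> skipws; only sets a formatting flag and consumes nothing; sstr >> ws; is the manipulator that actually extracts leading white-space. Because str() returns the whole buffer regardless of how much has been read, the compare must then start at the current read position (sstr.tellg()) rather than at 0.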

While I do consider this method safer, note that if "method 2" depends on leading white-space being present in sstr, you should use one of the other answers (but you should also reconsider your use of stringstream, since the formatted extraction operators skip leading white-space by default).
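A sketch of the inspect-the-buffer idea that also works when part of the stream has already been read might look like the following. The helper name `starts_with_keyword` is hypothetical. It assumes a `std::stringstream`, skips leading white-space with `std::ws`, and compares from `tellg()` instead of offset 0. Like the snippet above, it matches a prefix, so `SomeKeywordX` would also match, and it still copies the whole buffer via `str()`.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not a standard function): checks whether the unread
// part of the stream starts with `keyword` after leading white-space and,
// if so, skips past the keyword.
bool starts_with_keyword(std::stringstream& in, const std::string& keyword)
{
    in >> std::ws;  // consume leading white-space

    const std::stringstream::pos_type pos = in.tellg();
    if (pos == std::stringstream::pos_type(-1)) {
        return false;  // nothing left to read, or the stream had already failed
    }

    const std::string buffer = in.str();  // copy of the WHOLE buffer, read or not
    const std::string::size_type offset =
        static_cast<std::string::size_type>(std::streamoff(pos));

    if (buffer.compare(offset, keyword.length(), keyword) != 0) {
        return false;  // read position stays just after the white-space
    }

    in.ignore(static_cast<std::streamsize>(keyword.length()));  // step over the keyword
    return true;
}
```

On a mismatch the only thing consumed is the leading white-space, so this variant is unsuitable if "method 2" needs that white-space.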

Jonathan Mee
  • 2
    looks at user's score ... o.0 – default Dec 03 '15 at 19:40
  • 1
    @pauld I answered this one question about how to display hearts in a game and voila! – Jonathan Mee Dec 03 '15 at 19:50
  • 2
    This solution identifies a potential flaw in other answers with regard to EOF. (Note: `seekg()` clears the EOF bit before doing its thing, as of C++11, but pre-C++11, it inhibits the seek. There's also a potential efficiency problem in the "re-read" approach if the next item is a 10k character block of MIME data.) Nonetheless, I cannot endorse the conversion here of the *whole stream* to a `std::string` and then comparing part of it -- what if it's 100k of MIME data? That's an expensive conversion just for a compare. Also, this solution takes too much mental parsing for my liking. – metal Dec 03 '15 at 20:18
  • @metal I'd like to say that the compiler will protect us from this, but that's not guaranteed. I'd also like to say that whenever [`string_view`](http://stackoverflow.com/q/29007753/2642059) becomes a reality that this will be included. I was thinking of pulling directly from the buffer to avoid the cost of the `string` creation, but it's hard to guarantee that the buffer has been filled :( Any better suggestions from you? – Jonathan Mee Dec 03 '15 at 20:55
  • 3
    Instead of using `str().compare()` for peeking, you might try using `rdbuf()->sgetn()` instead to read `identifier.length()` number of characters into a local buffer without advancing the stream position, and then compare that buffer. – Remy Lebeau Dec 04 '15 at 00:16
  • You are not allocating any memory for `prefix`. You are passing an uninitialized pointer to `sgetn()` for it to write to. `prefix` needs to be at least `identifier.length()` characters in size. Per [documentation](http://en.cppreference.com/w/cpp/io/basic_streambuf/sgetn): "*Reads count characters from the input sequence and **stores them into a character array** pointed to by s*". – Remy Lebeau Dec 06 '15 at 15:50
  • @RemyLebeau Ugh, thanks wrong link. What I was actually trying to do was use a `char[]` for the first argument. Strangely everything goes haywire when I do this. I can do a `strcpy` into a `char[]` no problem, why should this be any different? http://ideone.com/ROrfxs – Jonathan Mee Dec 06 '15 at 18:37
  • @RemyLebeau Finally got to an IDE and found the problem. It looks like `sgetn` doesn't add a trailing `'\0'` character. Which is... strange. Anyway, this fixes it: http://ideone.com/QKGQlY – Jonathan Mee Dec 07 '15 at 13:20
  • @RemyLebeau Asked for clarification on why the string is not null terminated here: http://stackoverflow.com/questions/34135565/sgetn-doesnt-null-terminate-string – Jonathan Mee Dec 07 '15 at 14:10
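The `sgetn()` discussion in these comments can be sketched roughly as follows. The helper name `peek_chars` is invented for illustration. Two assumptions are worth stating: `sgetn()` copies raw characters and writes no terminating `'\0'`, so the sketch builds the `std::string` from the returned count; and `sgetn()` does consume the characters it reads, so the sketch seeks back afterwards, which only works on seekable buffers such as a string buffer.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper (not a standard function): returns up to `count` unread
// characters and then restores the get position. Assumes a seekable buffer.
std::string peek_chars(std::istream& in, std::size_t count)
{
    std::streambuf* buf = in.rdbuf();

    std::string chars(count, '\0');  // storage that sgetn() writes into
    const std::streamsize got =
        buf->sgetn(&chars[0], static_cast<std::streamsize>(count));

    if (got > 0) {
        // sgetn() advanced the get position; move it back by what was read.
        buf->pubseekoff(-got, std::ios_base::cur, std::ios_base::in);
    }

    chars.resize(static_cast<std::size_t>(got));  // keep only what was read
    return chars;
}
```

Working on the buffer directly leaves the stream's state flags untouched, so reaching the end of the data here does not set eofbit on the stream.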