
Is there a way to find the start position of tokens extracted by istringstream::operator >>?

For example, here is my current failed attempt, which checks tellg() before and after each extraction:

string test = "   first     \"  in \\\"quotes \"  last";
istringstream strm(test);

while (!strm.eof()) {

    string token;
    auto startpos = strm.tellg();
    strm >> quoted(token);
    auto endpos = strm.tellg();
    if (endpos == -1) endpos = test.length();

    cout << token << ": " << startpos << " " << endpos << endl;

}

So the output of the above program is:

first: 0 8
  in "quotes : 8 29
last: 29 35

The end positions are fine, but the start positions are the start of the whitespace leading up to the token. The output I want would be something like:

first: 3 8
  in "quotes : 13 29
last: 31 35

Here's the test string with positions for reference:

          1111111111222222222233333
01234567890123456789012345678901234  the end is -1

   first     "  in \"quotes "  last

        ^--------------------^-----^ the end positions i get and want
^-------^--------------------^------ the start positions i get
   ^---------^-----------------^---- the start positions i *want*

Is there any straightforward way to retrieve this information when using an istringstream?

Jason C
  • Please clarify... You have provided example output from your program, and then you've shown an example of the output you want to see, which is identical to the first example. – paddy Dec 01 '21 at 02:22
  • @paddy Ah, collateral damage from an edit. Thanks for spotting that. One moment.... Fixed. Sorry about that. – Jason C Dec 01 '21 at 02:23

1 Answer


First, see Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?

Second, you can use the std::ws stream manipulator to consume the leading whitespace before reading the next token value. tellg() will then report the start positions you are looking for, e.g.:

#include <iostream>
#include <string>
#include <sstream>
#include <iomanip>
using namespace std;

...

string test = "   first     \"  in \\\"quotes \"  last";
istringstream strm(test);

// Skip leading whitespace first so tellg() lands on the token's first character.
while (strm >> ws) {

    string token;
    auto startpos = strm.tellg();
    if (!(strm >> quoted(token))) break;  // stop if no token could be read
    auto endpos = strm.tellg();
    if (endpos == -1) endpos = test.length();  // tellg() returns -1 once EOF has been hit

    cout << token << ": " << startpos << " " << endpos << endl;
}


Remy Lebeau
  • Perfect. Thanks. I figured it'd be an easy job. I'm just too burnt to figure it out. – Jason C Dec 01 '21 at 02:26
  • @JasonC updated to handle that failure – Remy Lebeau Dec 01 '21 at 02:30
  • I could've sworn there was a `>>` to grab the current position; that'd smooth it all out. I might be remembering some other language or maybe I'm thinking of `scanf`. One sec, let's see.... – Jason C Dec 01 '21 at 02:31
  • @JasonC [not that I see](https://en.cppreference.com/w/cpp/io/manip). Maybe you are thinking of [`std::showpos`](https://en.cppreference.com/w/cpp/io/manip/showpos)? But that is for an entirely different purpose. Yeah, `scanf()` has `%n`. Should not be difficult to create a custom stream manipulator to output the stream's current position, though. – Remy Lebeau Dec 01 '21 at 02:32
  • [Bah](https://c.tenor.com/BFPaMElzoxkAAAAC/curse-aukerman.gif). Easy one to toss together though, I guess. – Jason C Dec 01 '21 at 02:33
  • I've really enjoyed watching your last comment slowly evolve, btw. I remember when it was just a young "nope". Ah, nostalgia. 5 minutes ago... seems like only yesterday. Time sure does fly by. – Jason C Dec 01 '21 at 02:37
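
The custom manipulator mentioned in the comments could look roughly like the sketch below. Nothing here is a standard facility: `tell_into`, `tell()`, `TokenSpan` and `token_spans()` are made-up names for illustration only. The one non-obvious part is that `tellg()` reports -1 once `eofbit` is set, so the sketch clears `eofbit` for the query and restores it afterwards. It assumes C++14 or later for `std::quoted`.

```cpp
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of the standard library: when extracted from
// a stream, it stores the stream's current read position in `out`.
struct tell_into {
    std::streamoff &out;
};

inline tell_into tell(std::streamoff &out) { return tell_into{out}; }

inline std::istream &operator>>(std::istream &is, tell_into t) {
    if (is.fail()) return is;  // an earlier extraction failed; leave `out` alone
    // tellg() reports -1 once eofbit is set, so clear eofbit for the query
    // and put it back afterwards.
    const std::ios_base::iostate state = is.rdstate();
    is.clear(state & ~std::ios_base::eofbit);
    t.out = is.tellg();
    is.setstate(state & std::ios_base::eofbit);
    return is;
}

// Hypothetical result type: a token plus its [start, end) offsets in the input.
struct TokenSpan {
    std::string text;
    std::streamoff start;
    std::streamoff end;
};

// Splits `input` into (possibly quoted) tokens and records where each one sits.
inline std::vector<TokenSpan> token_spans(const std::string &input) {
    std::istringstream strm(input);
    std::vector<TokenSpan> spans;
    TokenSpan span{};
    // std::ws skips the leading whitespace, so the first tell() sees the
    // token's first character; the second tell() sees the position just past it.
    while (strm >> std::ws >> tell(span.start) >> std::quoted(span.text) >> tell(span.end)) {
        spans.push_back(span);
    }
    return spans;
}
```

With the test string from the question, `token_spans(test)` should give three spans: `first` at 3 to 8, the quoted token at 13 to 29, and `last` at 31 to 35. The `endpos == -1` fix-up is no longer needed because the helper handles the end-of-stream case itself.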