
I would like to use the facilities provided by stringstream to extract values from a fixed-format string as a type-safe alternative to sscanf. How can I do this?

Consider the following specific use case. I have a std::string in the following fixed format:

YYYYMMDDHHMMSSmmm

Where:

YYYY = 4 digits representing the year
MM = 2 digits representing the month ('0' padded to 2 characters)
DD = 2 digits representing the day ('0' padded to 2 characters)
HH = 2 digits representing the hour ('0' padded to 2 characters)
MM = 2 digits representing the minute ('0' padded to 2 characters)
SS = 2 digits representing the second ('0' padded to 2 characters)
mmm = 3 digits representing the milliseconds ('0' padded to 3 characters)

Previously I was doing something along these lines:

string s = "20101220110651184";
unsigned year = 0, month = 0, day = 0, hour = 0, minute = 0, second = 0, milli = 0;    
sscanf(s.c_str(), "%4u%2u%2u%2u%2u%2u%3u", &year, &month, &day, &hour, &minute, &second, &milli );

The width values are magic numbers, and that's ok. I'd like to use streams to extract these values and convert them to unsigneds in the interest of type safety. But when I try this:

stringstream ss;
ss << "20101220110651184";
ss >> setw(4) >> year;

year retains the value 0. It should be 2010.

How do I do what I'm trying to do? I can't use Boost or any other 3rd party library, nor can I use C++0x.

John Dibling

6 Answers


One not particularly efficient option would be to construct some temporary strings and use a lexical cast:

std::string s("20101220110651184");
int year = lexical_cast<int>(s.substr(0, 4));
// etc.

lexical_cast can be implemented in just a few lines of code; Herb Sutter presented the bare minimum in his article, "The String Formatters of Manor Farm."

It's not exactly what you're looking for, but it's a type-safe way to extract fixed-width fields from a string.
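For illustration, here is a minimal sketch of that idea using only the standard library. The `lexical_cast` below is a hypothetical stand-in written for this example (it is not Boost's implementation and not Herb Sutter's exact code), and `field` is an assumed helper name:

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical minimal lexical_cast for string sources (not Boost's version).
template <typename Target>
Target lexical_cast(const std::string& source)
{
    std::istringstream in(source);
    Target value;
    char extra;
    // Reject input that does not parse, or that has non-whitespace left over.
    if (!(in >> value) || (in >> extra))
        throw std::runtime_error("bad lexical_cast: \"" + source + "\"");
    return value;
}

// Hypothetical helper: converts the `width` characters starting at `pos`.
template <typename Target>
Target field(const std::string& s,
             std::string::size_type pos,
             std::string::size_type width)
{
    return lexical_cast<Target>(s.substr(pos, width));
}
```

With this, `field<unsigned>(s, 0, 4)` would extract the year and `field<unsigned>(s, 14, 3)` the milliseconds, at the cost of one temporary string and one stream per field.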

James McNellis
  • I can't use `lexical_cast`, as that's part of Boost. – John Dibling Dec 28 '10 at 19:12
  • Although I could use streams again or some `atoi` type stuff. I was hoping I could accomplish this in a more natural way, though. – John Dibling Dec 28 '10 at 19:12
  • @John: You can write your own quite easily. I've linked to one of Herb Sutter's articles where a very basic implementation is presented (seven nicely formatted lines of code). Or, I posted a very simple version in [my first Stack Overflow post](http://stackoverflow.com/questions/1528374/how-can-i-extend-a-lexical-cast-to-support-enumerated-types); that one is two lines of code. – James McNellis Dec 28 '10 at 19:14

Erm, if it's fixed format, why don't you do this?

  std::string sd("20101220110651184");
  // insert spaces from the back
  sd.insert(14, 1, ' ');
  sd.insert(12, 1, ' ');
  sd.insert(10, 1, ' ');
  sd.insert(8, 1, ' ');
  sd.insert(6, 1, ' ');
  sd.insert(4, 1, ' ');
  int year, month, day, hour, min, sec, ms;
  std::istringstream str(sd);
  str >> year >> month >> day >> hour >> min >> sec >> ms;
Nim
  • You're basically creating a new space-delimited string which the >> operator can parse because it contains spaces... Not very efficient. – BHS May 09 '13 at 00:34

I use the following; it might be useful for you:

template<typename T> T stringTo( const std::string& s )
   {
      std::istringstream iss(s);
      T x;
      iss >> x;
      return x;
   }

template<typename T> inline std::string toString( const T& x )
   {
      std::ostringstream o;
      o << x;
      return o.str();
   }

These templates require:

#include <sstream>
#include <string>

Usage

std::string input;
std::cin >> input;   // stringTo takes a std::string, not a stream
long date = stringTo<long>( input );

YMMV
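Note that `stringTo` as written returns `x` without checking whether the extraction succeeded, so bad input yields an unspecified value. A possible checked variant is sketched below; the name `stringToChecked` and the choice to throw `std::invalid_argument` are assumptions made for this example, and returning an error code would work just as well:

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical checked variant of stringTo: throws instead of returning
// an unspecified value when the extraction fails.
template <typename T>
T stringToChecked(const std::string& s)
{
    std::istringstream iss(s);
    T x;
    if (!(iss >> x))
        throw std::invalid_argument("stringToChecked: cannot convert \"" + s + "\"");
    return x;
}
```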

Jay
  • In the `stringTo` function, it is very important to check the state of `iss` after the extraction to ensure it succeeded and handle errors appropriately (throw an exception, return an error code, abort the application, whatever). – James McNellis Dec 28 '10 at 19:16
  • +1 this is, at its core, basically what @James suggests above. I was hoping to use something already provided by the StdLib, but I may have to write it myself – John Dibling Dec 28 '10 at 19:17

From here, you might find this useful:

template<typename T, typename charT, typename traits>
std::basic_istream<charT, traits>&
  fixedread(std::basic_istream<charT, traits>& in, T& x)
{
  if (in.width() == 0)
    // Not fixed size, so read normally.
    in >> x;
  else {
    // String extraction honors width(), so at most width() characters are read.
    std::basic_string<charT, traits> field;
    in >> field;
    std::basic_istringstream<charT, traits> stream(field);
    if (! (stream >> x))
      in.setstate(std::ios_base::failbit);
  }
  return in;
}

On input, setw() applies only when extracting strings and C strings; numeric extraction ignores it. The above function uses this fact: it reads the field into a string and then converts it to the required type. You can use it in combination with setw() or ss.width(w) to read a fixed-width field of any type.
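As a rough usage sketch for the question's timestamp format: the block repeats `fixedread` (with the temporary typed as `std::basic_string<charT, traits>`) so that it compiles on its own, and adds `read_field`, a hypothetical convenience wrapper that is not part of the original function.

```cpp
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

template <typename T, typename charT, typename traits>
std::basic_istream<charT, traits>&
fixedread(std::basic_istream<charT, traits>& in, T& x)
{
    if (in.width() == 0) {
        in >> x;  // not fixed size, so read normally
    } else {
        std::basic_string<charT, traits> field;
        in >> field;  // reads at most width() characters, then resets width to 0
        std::basic_istringstream<charT, traits> stream(field);
        if (!(stream >> x))
            in.setstate(std::ios_base::failbit);
    }
    return in;
}

// Hypothetical wrapper: sets the field width, then reads one field.
template <typename T>
std::istream& read_field(std::istream& in, T& x, int width)
{
    in >> std::setw(width);
    return fixedread(in, x);
}
```

Calling `read_field(ss, year, 4)`, then `read_field(ss, month, 2)`, and so on walks through the string one field at a time. Because the width is reset after every string extraction, it has to be set again before each field.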

moinudin
template<typename T>
struct FixedRead {
    T& content;
    int size;
    FixedRead(T& content, int size) :
            content(content), size(size) {
        assert(size != 0);
    }
    template<typename charT, typename traits>
    friend std::basic_istream<charT, traits>&
    operator >>(std::basic_istream<charT, traits>& in, FixedRead<T> x) {
        std::streamsize orig_w = in.width();
        std::basic_string<charT, traits> o;
        in >> std::setw(x.size) >> o; // needs <iomanip>; string extraction honors the width
        std::basic_stringstream<charT, traits> os(o);
        if (!(os >> x.content))
            in.setstate(std::ios_base::failbit);
        in.width(orig_w);
        return in;
    }
};

template<typename T>
FixedRead<T> fixed_read(T& content, int size) {
    return FixedRead<T>(content, size);
}

void test4() {
    stringstream ss("20101220110651184");
    int year = 0, month = 0, day = 0, hour = 0, min = 0, sec = 0, ms = 0;
    ss >> fixed_read(year, 4) >> fixed_read(month, 2) >> fixed_read(day, 2)
            >> fixed_read(hour, 2) >> fixed_read(min, 2) >> fixed_read(sec, 2)
            >> fixed_read(ms, 3);
    cout << "year:" << year << "," << "month:" << month << "," << "day:" << day
            << "," << "hour:" << hour << "," << "min:" << min << "," << "sec:"
            << sec << "," << "ms:" << ms << endl;
}
ps5mh

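The solution of ps5mh is really nice, but it does not work for fixed-width fields that contain whitespace, because string extraction skips leading whitespace and stops at the next whitespace character. The following solution fixes this by reading the raw characters of each field: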

template<typename T, typename T2>
struct FixedRead
{
    T& content;
    T2& number;
    int size;
    FixedRead(T& content, int size, T2 & number) :
        content(content), number(number), size(size)
    {
        assert (size != 0);
    }
    template<typename charT, typename traits>
    friend std::basic_istream<charT, traits>&
    operator >>(std::basic_istream<charT, traits>& in, FixedRead<T,T2> x)
    {
        if (!in.eof() && in.good())
        {
            std::vector<charT> buffer(x.size + 1);
            in.read(&buffer[0], x.size); // unformatted read, keeps whitespace
            std::size_t num_read = static_cast<std::size_t>(in.gcount());
            buffer[num_read] = 0; // set null-termination of string
            std::basic_stringstream<charT, traits> os(&buffer[0]);
            if (!(os >> x.content))
                in.setstate(std::ios_base::failbit);
            else
                ++x.number;
        }
        return in;
    }
};
template<typename T, typename T2>
FixedRead<T,T2> fixedread(T& content, int size, T2 & number) {
    return FixedRead<T,T2>(content, size, number);
}

This can be used as:

std::string s  = "90007127       19000715790007397";
std::vector<int> ints(5);
int num_read = 0;
std::istringstream in(s);
in >> fixedread(ints[0], 8, num_read) 
   >> fixedread(ints[1], 8, num_read) 
   >> fixedread(ints[2], 8, num_read) 
   >> fixedread(ints[3], 8, num_read) 
   >> fixedread(ints[4], 8, num_read);
// output: 
//   num_read = 4 (like return value of sscanf)
//   ints = 90007127, 1, 90007157, 90007397
//   ints[4] keeps its initial value 0; the fifth read fails and sets failbit