1

I want to make istream consider only tabs as whitespace. So, given "{json : 5}\tblah", I want to load json into obj1 and "blah" into obj2 with code like the following:

is >> obj1 >> obj2

Is there a way to do this without loading the objects into strings?

Jason Sundram
John Montague

2 Answers

1

Yes. In the locale, make the tab the only character that has the space attribute.

The hard part: Create a facet that inherits from ctype. Then make sure you set all characters to be not whitespace (except tab).

#include <locale>
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>    

// This is my facet:
// It is designed to treat only <tab> as whitespace
class TabSepFacet: public std::ctype<char>
{
    public:
        typedef std::ctype<char>   base;
        typedef base::char_type    char_type;

        TabSepFacet(std::locale const& l) : base(table)
        {
            // Get the ctype facet of the current locale
            std::ctype<char> const&  defaultCType = std::use_facet<std::ctype<char> >(l);

            // Copy the default flags for each character from the current facet
            static char data[256];
            for(int loop = 0; loop < 256; ++loop) {data[loop] = loop;}
            defaultCType.is(data, data+256, table);

            // Remove the other spaces
            for(int loop = 0; loop < 256; ++loop)
            {
                // If the space flag is set then XOR it out.
                if (table[loop] & base::space)
                {   table[loop] ^= base::space;
                }
            }
            // Only a tab is a space
            table['\t'] |= base::space;
        }
    private:
        base::mask table[256];
};

The easy part: create a locale object that uses the facet and imbue the stream with it:

int main()
{
    // Create a stream (Create the locale) then imbue the stream.
    std::stringstream data("This is a\tTab");
    const std::locale tabSepLocale(data.getloc(), new TabSepFacet(data.getloc()));
    data.imbue(tabSepLocale);

    // Note: If it is a file stream then imbue the stream BEFORE opening a file,
    // otherwise the imbue is silently ignored on some systems.


    // Now you can use the stream like normal; your locale defines what 
    // is whitespace, so the operator `>>` will split on tab.
    std::string   word;
    while(data >> word)
    {
        std::cout << "Word(" << word << ")\n";
    }
}

The result:

> g++ tab.cpp
> ./a.out
Word(This is a)
Word(Tab)

Note: Not even newline is a whitespace character above. So the operator >> will read straight across the end of a line, and the newline ends up inside the extracted word instead of separating two words.
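
If records should also end at a newline, the same idea can be extended so that both tab and newline carry the space attribute. The following is a minimal sketch, not part of the original answer: `TabNewlineFacet` and `readFields` are hypothetical names, and the table is built from `std::ctype<char>::classic_table()` (the "C" locale table) rather than copied from the stream's current facet.

```cpp
#include <cstddef>
#include <istream>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of the facet above:
// only <tab> and <newline> are treated as whitespace.
class TabNewlineFacet: public std::ctype<char>
{
    public:
        TabNewlineFacet() : std::ctype<char>(makeTable()) {}

    private:
        static const mask* makeTable()
        {
            // Start from the "C" locale classification table.
            static std::vector<mask> table(classic_table(), classic_table() + table_size);

            // Clear the space flag on every character.
            for(std::size_t loop = 0; loop < table.size(); ++loop)
            {
                table[loop] &= ~space;
            }
            // Only tab and newline are spaces.
            table['\t'] |= space;
            table['\n'] |= space;
            return &table[0];
        }
};

// Hypothetical helper: imbue the stream, then split it with operator>>.
std::vector<std::string> readFields(std::istream& in)
{
    // The locale takes ownership of the facet.
    in.imbue(std::locale(in.getloc(), new TabNewlineFacet));

    std::vector<std::string> fields;
    std::string              field;
    while(in >> field)
    {
        fields.push_back(field);
    }
    return fields;
}
```

Given a `std::istringstream` holding "This is a\tTab\nnext line\tend", `readFields` yields four fields: "This is a", "Tab", "next line" and "end". As with the original facet, a file stream should be imbued before the file is opened.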

Martin York
  • @Jason Sundram: I have no problem with you fixing the spelling or the formatting of normal text. BUT **don't change the code**. I have a style that I try to keep consistent across posts, and having to fix your changes is a pain. Please don't do it. – Martin York Feb 01 '12 at 20:34
  • The spacing was weird enough that I thought it was accidental. Apologies. – Jason Sundram Feb 01 '12 at 21:55
0

What about std::getline?

getline(getline(std::cin, obj1, '\t'), obj2);
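
A minimal sketch of this approach applied to an arbitrary `std::istream`, with the nested calls in a loop. It assumes each record has the form "first<tab>second<newline>"; `readRecords` is a hypothetical name. Note that it still goes through `std::string`, so it only helps if the objects can be built from the strings afterwards.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: read "<first>\t<second>\n" records from any istream.
std::vector<std::pair<std::string, std::string> > readRecords(std::istream& is)
{
    std::vector<std::pair<std::string, std::string> > records;
    std::string first;
    std::string second;

    // The inner getline reads up to the tab; the outer one reads the rest of the line.
    // The loop stops as soon as either read fails (normally at end of input).
    while(std::getline(std::getline(is, first, '\t'), second))
    {
        records.push_back(std::make_pair(first, second));
    }
    return records;
}
```

For the input "{json : 5}\tblah\n", this produces one record whose first element is "{json : 5}" and whose second element is "blah".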
ipc
  • Note: the OP would pass his `std::istream` instead of `std::cin`, and the nested calls to getline should probably be in a loop. – AJG85 Jan 31 '12 at 19:44