
I am trying to write a custom std::ostream that invokes a function for every line written to it. That is, I would like the following code to work as documented in comments:

my_output_stream s([] (const std::string& line) { 
    WriteLineToSomeOutput(line); 
});

s << "Hello world"; // do not invoke anything. The line has not ended yet!
s << ", from Alex" << std::endl; // here we invoke WriteLineToSomeOutput("Hello world, from Alex")
s << "What's up"; // do not invoke anything. The line has not ended yet.
s << ", doc?!\nHow u doing?\n"; // Now we have two lines. We invoke WriteLineToSomeOutput("What's up, doc?!") and WriteLineToSomeOutput("How u doing?")

Note that the data is not written anywhere and not stored anywhere. The only thing that I need the stream to store is the current line that is being aggregated, until we encounter an end of line.

I did not find any easy way of doing so, even when using the Boost.Iostreams library. Can I avoid writing my own line tokenizer here by using some built-in facilities of the STL and Boost?

Background

The my_output_stream class will be used to adapt between an external library and a logging library used in my application. The external library requires that I provide an std::ostream. And I want to log each line that is logged by the external library using my application's logging framework.

Alex Shtoff
  • Did you try overloading `my_output_stream::operator<<` ? – quantdev Jul 28 '14 at 10:51
  • Of course. The external library uses the base class - std::ostream. Overloading does not help here, as it does not know about my own stream class and cannot invoke the operator overloaded for my own stream class. In addition, the way that std::ostream is used is not documented, so I cannot rely on any empirical knowledge. It might not persist when the library is upgraded. – Alex Shtoff Jul 28 '14 at 10:53
  • You might have some success with a custom stream buffer. basic_streambuf has a public member function `sputn()`, used by the basic_ostream. This public member function calls a protected virtual member, `xsputn()`, which you can override in a derived class. You should be able to use this to grab the contents of the stream and log them as appropriate, then pass a std::ostream object which has this custom streambuf set. http://en.cppreference.com/w/cpp/io/basic_streambuf and http://en.cppreference.com/w/cpp/io/basic_streambuf/sputn may help. – Andrew Jul 28 '14 at 11:07
  • @Andrew Much of the output will probably be done using `std::streambuf::sputc`, not `sputn`. `sputc` calls `overflow` when the buffer is full, but it's up to you to set up a buffer, either in the constructor or in `overflow`; until you do so, the buffer is always considered full, so you can intercept every character if you want. – James Kanze Jul 28 '14 at 12:12
  • Is it so hard to do a `memchr`? No, there isn't anything like this in `std`, and not that I know of anything in `boost`. – Yakov Galka Jul 28 '14 at 12:22
  • @JamesKanze, either way, this can be done by swapping out the regular streambuf for a custom one. I didn't say it'd be simple, though :) – Andrew Jul 28 '14 at 12:28
  • @Andrew The only real solution involves using a custom streambuf. But the function you have to override is `overflow`. You can also override `xsputn` if you want, but if you don't, it will just make successive calls to `overflow`. – James Kanze Jul 28 '14 at 13:12

3 Answers


If I understand correctly, you want to unconditionally flush at end of line, and only at end of line. To do this, you must implement your own streambuf; it could be based on std::stringbuf, but if you're only concerned with output, and not worried about seeking, it's probably just as easy to do it yourself.

Something like the following should do the trick:

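#include <streambuf>
#include <vector>

class LineBufferedOutput : public std::streambuf
{
    std::vector<char> myBuffer;
protected:
    //  No put area is set up, so every character written ends up here.
    int_type overflow( int_type ch ) override
    {
        if ( traits_type::eq_int_type( ch, traits_type::eof() ) ) {
            return traits_type::not_eof( ch );  //  nothing to write; not an error
        }
        myBuffer.push_back( traits_type::to_char_type( ch ) );
        if ( ch == '\n' ) {
            //   whatever you have to do with the line, then:
            myBuffer.clear();
        }
        return ch;  //  return traits_type::eof() for failure...
    }
};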

I'm not sure what you mean by implementing your own tokenizer; there's no tokenization involved. You do have to look at each character, in order to compare it to '\n', but that's all.

And you ignore any explicit requests to sync(). The default std::streambuf::sync() does nothing, so it is enough not to override it.
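To tie this back to the interface in the question, here is a minimal sketch of the same idea with the callback wired in. The class name LineCallbackBuf and its Callback alias are made up for illustration and are not part of any library. It assumes the callback should receive each line without its terminating '\n', and that a trailing partial line is simply kept until a newline arrives.

```cpp
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical name: a streambuf that hands each completed line to a callback.
class LineCallbackBuf : public std::streambuf
{
public:
    using Callback = std::function<void(const std::string&)>;

    explicit LineCallbackBuf(Callback cb) : callback_(std::move(cb)) {}

protected:
    // No put area is ever set up, so every character written through
    // sputc() (and through the default xsputn()) arrives here.
    int_type overflow(int_type ch) override
    {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);   // nothing to write, not a failure

        const char c = traits_type::to_char_type(ch);
        if (c == '\n') {
            callback_(line_);                  // the line, without its '\n'
            line_.clear();
        } else {
            line_.push_back(c);
        }
        return ch;
    }

    // sync() is deliberately not overridden: the default does nothing,
    // so std::flush never emits a partial line.

private:
    Callback callback_;
    std::string line_;
};
```

To use it, construct the buffer with a lambda and pass its address to a plain std::ostream, for example `LineCallbackBuf buf(cb); std::ostream s(&buf);`. The external library then only ever sees the std::ostream. The buffer has to outlive the stream that points at it.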

James Kanze
  • Confused about when overflow would be called. According to the reference: "This function is called by public member functions of its base class such as sputc to put a character when there are no writing positions available at the put pointer (pptr)." [check here](http://www.cplusplus.com/reference/fstream/filebuf/overflow/) – Dong Li Oct 17 '18 at 06:11

I would probably start by implementing a streamable device and then wrapping that in a boost::iostreams::stream. Take a look at Boost.Iostreams and use this as a starter:

#include <iosfwd>                           // streamsize, seekdir
#include <boost/iostreams/categories.hpp>   // seekable_device_tag
#include <boost/iostreams/positioning.hpp>  // stream_offset

#include <boost/function.hpp>

class MyDevice
{

  public:
    typedef boost::function<void()> Callback; // or whatever the signature should be
    typedef char                                   char_type;
    typedef boost::iostreams::seekable_device_tag  category;

    explicit MyDevice(Callback &callback);

    std::streamsize read(char* s, std::streamsize n);
    std::streamsize write(const char* s, std::streamsize n);
    std::streampos seek(boost::iostreams::stream_offset off, std::ios_base::seekdir way);

  private:
    MyDevice();
    Callback myCallback;
};

That would be the basic declaration. You will need to define in your .cpp file how each function is implemented. One of these functions might be implemented as follows:

std::streamsize MyDevice::write(const char* s, std::streamsize n)
{
    // process written data
    // fire callback
    myCallback();
    // etc
    return n; // report that all n characters were consumed
}

Then to use from elsewhere, e.g. in your main function:

MyDevice::Callback callback; // some function
MyDevice device(callback);
boost::iostreams::stream<MyDevice> stream(device);
stream << data; // etc.
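
The "process written data" step is where the line splitting would have to happen. Below is a Boost-free sketch of that step only. LineSplitter is a hypothetical helper, not a Boost.Iostreams type, and its write() merely mirrors the signature declared above so that something like MyDevice::write could forward to it. It assumes the callback takes the line as a const std::string& (unlike the void() placeholder above) and that "\r\n" should be treated like "\n".

```cpp
#include <algorithm>
#include <functional>
#include <ios>
#include <string>
#include <utility>

// Hypothetical helper: accumulates characters and calls back once per line.
class LineSplitter
{
public:
    using Callback = std::function<void(const std::string&)>;

    explicit LineSplitter(Callback cb) : callback_(std::move(cb)) {}

    // Consumes n characters starting at s; returns the number consumed.
    std::streamsize write(const char* s, std::streamsize n)
    {
        const char* const end = s + n;
        const char* pos = s;
        while (pos != end) {
            const char* nl = std::find(pos, end, '\n');
            pending_.append(pos, nl);
            if (nl == end)
                break;                    // no newline yet: keep the partial line
            if (!pending_.empty() && pending_.back() == '\r')
                pending_.pop_back();      // treat "\r\n" like "\n"
            callback_(pending_);
            pending_.clear();
            pos = nl + 1;                 // skip the '\n'
        }
        return n;                         // everything was consumed
    }

private:
    Callback callback_;
    std::string pending_;
};
```

Searching whole chunks with std::find (or memchr) avoids a virtual call per character, at the cost of keeping a pending buffer across calls.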
Ben J
  • Well, as I wrote in the question I posted, I want to avoid tokenizing the lines myself... especially taking care of the \n and \r\n stuff. The suggested solution will have to tokenize the string inside MyDevice::write. – Alex Shtoff Jul 28 '14 at 11:11

It appears you are just looking for line-buffering, and standard ostream already does that (unless you specifically request it not to using either std::unitbuf or std::flush manipulators).

Now, long lines could overflow the output buffer and trigger an "early" flush, but keep in mind that this can happen anyway: if output is to a file, the OS will apply the same kind of buffering strategies, and it doesn't really matter in which chunks you flush long lines.

If output is to a socket, for example, then you may send the buffer in one go (if you take good care to make sure the buffer is large enough), but the TCP/IP layer is free to break the stream up into packets depending on the tuning and limits of your networking hardware.

sehe
  • A standard ostream will not invoke my own function for every line, and as you said may spontaneously flush() if the line is too long. Is there a way to write a standard ostream that will **not** store its output anywhere and instead will direct it to a custom callback function? – Alex Shtoff Jul 28 '14 at 11:18
  • @Alex If you write a custom streambuf and override `overflow()`, yes – sehe Jul 28 '14 at 11:19
  • Note that, in this case, it looks more as if you just want to implement a (virtual) stream with a custom backend, of which there are /many many/ samples, rather than a 'signalling' standard stream? – sehe Jul 28 '14 at 11:21
  • It is quite equivalent. The custom backend can either signal to a callback, or directly call the function I want. It does not matter. What matters is that the function is called once for each line. – Alex Shtoff Jul 28 '14 at 11:23
  • http://en.cppreference.com/w/cpp/io/basic_streambuf/overflow is your friend then. And you can search SO for examples doing this. (By the way, it is _not_ equivalent, because in one case the `streambuf` _also_ does its usual work; in the other case, all processing can be in the trigger and the streambuf doesn't have to have any other side effects) – sehe Jul 28 '14 at 11:30
  • Standard ostream does _not_ line buffer. There is, in fact, no line buffering in C++. – James Kanze Jul 28 '14 at 12:13
  • @JamesKanze It happens at least on the OS level. Otherwise `setvbuf`, `unbuffer` and `std::ios::unitbuf` wouldn't be a thing, right? I think you're right for vanilla `basic_stream` – sehe Jul 28 '14 at 12:57
  • @sehe Line buffering _doesn't_ happen, at least not with any of the `ostream` I've used. Regardless of the level. `setvbuf` only affects `FILE*` output, and there is no `unbuffer` function. `unitbuf` flushes after every `<<`, which isn't what he's looking for either. – James Kanze Jul 28 '14 at 13:15
  • @JamesKanze: untrue. the standard iostreams (cout in particular) when synchronized with stdio (the default) are said to work as if writing each character with `fputc` to the corresponding C `FILE*` stream. `stdout` is by default line-buffered per C standard. In practice I actually experience line-buffering on my system (clang with libc++, FreeBSD 10.0). – Yakov Galka Jul 29 '14 at 06:43
  • @JamesKanze No one suggested that unitbuf is what the OP is after. However, it _exists for a reason_. Also: [`unbuffer`](http://linuxcommand.org/man_pages/unbuffer1.html) – sehe Jul 29 '14 at 07:01
  • @ybungalobill According to the C standard, standard input and output may be fully buffered if the output device is not interactive. According to the C++ standard, synchronization only affects the standard streams (which aren't in question here), and _according to the C++ standard_, it is the `FILE*` functions whose behavior should adapt to that of the streams, and not vice versa. (But since all this only concerns the standard stream objects anyway, and there's no mention of them here, it's all irrelevant to the question.) – James Kanze Jul 29 '14 at 08:04
  • @sehe `unitbuf` does exist for a reason; it's just irrelevant here. And there is no function `unbuffer` in either C or C++; the page you cite seems like the man page for a utility program for some unknown Unix-like. It's not Posix, at any rate. – James Kanze Jul 29 '14 at 08:07
  • @JamesKanze Sigh. No one claimed `unbuffer` is a library function, or that it's standard _either_... Very clearly, std::cout and ofstream are being line-buffered by default on many implementations (and _not_ just when using `std::endl`). You do with that information whatever you need; put your fingers in your ears if you don't want to hear about it because it might not be standard-specified. I don't see why I need to justify anything there. – sehe Jul 29 '14 at 08:28
  • "It appears you are just looking for line-buffering, and standard ostream already does that": the posted answer starts with a false statement (and none of the implementations I've seen do line-buffering on anything but the standard streams, and not always then). I don't know where `unbuffer` came in, but it is clearly not relevant to the OP's problem either (and probably not available). – James Kanze Jul 29 '14 at 08:50
  • @JamesKanze Do I mention it in the answer? No. Why do you think it should be relevant to the OP's problem? I use the comments to discuss, and the answer to, well, answer. And, indeed I assumed things when trying to understand what the OP meant. I stated my assumption right away. Your answer (interestingly, doing what I suggested) already has my upvote for ages. I don't see what the big issue is. – sehe Jul 29 '14 at 08:56