
Is there a canonical, public, or free variant of std::stringstream where I don't pay for a full string copy each time I call str()? (Possibly by providing a direct c_str() member in the ostream class?)

I've found two questions here:

And "of course" the deprecated std::strstream class does allow direct buffer access, although its interface is really quirky (apart from it being deprecated).

It also seems one can find several code samples that do explain how one can customize std::streambuf to allow direct access to the buffer -- I haven't tried it in practice, but it seems quite easily implemented.
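A minimal sketch of such a customization, assuming only the protected pbase() and pptr() accessors that std::basic_streambuf provides. The names viewable_stringbuf and viewable_ostream are hypothetical, not part of the standard library or Boost, and the exposed range is not null-terminated:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: a stringbuf that exposes its put area without copying.
class viewable_stringbuf : public std::stringbuf {
public:
    viewable_stringbuf() : std::stringbuf(std::ios_base::out) {}

    // Start of the characters written so far. NOT null-terminated, and
    // invalidated by any later write that grows the buffer.
    const char* data() const { return pbase(); }

    // Number of characters written so far.
    std::size_t size() const {
        return static_cast<std::size_t>(pptr() - pbase());
    }
};

// Hypothetical ostream that owns the buffer above.
class viewable_ostream : public std::ostream {
public:
    viewable_ostream() : std::ostream(static_cast<std::streambuf*>(0)) {
        // Attach the buffer only after the member has been constructed.
        rdbuf(&buf_);
    }
    const char* data() const { return buf_.data(); }
    std::size_t size() const { return buf_.size(); }

private:
    viewable_stringbuf buf_;
};

// Usage: format into the stream, then read the result as pointer + length.
inline void example_view_usage() {
    viewable_ostream msg;
    msg << "Error " << 42 << " while testing!";
    const std::string copy(msg.data(), msg.size()); // copy only if needed
    assert(copy == "Error 42 while testing!");
}
```

This only helps when the consumer accepts a pointer and a length; an API that expects a C string still needs a terminator.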

My question here is really twofold:

  • Is there any deeper reason why std::[o]stringstream (or, rather, basic_stringbuf) does not allow direct buffer access, but only access through an (expensive) copy of the whole buffer?
  • Given that it seems easy, but not trivial, to implement this, is there any variant available via Boost or other sources that packages this functionality?

Note: The performance hit of the copy that str() makes is very measurable(*), so it seems weird to have to pay for this when the use cases I have seen so far really never need a copy returned from the stringstream. (And if I'd need a copy I could always make it at the "client side".)


(*): With our platform (VS 2005), the results I measure in the release version are:

// tested in a tight loop:

// variant stream: run time : 100%
std::stringstream msg;
msg << "Error " << GetDetailedErrorMsg() << " while testing!";
DoLogErrorMsg(msg.str().c_str());

// variant string: run time: *** 60% ***
std::string msg;
((msg += "Error ") += GetDetailedErrorMsg()) += " while testing!";
DoLogErrorMsg(msg.c_str());

So using a std::string with += (which obviously only works when I don't need custom/number formatting) is 40% faster than the stream version, and as far as I can tell this is only due to the completely superfluous copy that str() makes.

Martin Ba
  • Can't you use a custom allocator to get around this or is it still going to copy? I have allocators lying around. Should probably test it :l – Brandon Feb 27 '14 at 19:40
  • I wouldn't be surprised if a lot of the overhead was elsewhere. If you're creating the stream inside the loop, that can add more than you'd like. A stream also uses a locale for most formatting, which adds quite a bit of overhead for a simple operation like concatenating a few strings. – Jerry Coffin Feb 27 '14 at 19:50
  • @MartinBa: Yeah--problem is that if you don't use the result, VC++ is pretty good at eliminating all (or at least most of) the calculation you did, so failing to use the result *probably* removed most of the formatting and such as well. – Jerry Coffin Feb 27 '14 at 20:02
  • I don't know about Boost but to answer your first question: It's an amendment from the traditional IOStreams. The old deprecated class of streams `std::strstream` had an `str()` method that returned a pointer to its internal buffer. To protect against invalidation of the pointer, the stream had to be "frozen", meaning the buffer could not be resized after accessing it. The Standard IOStreams returns a copy of the buffer as a `std::basic_string` object so that freezing the stream is no longer necessary. – David G Feb 27 '14 at 20:06
  • @JerryCoffin: I re-measured. `str()` *is* measurable, but stringstream is *still* slower even without it. What a mess. I guess the question still holds though, as without the copy behavior of `str()` it would be slightly less crappy. – Martin Ba Feb 27 '14 at 20:46

2 Answers


I will try to provide an answer to my first bullet,

Is there any deeper reason why std::ostringstream does not allow direct buffer access

Looking at how a streambuf / stringbuf is defined, we can see that the buffer character sequence is not null-terminated.

As far as I can see, a (hypothetical) const char* std::ostringstream::c_str() const; function, providing direct read-only buffer access, can only make sense when the valid buffer range is always null-terminated -- i.e. (I think) when sputc always makes sure that it inserts a terminating null character after the character it inserts.

I wouldn't think that this is a technical hindrance per se, but given the complexity of the basic_streambuf interface, I'm not at all sure whether such a scheme would be correct in all cases.
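One way to get a terminator without touching every sputc is to add it on demand. The sketch below is an assumption-laden illustration, not how any standard library does it: cstr_stringbuf is a hypothetical name, c_str() has to be non-const because it writes into the put area, and mixing it with str() is best avoided, since some implementations may then count the terminator as part of the content.

```cpp
#include <cassert>
#include <cstring>
#include <ostream>
#include <sstream>

// Hypothetical helper: a stringbuf whose c_str() null-terminates on demand.
class cstr_stringbuf : public std::stringbuf {
public:
    cstr_stringbuf() : std::stringbuf(std::ios_base::out) {}

    // Non-const: writes a terminator just past the current content.
    const char* c_str() {
        // sputc() grows the buffer through overflow() if the put area is full.
        if (traits_type::eq_int_type(sputc('\0'), traits_type::eof()))
            return 0;  // the buffer could not be grown
        pbump(-1);     // step back so the next write overwrites the terminator
        return pbase();
    }
};

// Usage: the pointer is valid until the next write to the stream.
inline void example_cstr_usage() {
    cstr_stringbuf buf;
    std::ostream os(&buf);
    os << "Error " << 7;
    assert(std::strcmp(buf.c_str(), "Error 7") == 0);
    os << " while testing!";
    assert(std::strcmp(buf.c_str(), "Error 7 while testing!") == 0);
}
```

The design choice here is to pay for the terminator only when a C string is actually requested, rather than on every inserted character.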

Martin Ba

As for the second bullet

Given that it seems easy, but not trivial to implement this, is there any variant available via boost or other sources, that package this functionality?

There is Boost.Iostreams and it even contains an example of how to implement an (o)stream Sink with a string.

I came up with a little test implementation to measure it:

#include <string>
#include <boost/iostreams/stream.hpp>
#include <libs/iostreams/example/container_device.hpp> // container_sink

namespace io = boost::iostreams;
namespace ex = boost::iostreams::example;
typedef ex::container_sink<std::wstring> wstring_sink;
struct my_boost_ostr : public io::stream<wstring_sink> {
    typedef io::stream<wstring_sink> BaseT;
    std::wstring result;
    my_boost_ostr() : BaseT(result)
    { }

    // Note: This is non-const for flush.
    // Suboptimal, but OK for this test.
    const wchar_t* c_str() {
        flush();
        return result.c_str();
    }
};

In the tests I did, using this with its c_str() helper ran slightly faster than a normal ostringstream with its copying str().c_str() version.

I do not include measuring code. Performance in this area is very brittle, so make sure to measure your use case yourself! (For example, the constructor overhead of a string stream is non-negligible.)
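For readers who want a starting point, here is a rough timing harness. It is a sketch under stated assumptions, not the code used for the numbers in this thread: it assumes a C++11 compiler for std::chrono, detail_msg() is a hypothetical stand-in for GetDetailedErrorMsg(), and the result sizes are summed into a sink so the optimizer cannot discard the work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for the question's GetDetailedErrorMsg().
inline std::string detail_msg() { return "disk full"; }

// Stream variant: constructs a stream, formats, and copies out via str().
inline std::string build_with_stream() {
    std::ostringstream msg;
    msg << "Error " << detail_msg() << " while testing!";
    return msg.str();
}

// String variant: plain concatenation, no stream involved.
inline std::string build_with_string() {
    std::string msg;
    ((msg += "Error ") += detail_msg()) += " while testing!";
    return msg;
}

// Runs fn `iterations` times and returns the elapsed time in milliseconds.
// Result sizes are added to `sink` so that the calls have an observable effect.
template <typename Fn>
double time_ms(Fn fn, int iterations, std::size_t& sink) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink += fn().size();
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling time_ms(build_with_stream, n, sink) and time_ms(build_with_string, n, sink) with the same n gives comparable numbers for one machine and one build configuration only. Note that the stream variant here includes the stream's construction cost, so it does not isolate the cost of str() on its own.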

Martin Ba