10

I have a named std::string that I want to fill with data via an std::ostream interface while avoiding a string copy.
One way to do it, which does involve a copy, is this:

bool f(std::string& out)
{
   std::ostringstream ostr;
   fillWithData(ostr);
   out = ostr.str(); // 2 copies here
   return true;
}

I need to pass the result through out and cannot return ostr.str().
I want to avoid the copies in out = ostr.str(); since this string may be very big.

Is there some way, maybe using rdbuf()s, to bind the std::ostream buffer directly to out?

To clarify, I am interested in the auto-expanding behaviour of std::string and std::ostream so that the caller does not have to know the size before the call.

UPDATE: I just realized that the innocuous line out = ostr.str(); will probably entail 2 copies:

  1. The first by the str() call
  2. The other by the std::string assignment operator.
Adi Shavit
  • 2
    Does not look possible. Could your function `f` take an `std::ostringstream &` as argument? – Didier Trosset Apr 30 '14 at 13:49
  • 1
    The problem with what you want to do is that the C++ standard does not require *std::stringbuf*'s internal buffer to be a *std::string* – indeterminately sequenced Apr 30 '14 at 13:52
  • @DidierTrosset: Not really. The internal usage of `std::ostringstream` is an implementation detail. – Adi Shavit Apr 30 '14 at 14:03
  • 2
    Note: The second copy (assignment) is likely optimized (should be no issue) –  May 01 '14 at 12:53
  • Similar thread: https://stackoverflow.com/questions/26266525/move-the-string-out-of-a-stdostringstream – M.M Jul 30 '18 at 10:15
  • 1
    There is no second copy. In every compiler worth its salt that copy is elided. Beginning C++11, even if it wasn't, the string is moved. Beginning C++17, the copy is elided. `ostringstream` doesn't support creating a string reusing its memory, not with the current standard. – ytoledano Jun 25 '19 at 18:25

3 Answers

6

Write your own stream:

#include <ostream>
#include <streambuf>
#include <string>

template <typename Char, typename Traits = std::char_traits<Char>>
class BasicStringOutputBuffer : public std::basic_streambuf<Char, Traits>
{
    // Types
    // =====

    private:
    typedef std::basic_streambuf<Char, Traits> Base;

    public:
    typedef typename Base::char_type char_type;
    typedef typename Base::int_type int_type;
    typedef typename Base::pos_type pos_type;
    typedef typename Base::off_type off_type;
    typedef typename Base::traits_type traits_type;

    // Pass the traits through so string_type matches the stream's traits.
    typedef std::basic_string<char_type, traits_type> string_type;

    // Element Access
    // ==============

    public:
    const string_type& str() const  { return m_str; }
    string_type& str() { return m_str; }

    // Stream Buffer Interface
    // =======================

    protected:
    virtual std::streamsize xsputn(const char_type* s, std::streamsize n);
    virtual int_type overflow(int_type);

    // Utilities
    // =========

    protected:
    int_type eof() { return traits_type::eof(); }
    bool is_eof(int_type ch) { return ch == eof(); }

    private:
    string_type m_str;
};

// Put Area
// ========

template < typename Char, typename Traits>
std::streamsize
BasicStringOutputBuffer<Char, Traits>::xsputn(const char_type* s, std::streamsize n) {
    m_str.append(s, n);
    return n;
}

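template < typename Char, typename Traits>
typename BasicStringOutputBuffer<Char, Traits>::int_type
BasicStringOutputBuffer<Char, Traits>::overflow(int_type ch)
{
    // Nothing to store for eof; report success with a non-eof value.
    if(is_eof(ch)) return traits_type::not_eof(ch);
    else {
        char_type c = traits_type::to_char_type(ch);
        // Return the character written, or eof if the append fell short.
        return xsputn(&c, 1) == 1 ? ch : eof();
    }
}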


// BasicStringOutputStream
//=============================================================================

template < typename Char, typename Traits = std::char_traits<Char> >
class BasicStringOutputStream : public std::basic_ostream<Char, Traits>
{
    protected:
    typedef std::basic_ostream<Char, Traits> Base;

    public:
    typedef typename Base::char_type char_type;
    typedef typename Base::int_type int_type;
    typedef typename Base::pos_type pos_type;
    typedef typename Base::off_type off_type;
    typedef typename Base::traits_type traits_type;
    typedef typename BasicStringOutputBuffer<Char, Traits>::string_type string_type;

    // Construction
    // ============

    public:
    BasicStringOutputStream()
    :   Base(&m_buf)
    {}

    // Element Access
    // ==============

    public:
    const string_type& str() const { return m_buf.str(); }
    string_type& str() { return m_buf.str(); }

    private:
    BasicStringOutputBuffer<Char, Traits> m_buf;
};

typedef BasicStringOutputStream<char> StringOutputStream;


// Test
// ====

#include <iostream>

int main() {
    StringOutputStream stream;
    stream << "The answer is " << 42;
    std::string result;
    result.swap(stream.str());
    std::cout << result << '\n';

}

Note: You might manage the put area pointers in a more sophisticated implementation.
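
As a rough sketch of how the same idea could be applied to the question's f(), the buffer can hold a reference to the caller's string instead of owning one, so nothing needs to be swapped or copied out afterwards. The names StringRefBuf and the body of fillWithData() below are assumptions made up for illustration; only f(std::string& out) comes from the question.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical buffer that appends every write to a caller-owned string.
class StringRefBuf : public std::streambuf
{
public:
    explicit StringRefBuf(std::string& target) : m_target(target) {}

protected:
    // Bulk writes (ostream::write, string insertion) land here.
    std::streamsize xsputn(const char* s, std::streamsize n) override
    {
        m_target.append(s, static_cast<std::size_t>(n));
        return n;
    }

    // Single characters land here because no put area is ever set up.
    int_type overflow(int_type ch) override
    {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            m_target.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }

private:
    std::string& m_target;
};

// Stand-in for the question's fillWithData(); the body is made up.
inline void fillWithData(std::ostream& os)
{
    os << "The answer is " << 42;
}

bool f(std::string& out)
{
    out.clear();              // start from an empty string, keep its capacity
    StringRefBuf buf(out);    // the stream now writes straight into out
    std::ostream os(&buf);
    fillWithData(os);
    return os.good();
}
```

The string grows on demand through append() and push_back(), so the caller still does not need to know the size in advance.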

  • +1 for the comprehensive solution. Wouldn't Boost iostreams make the code shorter and less bug-prone? – Adi Shavit May 01 '14 at 06:48
  • @AdiShavit I think boost::iostreams are convenience wrappers not providing additional functionality (but I might be wrong) –  May 01 '14 at 07:12
0

Here's my custom stream buffer solution from https://stackoverflow.com/a/51571896/577234. It's much shorter than Dieter's because it only needs to implement overflow(). It also performs better for repeated ostream::put() calls because it sets up a put area. Performance for large writes using ostream::write() should be the same, since those go through xsputn() instead of overflow().

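#include <algorithm>  // max
#include <cstdint>    // uint8_t
#include <cstdio>     // EOF
#include <streambuf>
#include <vector>
using namespace std;

class MemoryOutputStreamBuffer : public streambuf
{
public:
    MemoryOutputStreamBuffer(vector<uint8_t> &b) : buffer(b)
    {
    }
    int_type overflow(int_type c)
    {
        size_t size = this->size();   // can be > oldCapacity due to seeking past end
        size_t oldCapacity = buffer.size();

        // grow the vector, then point the put area at its storage
        size_t newCapacity = max(oldCapacity + 100, size * 2);
        buffer.resize(newCapacity);

        char *b = (char *)&buffer[0];
        setp(b, &b[newCapacity]);
        pbump(size);
        if (c != EOF)
        {
            buffer[size] = c;
            pbump(1);
        }
        return traits_type::not_eof(c);   // returning EOF here would signal failure
    }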
  #ifdef ALLOW_MEM_OUT_STREAM_RANDOM_ACCESS
    // no MemoryOutputStreamBuffer:: qualifier inside the class body (GCC rejects it)
    streampos seekpos(streampos pos,
                      ios_base::openmode which)
    {
        setp(pbase(), epptr());
        pbump(pos);
        // GCC's streambuf doesn't allow put pointer to go out of bounds or else xsputn() will have integer overflow
        // Microsoft's does allow out of bounds, so manually calling overflow() isn't needed
        if (pptr() > epptr())
            overflow(EOF);
        return pos;
    }
    // redundant, but necessary for tellp() to work
    // https://stackoverflow.com/questions/29132458/why-does-the-standard-have-both-seekpos-and-seekoff
    streampos seekoff(streamoff offset,
                      ios_base::seekdir way,
                      ios_base::openmode which)
    {
        streampos pos;
        switch (way)
        {
        case ios_base::beg:
            pos = offset;
            break;
        case ios_base::cur:
            pos = (pptr() - pbase()) + offset;
            break;
        case ios_base::end:
            pos = (epptr() - pbase()) + offset;
            break;
        }
        return seekpos(pos, which);
    }
#endif    
    size_t size()
    {
        return pptr() - pbase();
    }
private:
    std::vector<uint8_t> &buffer;
};
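
One detail worth illustrating: because overflow() resizes the vector to the new capacity, the vector ends up larger than the data written, so the caller has to trim it to size() afterwards. The following is a condensed sketch of that usage, not the original class; VectorOutBuf and writeHello() are hypothetical names, and seeking support is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <streambuf>
#include <vector>

// Hypothetical condensed variant of the buffer above (no seeking support).
class VectorOutBuf : public std::streambuf
{
public:
    explicit VectorOutBuf(std::vector<std::uint8_t>& b) : buffer(b) {}

    // Number of bytes actually written so far.
    std::size_t size() const
    {
        return static_cast<std::size_t>(pptr() - pbase());
    }

protected:
    int_type overflow(int_type c) override
    {
        const std::size_t used = size();
        // Grow the vector and point the put area at its storage.
        buffer.resize(std::max(buffer.size() + 100, used * 2));
        char* b = reinterpret_cast<char*>(buffer.data());
        setp(b, b + buffer.size());
        pbump(static_cast<int>(used));
        if (!traits_type::eq_int_type(c, traits_type::eof()))
        {
            *pptr() = traits_type::to_char_type(c);
            pbump(1);
        }
        return traits_type::not_eof(c);
    }

private:
    std::vector<std::uint8_t>& buffer;
};

// Writes through the stream, then trims the spare capacity.
std::vector<std::uint8_t> writeHello()
{
    std::vector<std::uint8_t> bytes;
    VectorOutBuf buf(bytes);
    std::ostream os(&buf);
    os << "hello world";
    bytes.resize(buf.size());   // drop the unused tail left by overflow()
    return bytes;
}
```

Without the final resize(), the vector would still contain the zero-filled slack that overflow() allocated ahead of the put pointer.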

They say a good programmer is a lazy one, so here's an alternate implementation I came up with that needs even less custom code. However, there's a risk of memory leaks because it hijacks the buffer inside MyStringBuffer but never destroys MyStringBuffer. It also relies on how the implementation stores std::stringbuf's data, which the standard does not specify. In practice, it doesn't leak with GCC's streambuf, which I confirmed using AddressSanitizer.

class MyStringBuffer : public stringbuf
{
public:
  uint8_t &operator[](size_t index)
  {
    uint8_t *b = (uint8_t *)pbase();
    return b[index];
  }
  size_t size()
  {
    return pptr() - pbase();
  }
};

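// caller is responsible for freeing _out
void Test(uint8_t *&_out, size_t &size)
{
  // raw storage for the buffer object; alignas keeps the placement new well-aligned
  alignas(MyStringBuffer) uint8_t dummy[sizeof(MyStringBuffer)];
  // construct MyStringBuffer in the existing memory; its destructor is deliberately never run
  MyStringBuffer &buf = *new (dummy) MyStringBuffer;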
  ostream out(&buf);

  out << "hello world";
  _out = &buf[0];
  size = buf.size();
}
Yale Zhang
0

There is no second copy: ostr.str() returns a temporary, so since C++11 the assignment to out is a move assignment.
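
To illustrate by analogy why assigning from a returned temporary does not copy, here is a small sketch with a made-up type that counts which assignment operator runs. Tracker, make() and assignFromTemporary() are hypothetical names; std::string itself offers no such counters.

```cpp
// Hypothetical type that counts how it gets assigned.
struct Tracker
{
    int copies = 0;
    int moves = 0;
    Tracker() = default;
    Tracker(const Tracker&) = default;
    Tracker(Tracker&&) = default;
    Tracker& operator=(const Tracker&) { ++copies; return *this; }
    Tracker& operator=(Tracker&&) noexcept { ++moves; return *this; }
};

// Returns by value, like the const overload of ostringstream::str().
Tracker make() { return Tracker(); }

// Assigns from a temporary and returns the object holding the counters.
Tracker assignFromTemporary()
{
    Tracker t;
    t = make();   // rvalue on the right-hand side: move assignment is selected
    return t;
}
```

The same overload resolution applies to out = ostr.str(): the right-hand side is a prvalue, so std::string's move assignment operator is chosen.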

Since C++20, std::ostringstream provides a new rvalue-qualified overload of str() that moves the underlying string out of the stream and returns it as a std::string:

std::basic_string<CharT,Traits,Allocator> str() &&;

Therefore, you may avoid the first copy in this way:

bool f(std::string& out)
{
   std::ostringstream ostr;
   fillWithData(ostr);
   out = std::move(ostr).str();
   return true;
}

References

https://en.cppreference.com/w/cpp/io/basic_ostringstream/str

searchstar