For I/O work, I need to read N bytes into a buffer. N is known at run time (not compile time). The buffer size will never change. The buffer is passed to other routines to compress, encrypt, etc: it's just a sequence of bytes, nothing higher than that.

In C, I would allocate the buffer with malloc and then free it when I'm done. However, my code is modern C++, certainly no mallocs in place, and very few raw new and delete: I'm making heavy use of RAII and shared_ptr. None of those techniques seem appropriate for this buffer, however. It's just a fixed length buffer of bytes, to receive I/O and make its contents available.

Is there a modern C++ idiom to do this elegantly? Or, for this aspect, should I just stick with good ol' malloc?

SRobertJames

5 Answers

Basically, you have two main C++-way choices:

  • std::vector
  • std::unique_ptr

I'd prefer the second, since you don't need the automatic resizing that std::vector provides, and you don't need a container: you just need a buffer.

std::unique_ptr has a specialization for dynamic arrays: std::unique_ptr<int[]> calls delete[] in its destructor and provides an appropriate operator[].

If you want the code:

std::unique_ptr<char[]> buffer(new char [size]);
some_io_function(buffer.get(), size); // get() returns the raw pointer

Unfortunately, it has no way to retrieve the size of the buffer, so you'll have to store the size in a separate variable. If that is a problem, std::vector will do the job:

std::vector<char> buffer(size);
some_io_function(buffer.data(), buffer.size()); // data() returns the raw pointer

If you want to pass the buffer around, it depends on how exactly you do it.

Consider the following case: the buffer is filled somewhere, then processed somewhere else, stored for some time, then written somewhere and destroyed. In that case no two places in the code ever need to own the buffer at the same time, and you can simply std::move it from place to place. For this use case, std::unique_ptr works perfectly and protects you from accidentally copying the buffer (with std::vector you can copy it by mistake, and no error or warning will arise).
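To make the move-only hand-off concrete, here is a minimal sketch. The names `xor_stage` and `same_allocation_after_move` are hypothetical and exist only for illustration; the point is that ownership travels with std::move while the bytes themselves stay in the same allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical pipeline stage: takes ownership of the buffer, transforms it
// in place, and hands ownership back to the caller.
std::unique_ptr<char[]> xor_stage(std::unique_ptr<char[]> buf,
                                  std::size_t size, char key)
{
    assert(buf != nullptr);
    for (std::size_t i = 0; i < size; ++i)
        buf[i] ^= key;
    return buf; // ownership moves out; the bytes are not copied
}

// Fills a buffer, moves it through the stage, and reports whether the
// same allocation came back and the source pointer was emptied.
bool same_allocation_after_move(std::size_t size)
{
    std::unique_ptr<char[]> buf(new char[size]);
    std::memset(buf.get(), 'a', size);
    const char* before = buf.get();

    // auto copy = buf;  // would not compile: unique_ptr cannot be copied
    std::unique_ptr<char[]> out = xor_stage(std::move(buf), size, 0x20);

    return buf == nullptr && out.get() == before;
}
```

Because the size is not stored in the pointer, it has to be passed alongside it at every step, which is the trade-off mentioned above.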

If, conversely, you need several places in the code to hold the same buffer (maybe it is filled / used / processed in more than one place simultaneously), you need std::shared_ptr. Before C++17 it has no array specialization, so you have to pass an appropriate deleter:

std::shared_ptr<char> buffer(new char[size], std::default_delete<char[]>());
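A short sketch of shared ownership using the deleter form shown above. The helper names `make_shared_buffer` and `owners_while_shared` are made up for this example. Since C++17 the same thing can be spelled `std::shared_ptr<char[]>`, which selects `delete[]` by itself; the sketch sticks to the older form so that it also builds as C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Allocates `size` zeroed bytes owned by a shared_ptr. The explicit deleter
// makes sure delete[] (not delete) runs when the last owner goes away.
std::shared_ptr<char> make_shared_buffer(std::size_t size)
{
    return std::shared_ptr<char>(new char[size](),
                                 std::default_delete<char[]>());
}

// Creates two owners of one buffer and returns the owner count seen while
// both are alive. Assumes size >= 1 because it writes to the first byte.
long owners_while_shared(std::size_t size)
{
    assert(size >= 1);
    std::shared_ptr<char> a = make_shared_buffer(size);
    std::shared_ptr<char> b = a;      // second owner, same bytes, no copy
    assert(a.get() == b.get());
    b.get()[0] = 'x';                 // visible through both owners
    assert(a.get()[0] == 'x');
    return a.use_count();
}
```

As with std::unique_ptr, the size still has to travel separately from the pointer.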

The third case is when you really need to copy the buffer. Then std::vector is simpler. But, as I've already mentioned, I feel that it is not the best way. You can also always copy a buffer held by std::unique_ptr or std::shared_ptr manually, which clearly documents your intention:

std::unique_ptr<char[]> buffer_copy(new char[size]);
std::copy(buffer.get(), buffer.get() + size, buffer_copy.get());
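One way to avoid carrying the size in a separate variable is a small move-only wrapper that stores a std::unique_ptr together with its length. The class below is only a sketch under that assumption: `ByteBuffer` and `clone` are hypothetical names, not part of the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical move-only byte buffer: owns its storage and remembers its size.
class ByteBuffer {
public:
    explicit ByteBuffer(std::size_t size)
        : data_(new unsigned char[size]()), size_(size) {}   // zero-filled

    // Moving transfers the allocation and leaves the source empty.
    ByteBuffer(ByteBuffer&& other) noexcept
        : data_(std::move(other.data_)), size_(other.size_) { other.size_ = 0; }

    ByteBuffer& operator=(ByteBuffer&& other) noexcept {
        if (this != &other) {
            data_ = std::move(other.data_);
            size_ = other.size_;
            other.size_ = 0;
        }
        return *this;
    }

    unsigned char*       data()       noexcept { return data_.get(); }
    const unsigned char* data() const noexcept { return data_.get(); }
    std::size_t          size() const noexcept { return size_; }

    // begin()/end() make the buffer usable in range-based for loops.
    unsigned char*       begin()       noexcept { return data_.get(); }
    unsigned char*       end()         noexcept { return data_.get() + size_; }
    const unsigned char* begin() const noexcept { return data_.get(); }
    const unsigned char* end()   const noexcept { return data_.get() + size_; }

    // Copying is possible, but only when asked for explicitly.
    ByteBuffer clone() const {
        ByteBuffer copy(size_);
        std::copy(begin(), end(), copy.begin());
        return copy;
    }

private:
    std::unique_ptr<unsigned char[]> data_;
    std::size_t size_;
};
```

Declaring the move operations suppresses the implicit copy operations, so an accidental `ByteBuffer b = a;` fails to compile, while `a.clone()` documents the intent to copy.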
lisyarus
  • Using `std::unique_ptr` may not be the best alternative for a buffer that is supposed to be reused and passed around to multiple functions. – Some programmer dude Jun 02 '15 at 11:56
  • @JoachimPileborg nothing said in the question about multiple functions. Btw, from my experience, IO buffers are usually not used this way. If it is still needed, then `std::vector` won't work either - you'll have to use `std::shared_ptr`, for example. – lisyarus Jun 02 '15 at 12:02
  • "The buffer is passed to other routines to...". Regarding using a vector, at least it's not designed to be passed around primarily by value. – Some programmer dude Jun 02 '15 at 12:08
  • @JoachimPileborg missed that part, my apologies. Well, you always can move `std::unique_ptr`, and it will also guard you from trying to copy the full buffer (where `std::vector` will happily copy). – lisyarus Jun 02 '15 at 12:16
  • What about using `std::shared_ptr`, since, as @JoachimPileborg mentioned, I need to pass it around? – SRobertJames Jun 02 '15 at 13:02
  • Didn't know `std::unique_ptr` had a specialization for arrays, nice. – rwols Jun 02 '15 at 13:20
  • Why not a tiny buffer class that tracks the size, and the unique_ptr, that one can then point to using a shared_ptr? – JoeManiaci Jun 05 '17 at 22:07
  • The problem with the `std::unique_ptr` solution is that it still uses a raw dynamic array which doesn't know its size (a source of bugs) and doesn't provide iterators (a source of bugs) and can't be used in a range based for loop. A vector on the other hand has proper built in copying and moving whereas manually copying a raw array is another source of bugs. It is hard to imagine a case where a `std::vector` would not be the superior choice. – Galik Feb 12 '18 at 11:43
  • @Galik I completely agree with the problems you described. However, as I've said in the answer, `unique_ptr` saves you from unintended copies. Furthermore, it can release the data - something extremely helpful when dealing with plain C APIs, for example. Personally, I'd prefer a class with a `unique_ptr` and `size` inside. – lisyarus Feb 12 '18 at 13:01
In C++14, there's a very syntactically clean way of achieving what you want:

size_t n = /* size of buffer */;
auto buf_ptr = std::make_unique<uint8_t[]>(n);
auto nr = ::read(STDIN_FILENO, buf_ptr.get(), n);
auto nw = ::write(STDOUT_FILENO, buf_ptr.get(), nr);
// etc.
// buffer is freed automatically when buf_ptr goes out of scope

Note that the above construct will value-initialize (zero out) the buffer. If you want to skip the initialization to save a few cycles, you'll have to use the slightly uglier form given by lisyarus:

std::unique_ptr<uint8_t[]> buf_ptr(new uint8_t[n]);

C++20 introduces std::make_unique_for_overwrite, which allows the non-initializing line above to be written more concisely as:

auto buf_ptr = std::make_unique_for_overwrite<uint8_t[]>(n);
einpoklum
Matt Whitlock
  • Another huge advantage of Matt's answer is a massive improvement in destruction time. I recently converted some code from using a large vector as a buffer, with destruction time measured in seconds, to a unique_ptr with effectively instant destruction. – Nanki Jun 16 '16 at 08:31
  • @Nanki That's hard to believe; can you explain what may have caused that? – SRobertJames Jul 25 '16 at 22:07
  • This turned out to be due to a known bug in VS2015 during debug and didn't affect release code - and it's already been fixed. – Nanki Jul 27 '16 at 06:59
  • Doesn't the usage of the scope operator, ::, with nothing in front mean it's accessing some sort of global? If not the case for this, what is ::read and ::write calling? – JoeManiaci Jun 05 '17 at 22:02
  • @JoeManiaci: Be careful with your terminology. Globals can be defined in namespaces other than the global namespace. The `::` prefix specifically denotes a symbol defined in the global namespace. I used it here to clarify that I am calling `read(…)` and `write(…)` functions defined in the global namespace, as opposed to perhaps some member functions. – Matt Whitlock Jun 07 '17 at 09:57
  • @MattWhitlock : All I see here http://www.learncpp.com/cpp-tutorial/42-global-variables/ is mention of file(global) or local scope. Are you referring to something like the Global symbolic constants that they have an example of below in that same link? – JoeManiaci Jun 15 '17 at 16:55
  • @JoeManiaci: Global functions aren't quite the same as global variables. I'm using the term "global function" as a synonym for "non-member function", that is, a function not defined as a member of a type. There appears to be some disagreement as to whether these terms are really synonymous. Google's C++ Style Guide says, "Prefer placing nonmember functions in a namespace; use completely global functions rarely." However, "C++ Plus Data Structures," by Nell B. Dale, says, "Global scope is the scope of an identifier declared outside all functions and classes." – Matt Whitlock Jun 16 '17 at 18:34
  • @MattWhitlock Ah, gotcha completely now. Thanks. – JoeManiaci Jun 30 '17 at 15:53
  • std::make_unique_default_init has been renamed to [std::make_unique_for_overwrite](https://en.cppreference.com/w/cpp/memory/unique_ptr/make_unique). [Some more details](https://stackoverflow.com/questions/58050872/what-does-stdmake-unique-for-overwrite-do-isnt-it-redundant-with-stdmake) about the function. – pcworld Jun 18 '21 at 15:30
  • If C++20 allows the line in the middle code block to be written more "concisely" as the third code block, why is the line in third code block _longer_? – davidbak Sep 27 '21 at 19:52
  • @davidbak: Haha, you noticed that. I meant lexically concisely. It may be longer, but it's syntactically simpler. Code clarity is not directly correlated with code brevity. – Matt Whitlock Sep 28 '21 at 21:07
Yes, easy:

std::vector<char> myBuffer(N);
Lightness Races in Orbit
I think it's common to use a std::vector for this.

The main benefit of std::vector over a manually allocated char buffer is copy semantics (for passing to functions that wish to modify the data for their own purposes, or when returning the data to a calling function).

A std::vector also knows its own size, which reduces the number of parameters that need to be passed to processing functions and eliminates a source of bugs.

You have complete control over how the data is passed to other functions - either by reference or const reference as appropriate.

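If you need to call an older C-style function that takes a plain char* and a length, you can easily do that too:

#include <cstddef>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <vector>

// pass by const reference to preserve data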
void print_data(const std::vector<char>& buf)
{
    std::cout.fill('0');
    std::cout << "0x";
    for(auto c: buf)
        // go through unsigned char so negative chars are not sign-extended
        std::cout << std::setw(2) << std::hex << int(static_cast<unsigned char>(c));
    std::cout << '\n';
}

// pass by reference to modify data
void process_data(std::vector<char>& buf)
{
    for(auto& c: buf)
        c += 1;
}

// pass by copy to modify data for another purpose
void reinterpret_data(std::vector<char> buf)
{
    // original data not changed
    process_data(buf);
    print_data(buf);
}

void legacy_function(const char* buf, std::size_t length)
{
    // stuff
}

int main()
{
    std::ifstream ifs("file.txt");

    // 24 character contiguous buffer
    std::vector<char> buf(24);

    while(ifs.read(buf.data(), buf.size()))
    {
        // changes data (pass by reference)
        process_data(buf);

        // modifies data internally (pass by value)
        reinterpret_data(buf);

        // non-modifying function (pass by const ref)
        print_data(buf);

        legacy_function(buf.data(), buf.size());
    }
}
Galik
  • I don't believe that really helps me. The vector needs to be passed to other functions, so I'd need to allocate it with new and then free it. Besides, is a vector really designed for this type of I/O? Is the data block guaranteed? – SRobertJames Jun 02 '15 at 13:03
  • Yes, `std::vector`'s storage is specified to be contiguous. And thanks to move semantics it is also efficient to pass the vector around by value, no need for reference semantics (and thus memory management) – Fabio Fracassi Jun 02 '15 at 13:40
  • @SRobertJames You don't need to allocate your *vector* dynamically to pass it to other functions. Typically you pass by *reference* or *const reference*. I have updated the answer to demonstrate usage. Also `std::vector` is guaranteed to be a contiguous block. – Galik Jun 02 '15 at 13:44
  • @SRobertJames I don't understand why you'd have any issues passing it to other functions - just pass it by reference - if there are lifecycle issues you should mention them. – Benjamin Gruenbaum Jun 02 '15 at 14:26
  • "Is the data block guaranteed?" I assume you want a block of memory to do a bulk read into? Pass the size you need as an argument to the constructor shown to pre-allocate the buffer and initialize it to 0. There's another form of the constructor that allows you to also pass a value to initialize to in addition to the size, e.g. `std::vector<char> buf(24, 0xff);` – Rob K Jun 02 '15 at 15:12
Use

std::vector<char> buffer(N);

If the size of the buffer never changes, you can use it as an array by doing this:

char * bufPtr = &buffer[0];

This will work in C++03 as well. See this comment https://stackoverflow.com/a/247764/1219722 for details on why this is safe.
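As a rough illustration of handing the vector's storage to a C-style routine, consider the sketch below. `fill_bytes` and `make_filled` are hypothetical stand-ins for whatever I/O call is actually used. The emptiness check is there because `buffer[0]` must not be evaluated on an empty vector; since C++11, `buffer.data()` yields the same pointer without that caveat.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical C-style routine that writes `capacity` bytes into `dst`.
std::size_t fill_bytes(char* dst, std::size_t capacity, char value)
{
    std::memset(dst, value, capacity);
    return capacity;
}

// Builds a fixed-size buffer and lets the C-style routine fill it in place.
std::vector<char> make_filled(std::size_t n, char value)
{
    std::vector<char> buffer(n);           // n zero-initialized bytes
    if (!buffer.empty()) {
        char* bufPtr = &buffer[0];         // valid in C++03 as well
        assert(bufPtr == buffer.data());   // same address as data() (C++11)
        fill_bytes(bufPtr, buffer.size(), value);
    }
    return buffer;                         // moved or elided, not copied
}
```

The pointer stays valid only as long as the vector is not resized or destroyed, which fits the fixed-size requirement in the question.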

Community
Dmitry Rubanovich