4

I want to write a library whose interface provides a read function. A C-style array is error-prone but allows passing a buffer of any size. A C++ std::array is safer but requires its size to be fixed at compile time.

// interface.h

// C-style array
int read (std::uint8_t* buf, size_t len);

// C++ array
int read (std::array<std::uint8_t, 16>& buff);

How can I have the best of both worlds?

I was thinking about a function template, but that does not seem practical for a library interface.

template <size_t N>
int read (std::array<std::uint8_t, N>& buf);

EDIT: std::vector could be a good candidate, except that it allocates dynamically, whereas char* and std::array do not require dynamic allocation.

EDIT: I like the gsl::span solution a lot. I am stuck with C++14, so no std::span. I don't know whether using a third-party library (GSL) will be an issue, or even allowed.

EDIT: I did not think that using char rather than another type could influence the answers, so to be clear: the goal is to manipulate bytes. I changed char to std::uint8_t.

EDIT: Since C++11 guarantees that a local std::vector returned by value is moved (or the copy is elided) rather than copied, returning std::vector<std::uint8_t> is acceptable.

std::vector<std::uint8_t> read();
nbout
  • Use `std::vector`? – Ron Feb 16 '18 at 15:15
  • If `N` would be a compile time constant, and you don't care about publishing the implementation of `read`, and you don't need your API to be C compatible, then there's nothing wrong with the templated version. – François Andrieux Feb 16 '18 at 15:16
  • How about [`gsl::span`](https://stackoverflow.com/q/45723819/2486888)? – Feb 16 '18 at 15:16
  • It's hard to say with the limited information on what you're doing, but you might consider iterator-based interface that doesn't require a specific container. – Fred Larson Feb 16 '18 at 15:19
  • `std::vector` has dynamic allocation but `char*` or `std::array` don't. – nbout Feb 16 '18 at 15:19
  • The `C-Style` array (`char *`) would allow for dynamic allocation; the caller just needs to pass in the size. The `std::array` would be fixed size only. With the `C-Style` array, you get the best of both worlds, but also the danger of a bad developer. The developer using your library would be able to choose whether they want their own memory dynamically allocated or static. With `std::array`, they don't have a choice. With `std::vector` it will always be dynamic, but you'd get the safety of `std::array`. I personally prefer the `C-Style` array. Let the developer worry about their own safety. – Brandon Feb 16 '18 at 15:21
  • @boutboutnico how would you know whether the `char*` points to dynamic memory? – Quentin Feb 16 '18 at 15:22
  • @Quentin Well you cannot know but you can have static allocation – nbout Feb 16 '18 at 15:25
  • @NickyC `gsl::span` is a nice thing, but I would be very careful making my library dependent on GSL just because of this issue. Hopefully, once there will be `std::span`... – Daniel Langr Feb 16 '18 at 15:34
  • Is the `char` intentional (indicating text), or do you really mean to write `std::byte` or `unsigned char` (indicating bytes)? – Cheers and hth. - Alf Feb 16 '18 at 15:47
  • The first version is fine and doesn't constrain you to a C array at all. – Lightness Races in Orbit Feb 16 '18 at 16:15
  • This question is too broad because you tell us too little about your library. Like, should it be compatible with C, is it used only in your own projects, should it be a DLL, and so on. Chances are that you can just use a templated function and that *"it does not seems practical for a library interface"* is just a wrong feeling. – Christian Hackl Feb 16 '18 at 16:23

8 Answers

6

You could do what the standard library does: Use a pair of iterators.

template <typename Iter> int read(Iter begin, Iter end)
{
    // Some static asserts to make sure `Iter` is actually a proper iterator type
}

It gives you the best of both worlds: slightly better safety and the ability to read into an arbitrary part of a buffer. It also allows you to read into non-contiguous containers.

HolyBlackCat
  • The level of 'safety' is the same as with dynamic arrays. – SergeyA Feb 16 '18 at 15:28
  • Also consider using iterator traits to restrict the type of iterator, depending on the functionality it is given. – Jorge Bellon Feb 16 '18 at 15:30
  • A read function taking a pair of iterators is very unusual. Care to explain the benefits over the standard void* and size? – SergeyA Feb 16 '18 at 15:41
  • @SergeyA It's not *that* much safer. But at least it's protected against silly bugs like `void *buf = blah; int status = read(&buf, buf_len);`. – HolyBlackCat Feb 16 '18 at 15:54
  • You don't need static_asserts in there. Just do the usual `begin++` and `begin != end`. If those won't work, the compiler will tell you. If you want to do random reads, then it might be useful to have the static_asserts and some messages, just so the user is not confused when compiling your library. – smac89 Feb 16 '18 at 15:55
  • @smac89 `static_assert` gives you a better description of the problem though. I would prefer getting *"static_assert: `Iter` is not a valid %iterator_category% iterator"* instead of tons of cryptic error messages. – HolyBlackCat Feb 16 '18 at 15:57
  • @HolyBlackCat, true, true, I just updated my comment to that effect. – smac89 Feb 16 '18 at 15:58
  • @SergeyA, I don't find it unusual to have a read function use iterators. When you think of SRP (Single responsibility principle), a function which reads should just do that, and not have to worry about the size of the buffer given to it to read from. Short of re-writing the _gets()_ function from C's long lost past, using iterators is a much cleaner approach – smac89 Feb 16 '18 at 16:04
  • @smac89 I will agree with you if you show me any `read` function from any wide-used APIs which takes a pair of iterators. – SergeyA Feb 16 '18 at 16:08
  • @SergeyA I suggest that [`std::generate`](http://en.cppreference.com/w/cpp/algorithm/generate), in a way, might fulfill that requirement. It "reads" values from the generator into the range of iterators provided. – François Andrieux Feb 16 '18 at 16:12
  • @SergeyA, as long as we are not bound to functions called **read**, just take a look at any function in `<algorithm>`: they all take a pair of iterators when it comes to reading from a container. I also maintain a csv library (currently in progress) which makes extensive use of `Reader` objects, which take a pair of iterators (`istream_iterator`, std::begin(container) iterators, whatev_iter). I also program mainly in Java, and the `Stream` api is based on iterators - albeit slightly different from std::iterators, but the point is, none of them use **size** to determine when to stop reading. – smac89 Feb 16 '18 at 16:18
  • @smac89 I believe, your Java experience is not very relevant here. And I think, there is a reason why the function is called read. – SergeyA Feb 16 '18 at 16:34
2

How can I have the best of both worlds?

By using std::vector:

  • Like std::array: it is safer than C arrays.
  • Like C arrays: it lets you write functions that must be able to take an array of arbitrary size.

EDIT: std::vector does not necessarily imply dynamic allocation (as in dynamic storage duration). That depends on the allocator used. You can still provide a user-specified stack allocator.
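
As a rough sketch of what that could look like, the function below takes the vector by reference and is templated on the allocator, so a caller-supplied allocator still works. The helper `fake_source_byte` and the convention that the caller pre-sizes the vector are assumptions made for this example only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real data source: yields 0xA0, 0xA1, ...
inline std::uint8_t fake_source_byte(std::size_t index)
{
    return static_cast<std::uint8_t>(0xA0 + index);
}

// Fills the vector up to its current size and returns the number of bytes
// written. Alloc is deduced, so vectors using a custom allocator are accepted.
template <typename Alloc>
int read(std::vector<std::uint8_t, Alloc>& buf)
{
    for (std::size_t i = 0; i < buf.size(); ++i)
    {
        buf[i] = fake_source_byte(i);
    }
    return static_cast<int>(buf.size());
}
```

The caller decides the capacity with, for example, `std::vector<std::uint8_t> buf(64);` before calling `read(buf)`.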

JFMR
1

I will go against the grain and say that, for a read-type function, taking a void* pointer and a size is likely the best option. This is the approach taken by most unformatted read functions, such as POSIX read and C's fread.
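
A minimal sketch of that shape, assuming a made-up in-memory source (`kSource`) and a "-1 on null buffer" error convention, neither of which comes from the question:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical source data standing in for a device, file, or socket.
static const std::uint8_t kSource[] = {0xDE, 0xAD, 0xBE, 0xEF};

// Copies at most len bytes into buf.
// Returns the number of bytes copied, or -1 if buf is null (assumed convention).
inline int read(void* buf, std::size_t len)
{
    if (buf == nullptr)
    {
        return -1;
    }
    const std::size_t n = std::min(len, sizeof(kSource));
    std::memcpy(buf, kSource, n);
    return static_cast<int>(n);
}
```

Any contiguous storage works on the caller's side: `read(arr.data(), arr.size())` for a `std::array` or `std::vector` of bytes, or `read(raw, sizeof raw)` for a C array.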

SergeyA
  • I'd like to enforce type-safety – nbout Feb 16 '18 at 15:36
  • How does `char*` enforce the type-safety? – SergeyA Feb 16 '18 at 15:41
  • @SergeyA: Because the OP is handling text. If it were bytes it would be `std::byte`, or `unsigned char`. That's what the code communicates to humans, anyway. – Cheers and hth. - Alf Feb 16 '18 at 15:46
  • @Cheersandhth.-Alf last time I checked, the `std::*stream` unformatted read functions took a `char*` (to be exact, a `char_type*`, where `char_type` is `char` for the default streams). – SergeyA Feb 16 '18 at 15:51
  • The streams are mainly about text, and they're documented, and they have a history back to old C. `std::byte` made it into C++17. – Cheers and hth. - Alf Feb 16 '18 at 15:55
  • @Cheersandhth.-Alf unsigned char was known even back then, and read/write are *unformatted* input/output functions, and as such have nothing to do with text. – SergeyA Feb 16 '18 at 15:56
  • @SergeyA: Uhm. When I mentioned old C it was mostly about the coding practice back then: favoring terseness, primarily communicating to the compiler. Today terseness is not so extremely highly valued, while communicating to humans (maintainers, users, future self) is. Re unformatted that doesn't mean binary, it just means not formatted. – Cheers and hth. - Alf Feb 16 '18 at 16:04
1

Why don't you use a gsl::span, which was meant for the purpose of eliminating pointer and length parameter pairs for a sequence of contiguous objects? Something like this would work:

int read(gsl::span<std::uint8_t> buf)
{
    for (auto& elem : buf)
    {
        // Do whatever with elem
    }
    return static_cast<int>(buf.size()); // assumed: report the number of bytes handled
}

The only problem is that gsl::span is not part of the C++ standard (std::span only arrived in C++20), so using it requires a third-party library such as Microsoft GSL or gsl-lite.

Here are more details about span, from Herb Sutter.
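
If adding GSL as a dependency is not allowed, the same idea can be approximated in plain C++14 with a small hand-written view type. The `byte_span` class below is a hypothetical, heavily simplified stand-in written for this sketch (it is not gsl::span and does no bounds checking), and the bytes that `read` writes are placeholders.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal substitute for gsl::span<std::uint8_t>:
// a non-owning pointer plus length over contiguous bytes.
class byte_span
{
public:
    byte_span(std::uint8_t* data, std::size_t size) : data_(data), size_(size) {}

    template <std::size_t N>
    byte_span(std::uint8_t (&arr)[N]) : data_(arr), size_(N) {}

    template <std::size_t N>
    byte_span(std::array<std::uint8_t, N>& arr) : data_(arr.data()), size_(N) {}

    byte_span(std::vector<std::uint8_t>& vec) : data_(vec.data()), size_(vec.size()) {}

    std::uint8_t* begin() const { return data_; }
    std::uint8_t* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }

private:
    std::uint8_t* data_;
    std::size_t size_;
};

// Not a template, so the real implementation could live inside the library.
// Here it just writes 0, 1, 2, ... as placeholder data.
inline int read(byte_span buf)
{
    std::uint8_t value = 0;
    for (auto& elem : buf)
    {
        elem = value++;
    }
    return static_cast<int>(buf.size());
}
```

Thanks to the implicit converting constructors, callers can pass a `std::array`, a `std::vector`, or a C array directly, and the size travels with the buffer.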

Arnav Borborah
0

Do you really care about the type of the underlying container?

template<typename Iterator>
int read_n(Iterator begin, size_t len);

Assuming this function returns the number of elements read, I would change the return type to size_t as well.

char *dyn = new char[20];
char stat[20];
std::vector<char> vec(20);
read_n(dyn, 20);
read_n(stat, 20);
read_n(vec.begin(), 20);
delete[] dyn;
thomas
  • Taking an iterator and a size is very unusual in C++. The implementation will almost certainly need to calculate the end iterator anyway. The problem gets worse with non-random access iterators, if it's desirable to support them. – François Andrieux Feb 16 '18 at 15:38
  • Indeed, but in fact I was thinking about ContiguousIterator, which allows `memcpy(&(*begin), ..., len);` to be used by the implementation – thomas Feb 16 '18 at 15:51
  • If you want to rely on memcpy then why not simply use a pointer instead? That's more what one would expect in C++. If you just want to provide a specialization for ContiguousIterator that allows better performance, I'd still go with begin and end iterators and not a length; then one could also pass something like `dyn.begin()+5, dyn.end()` without having to subtract the begin offset from the length – Jimmy R.T. Dec 09 '20 at 11:33
0

I think that when designing a library's interface you need to take into consideration where it will be used.

A library with a C interface using "char *" can be used from a wide variety of languages (C, C++, and others). Using std::array limits your library's potential clients.

The third possible variant:

struct buf_t allocBuf();

int rmBuf( struct buf_t b );

int read( struct buf_t b );

char * bufData( struct buf_t b );

size_t bufSize( struct buf_t b );

Surely it can be rewritten in C++ in a more elegant way.
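
One possible C++ rendering, offered only as a sketch: a small owning class replaces the allocBuf/rmBuf pair through its constructor and destructor. The `Buffer` name, the size argument to the constructor, the use of `std::uint8_t` instead of `char`, and the bytes that `read` writes are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical C++ counterpart of buf_t: owns its storage and releases it
// automatically, so no separate rmBuf call is needed.
class Buffer
{
public:
    explicit Buffer(std::size_t size)
        : data_(new std::uint8_t[size]()),  // value-initialized, so zero-filled
          size_(size)
    {
    }

    std::uint8_t* data() { return data_.get(); }              // was bufData
    const std::uint8_t* data() const { return data_.get(); }
    std::size_t size() const { return size_; }                // was bufSize

private:
    std::unique_ptr<std::uint8_t[]> data_;
    std::size_t size_;
};

// Fills the whole buffer with placeholder bytes 1, 2, 3, ... and returns
// the number of bytes written.
inline int read(Buffer& buf)
{
    for (std::size_t i = 0; i < buf.size(); ++i)
    {
        buf.data()[i] = static_cast<std::uint8_t>(i + 1);
    }
    return static_cast<int>(buf.size());
}
```

This gives up the C compatibility that motivated the original interface, so it would suit a C++-only client layer on top of it rather than a replacement.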

sim
0

You can write a wrapper function template that delegates to the C-interface function:

int read(std::uint8_t* buf, size_t len);

template <size_t N>
int read(std::array<std::uint8_t, N>& buf)
{
  return read(buf.data(), buf.size());
}

I've found such constructs useful when I need to expose something over a C ABI but don't want to lose some of the comforts that C++ gives me: the function template is compiled as part of the library's client code and doesn't need to be C ABI compatible, while the function it calls is C ABI compatible.

Jimmy R.T.
-3

Just return a std::vector<uint8_t>, unless this is a DLL, in which case go with a C style interface.

Note: answer changed from std::string to std::vector after change of question from char to uint8_t.
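
A minimal sketch of the return-by-value form, assuming a made-up source that always delivers three bytes:

```cpp
#include <cstdint>
#include <vector>

// Sketch: the library decides how much data there is and hands back an
// owning container. The three bytes below are placeholder data.
inline std::vector<std::uint8_t> read()
{
    std::vector<std::uint8_t> result;
    result.reserve(3);
    for (std::uint8_t b = 1; b <= 3; ++b)
    {
        result.push_back(b);
    }
    return result;  // moved or elided on return, not deep-copied
}
```

The caller simply writes `auto data = read();` and queries `data.size()` to learn how many bytes arrived.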

Cheers and hth. - Alf
  • There is nothing in the question to suggest the buffers contain strings. It may be binary data that, while `std::string` may be able to store them, is better represented by other containers. – François Andrieux Feb 16 '18 at 15:39
  • @FrançoisAndrieux: `std::string` is well suited for any byte sequence. It is so used by Google. "better represented by other containers" is an opinion I wouldn't give much credence to without supporting evidence, given that you failed to notice the plain `char` type here. – Cheers and hth. - Alf Feb 16 '18 at 15:40
  • anonymous downvoters, consider rational discussion instead of mindless voting (which just says you're unable to argue your position). consider that a `std::string` is the only up-front reasonable design. what makes you downvote the currently accepted best practice? – Cheers and hth. - Alf Feb 16 '18 at 15:43
  • I'm not anonymous. I don't recognize using `std::string` as *"the only up-front reasonable design"* nor do I recognize it as *" the currently accepted best practice"*. `std::string` is usually expected to contain displayable character data. And while that's not formally required, it's a nasty surprise for anyone not familiar with it. Perhaps it works well in your environment but I disagree that it should be recommended as a general solution to this problem, thus the downvote. I *do* see how SSO might be useful, but I consider using `std::vector` instead makes the intention much clearer. – François Andrieux Feb 16 '18 at 15:52
  • @FrançoisAndrieux: First of all `std::string` is used this way in several binary interfaces, e.g. [Google protocol buffers](https://stackoverflow.com/questions/9373325/google-protocol-buffers-and-use-of-stdstring-for-arbitrary-binary-data) (I just googled that). Secondly, the OP has vaguely indicated text (via `char` in modern C++), and not mentioned anything about binary. Oh, there's [a tool usage argument](https://stackoverflow.com/a/47537044/464581) in favor of `std::string`. And ditto conversion argument. Why not be practical? – Cheers and hth. - Alf Feb 16 '18 at 15:58
  • Oh, now the OP has changed the question. – Cheers and hth. - Alf Feb 16 '18 at 16:08
  • Is it OK to return a `std::vector` by copy? – nbout Feb 16 '18 at 16:29
  • @boutboutnico: Yes, all standard containers can be returned by copy. In many cases that will not cause actual copying, but just a logical copy, because move semantics kicks in, and/or because of Return Value Optimization. – Cheers and hth. - Alf Feb 16 '18 at 16:30
  • I was afraid it would be copied, but C++11 guarantees it will be moved. – nbout Feb 16 '18 at 16:32