I have to interface with a C library in one of my programs, and I am writing a thin wrapper so that I can use C++ types such as std::string. Quite a number of its functions accept a char* and a maximum length as parameters and overwrite the memory pointed to by the char* with a zero-terminated C string. Traditionally, I used a stack array and copied the data into a string manually.

std::string call_C_function(int x) {
    char buf[MAX_LEN] = {0}; // MAX_LEN is defined by the library
    size_t len = MAX_LEN;
    C_function(x, buf, &len);
    return std::string(buf, len);
}

I tried to avoid the copy and came up with a solution, but I don't know if this is strictly legal:

std::string call_C_function(int x) {
    std::string buf(MAX_LEN, '\0'); // MAX_LEN is defined by the library
    size_t len = MAX_LEN;
    C_function(x, &buf.front(), &len);
    buf.resize(len);
    return buf;
}

This compiles and works, but I am having doubts because the std::string class makes it quite hard to get a non-const pointer to the character data. c_str() and data() both return const pointers. The standard forbids modifications of the buffer explicitly for these functions:

21.4.7.1: The program shall not alter any of the values stored in the character array.

From the documentation, it seems that this is legal because front() returns a reference to the first character of the buffer, which must be contiguous (I am using C++11/14). In 21.4.5, the semantics of front() are defined in terms of operator[](0), which does not forbid modifications.

Are there any issues with this approach from a language standard point of view? It seems that this would be a loop-hole allowing the modification explicitly forbidden in 21.4.7.1.

Jens
  • My [previous question](http://stackoverflow.com/questions/14290795/why-is-modifying-a-string-through-a-retrieved-pointer-to-its-data-not-allowed) might help you there. – chris Dec 16 '15 at 13:39
  • Your function can just use a `std::vector` and thus skip the doubts about using `std::string` directly. – PaulMcKenzie Dec 16 '15 at 13:44
  • @chris Should have read the last paragraph... I think it answers my question – Jens Dec 16 '15 at 13:45
  • @PaulMcKenzie I could use a `vector`, but the function returns a string which should be a std::string. – Jens Dec 16 '15 at 13:47
  • @Jens You could do a `return buf.data();` for `std::vector`, and it will work correctly if the return is a `std::string`. – PaulMcKenzie Dec 16 '15 at 13:49
  • @PaulMcKenzie I am not sure I understand. In your case, `buf` is a `std::vector`. But the return will first allocate memory in the return string, and then copy the data into it. – Jens Dec 16 '15 at 13:51
  • @PaulMcKenzie, but why would you want to do two allocations and two copies? The OP's code is fine. – Jonathan Wakely Dec 16 '15 at 13:55

2 Answers

This is fine. You aren't allowed to modify the string via a pointer returned from a const member function but that doesn't mean you aren't allowed to modify the string at all. You have a non-const string, so you can modify it. In principle it's no different to doing s[0] = 'a'; s[1] = 'b'; etc.

It is forbidden to overwrite the final '\0' that is stored after the string data, but I assume from your example that MAX_LEN includes the space for the null terminator that is written by the C function, so the C function will only overwrite the string contents and not the extra character stored afterwards.
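
A minimal sketch of that point, under assumptions: `fake_C_function` is a hypothetical stand-in for the library call (not the real API), taken to write at most `*len` bytes including the terminator and to report the resulting string length back through `len`, and the value of `MAX_LEN` is made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Assumed for this sketch only: the real MAX_LEN comes from the library.
const std::size_t MAX_LEN = 16;

// Hypothetical stand-in for the C function. It writes a zero-terminated
// string of at most *len bytes (terminator included) into out and stores
// the number of characters written, excluding the terminator, in *len.
void fake_C_function(int x, char* out, std::size_t* len) {
    if (*len == 0) return;               // no room for anything
    const char* text = (x == 0) ? "zero" : "nonzero";
    std::size_t n = std::strlen(text);
    if (n + 1 > *len) n = *len - 1;      // leave room for the terminator
    std::memcpy(out, text, n);
    out[n] = '\0';
    *len = n;
}

std::string call_C_function(int x) {
    std::string buf(MAX_LEN, '\0');      // size() == MAX_LEN writable chars
    std::size_t len = buf.size();
    // The callee only writes into [0, size()); the string's own terminator
    // at index size() is never touched.
    fake_C_function(x, &buf.front(), &len);
    buf.resize(len);                     // drop the unused tail
    return buf;
}
```

With this stand-in, `call_C_function(0)` yields a string of length 4, and the terminator the callee wrote ends up inside the part that `resize` discards or keeps as ordinary content, never past `size()`.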

There is an open issue (LWG 2391) proposing to add a non-const std::string::data() which would be equivalent to &s.front(), and would give you direct access without making you concerned about going against the intention of the library.

Jonathan Wakely

According to my reading of C++11:

const charT& front() const;

charT& front();

Requires: !empty()

Effects: Equivalent to operator[](0).

Let's look at operator[]:

const_reference operator[](size_type pos) const;

reference operator[](size_type pos);

1 Requires: pos <= size().

2 Returns: *(begin() + pos) if pos < size(), otherwise a reference to an object of type T with value charT(); the referenced value shall not be modified.

Since you preallocate your buffer, pos will be less than size(). begin() returns an iterator, so all the usual iterator traits apply.

So, according to my interpretation of C++11, this approach should be legal.
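
The argument also relies on the characters being stored contiguously, which C++11 requires: `&*(s.begin() + n) == &*s.begin() + n` for all `0 <= n < s.size()`. A rough sketch of both halves, where `fill_through_pointer` and `is_contiguous` are hypothetical helpers written for this illustration:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: fills a string through a raw pointer to its first
// element, the way a C API would.
std::string fill_through_pointer(const char* text) {
    std::size_t n = std::strlen(text);
    if (n == 0) return std::string();    // front() requires !empty()
    std::string s(n, '\0');
    char* p = &s.front();                // same as &s[0]
    std::memcpy(p, text, n);             // writes s[0] .. s[n-1] only
    return s;
}

// Checks that the address of every element matches pointer arithmetic
// starting from the first element.
bool is_contiguous(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i)
        if (&s[i] != &s[0] + i) return false;
    return true;
}
```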

Sam Varshavchik
  • That's basically my understanding as well, but I am wondering if there is something I overlooked because this seems to be a loophole to do what `data()` explicitly forbids. – Jens Dec 16 '15 at 13:52
  • But `data()` is a const function. It makes sense that you aren't allowed to modify the characters by getting a pointer from a const function (what if the entire string is const?). You are not calling a const function, and you are not trying to modify a const string. There is no problem. The standard does not forbid modification of strings! – Jonathan Wakely Dec 16 '15 at 13:54
  • @JonathanWakely Ok, so the part only states that I am not allowed to modify data through the const pointer explicitly. I took this for granted and read it as a more general constraint to not modify the character array bypassing the member functions for this. – Jens Dec 16 '15 at 13:56
  • No, it's a constraint _on those functions_ not on strings in general. – Jonathan Wakely Dec 16 '15 at 13:58
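
The distinction drawn in these comments can be checked at compile time. This is only a sketch; `mutable_front_is_writable` is a hypothetical helper added for the illustration:

```cpp
#include <string>
#include <type_traits>
#include <utility>

// front() on a non-const string yields a writable reference...
static_assert(
    std::is_same<decltype(std::declval<std::string&>().front()), char&>::value,
    "non-const front() returns char&");

// ...while a const string only hands out read-only access.
static_assert(
    std::is_same<decltype(std::declval<const std::string&>().front()),
                 const char&>::value,
    "const front() returns const char&");
static_assert(
    std::is_same<decltype(std::declval<const std::string&>().c_str()),
                 const char*>::value,
    "c_str() returns const char*");

// Hypothetical helper: writes through the reference returned by front().
bool mutable_front_is_writable() {
    std::string s("abc");
    s.front() = 'x';                     // allowed: s is not const
    return s == "xbc";
}
```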