0

I have a function that manipulates a string, and I need it to work on both C-style strings and C++ std::string objects:

// C-style overload
void TransformString(const char *in_c_string, char *out_string);
// C++ std::string overload
std::string TransformString(const std::string &in_string);

To avoid redundant code, I can implement the actual algorithm in only one of them and have the other call it. If I put the implementation in the C++ overload, the C-style function will look like this:

#include <cstring> // strcpy
#include <string>

void TransformString(const char *in_c_string, char *out_c_string) {
   std::string in_string(in_c_string);
   std::string out_string = TransformString(in_string); // call C++ std::string overload
   strcpy(out_c_string, out_string.c_str()); // unwanted memory copy
}

My question is: can I do this (have the algorithm implemented in only one function) without the extra copy from the std::string internal buffer to the C-style string? My first thought was to "steal" the buffer, like a string move constructor does, but from searching the web it looks like there is no safe way to do this, as it is implementation-specific. If I write the algorithm in the C-style function instead, the problem is the same: in the C++ function I have to allocate space for the char* string and then copy it into the std::string object.
I must mention that I do not know the size of the resulting string before the transformation is completed.

Thank you.

EDIT

The size of the buffer is not a problem here (I know the maximum size and the function receives an allocated buffer). I cannot just return std::string.c_str(), because the buffer is invalidated when the std::string object is destroyed (right after the return). I have changed the name of the variable out_c_string. (thanks 0x499602D2)

bolov
  • Since the output size is not known beforehand, the caller cannot allocate memory for the result. Then I don't see how your C-style overload can work. You'd need `char **out_string` as the second parameter (or just return `char *`) – Praetorian Nov 20 '13 at 21:41
  • You have two variables named `out_string` in your example. – David G Nov 20 '13 at 21:41
  • could you just write the code for `std::string` then call that method from the c-style string function and use `std::string.c_str()` on the return value? – clcto Nov 20 '13 at 21:45
  • I don't know the actual size, but I do know the maximum size and the function receives an allocated buffer. – bolov Nov 20 '13 at 21:46
  • I cannot just return std::string.c_str() because the buffer gets invalidated when the std::string object is destroyed – bolov Nov 20 '13 at 21:47
  • I think you need to work out what the C interface will be first. As Praetorian mentions, with the API as it stands in the question, how is the caller of the C function going to know how big the output buffer needs to be? Is the implementation of the function just going to assume the buffer is big enough? (In general, that's a bad idea.) You might want the caller to be able to pass in a buffer size and have the function return whether or not it was big enough (and ideally, how big it needs to be). Or do what Praetorian suggests and have the function allocate the output buffer. – Michael Burr Nov 20 '13 at 21:54
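
The size-aware interface suggested in the last comment could look roughly like the sketch below. Everything in it is an assumption for illustration: the name `TransformStringBounded`, the `snprintf`-like return convention (bytes required, terminator included), and the placeholder transformation (uppercasing), none of which come from the question.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical size-aware variant of the C-style overload (not from the question).
// Writes at most out_size bytes into out, always null-terminated, and returns
// the number of bytes the full result needs, terminator included.
// The transformation itself (uppercasing) is only a placeholder.
std::size_t TransformStringBounded(const char *in, char *out, std::size_t out_size) {
    assert(in != nullptr);
    const std::size_t needed = std::strlen(in) + 1;
    if (out == nullptr || out_size == 0)
        return needed; // size query only, nothing is written

    // Number of characters to write, leaving room for the terminator.
    const std::size_t n = (needed <= out_size ? needed : out_size) - 1;
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(in[i])));
    out[n] = '\0';
    return needed;
}
```

With this shape, the caller detects truncation by comparing the return value with the buffer size it passed in, and can call once with a null buffer to learn the required size.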

2 Answers

2

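As long as you know how big the output buffer needs to be, you can create a std::string and resize it to that size. You can then pass a pointer to the string's internal buffer (&out[0], which is contiguous storage as of C++11) into the C-style overload and trim the string to the actual length afterwards.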

#include <cstring>
#include <iostream>
#include <string>

void TransformString(const char *in_c_string, char *out_c_string) {
    size_t length = strlen(in_c_string);

    for (size_t i = 0; i < length; ++i)
        out_c_string[i] = '*';

    out_c_string[length] = 'a';
    out_c_string[length+1] = 'b';
    out_c_string[length+2] = 'c';
    out_c_string[length+3] = '\0';
}

std::string TransformString(const std::string &in_string) {
    std::string out;
    out.resize(100);

    TransformString(in_string.c_str(), &out[0]);
    out.resize(strlen(&out[0]));

    // In C++11 a returned local is treated as an rvalue, so 'out'
    // is moved here if the copy is not elided (NRVO)
    return out;
}

int main() {
    std::string string_out = TransformString("hello world");

    char charstar_out[100];
    TransformString("hello world", charstar_out);

    std::cout << string_out << "\n";
    std::cout << charstar_out << "\n";

    return 0;
}

Here is a live example: http://ideone.com/xwVWCh.

user2093113
  • This has the same issues as the c_str proposal: it's UB to modify the internal data of a std::string. (The fact that it "seems to work" is irrelevant.) – rici Nov 20 '13 at 21:55
  • @rici I don't think this is UB. See 21.4.1.5 and 21.4.5 from the Holy Text. – user2093113 Nov 20 '13 at 22:07
  • I'm not convinced. 21.4.5 allows you to modify the character array *through* the reference returned, but it still permits the std::string implementation to use a reference type which records the fact that a modification occurred. 21.4.7.1 forbids modification of the character array using the pointer returned by `data()`. If you could modify the character array through a pointer, what is the point of the prohibition in 21.4.7.1? – rici Nov 20 '13 at 22:13
  • @rici Even if the string used a proxy reference type it should still honour the edits so I don't see the issue. As for the lack of a non-`const` `data()` function, it appears that was an oversight: http://stackoverflow.com/questions/7518732/why-are-stdvectordata-and-stdstringdata-different – user2093113 Nov 20 '13 at 22:39
  • @rici: 21.4.5 doesn't preclude modification through `operator[]()` except in the case where `size()` is used as the index (ie., the terminating null character). See http://stackoverflow.com/questions/7766087/is-it-legal-to-modify-the-result-of-stdstringop Be sure to read both answers. – Michael Burr Nov 20 '13 at 22:48
  • @user2093113: OK, having traced through all the typedef's this time, I've convinced myself that C++11 doesn't allow proxy reference types from operator[]. So you're right, 21.4.5 does allow your use case, and the restriction on data() is inconsistent. (Odd that there is no DR, though.) – rici Nov 21 '13 at 02:30
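
The `data()` restriction discussed in these comments was lifted in C++17, which added a non-const `std::string::data()` overload. A minimal sketch of the same pattern using it, assuming a C++17 compiler and a known maximum output size; the names `FillStars`, `FillStarsString`, and `kMaxOut` are hypothetical stand-ins, not part of the answer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical upper bound on the output size, terminator included.
constexpr std::size_t kMaxOut = 100;

// Stand-in for the C-style implementation: one '*' per input character,
// followed by "abc". Assumes the result fits in kMaxOut bytes.
void FillStars(const char *in, char *out) {
    const std::size_t length = std::strlen(in);
    assert(length + 4 <= kMaxOut);
    std::memset(out, '*', length);
    std::memcpy(out + length, "abc", 4); // 4 bytes: "abc" plus its terminator
}

std::string FillStarsString(const std::string &in) {
    std::string out(kMaxOut, '\0');       // writable buffer of the maximum size
    FillStars(in.c_str(), out.data());    // non-const data() requires C++17
    out.resize(std::strlen(out.data()));  // trim to the length actually written
    return out;
}
```

The only write into the string happens inside its own `[0, size())` range, so no separate buffer and no final copy are needed.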
-2

You could try to get the C-style string from the string class using c_str(). You'll have to use const_cast<char*> to remove the const.

This will only work if you don't need to reallocate the string (keep the same size).

Sorin
  • That's UB, and it will have potentially disastrous effects on implementations of the standard library which do copy-on-write (although there are fewer of those than there used to be). – rici Nov 20 '13 at 21:45
  • No, because the buffer gets invalidated when the std::string object is destroyed. – bolov Nov 20 '13 at 21:48