1

I'm trying to convert a std::string to a char* (copying rather than casting) due to having to pass some data to a rather dated API.

On the face of it, there are a number of ways to do this, but it was suggested that I do this as a vector which seemed sensible. However, when I tried this the result was garbled. The code is like:

const string rawStr("My dog has no nose.");
vector<char> str(rawStr.begin(), rawStr.end());
cout << "\"" << (char*)(&str) << "\"" << endl;

(Note the unpleasant C cast - using static_cast does not work which is probably telling me something)

When I run this I get:

"P/"

Clearly not right. I took a look at the vector in gdb

(gdb) print str
$1 = std::vector of length 19, capacity 19 = {77 'M', 121 'y', 32 ' ', 100 'd', 111 'o', 
  103 'g', 32 ' ', 104 'h', 97 'a', 115 's', 32 ' ', 110 'n', 111 'o', 32 ' ', 110 'n', 
  111 'o', 115 's', 101 'e', 46 '.'}

Which looks correct although there's no null terminator at the end, which is concerning. The size of the vector (sizeof(str)) is 24 which suggests the characters are being stored as 8-bits.

Where am I going wrong?

Component 10
  • If you use C++11, passing `&rawStr[0]` is fine as long as it only writes up to before the null. – chris May 30 '13 at 07:56
  • The string does not contain the null termination as an element within the range covered by `begin()` to `end()`. Second, the address of the vector is not the address of the first element of the vector's data. – juanchopanza May 30 '13 at 07:57
  • @chris: `[citation needed]` (the wording on this matter in the standard is really subtle and quirky, but as far as I know, that is not guaranteed to be null-terminated) – jalf May 30 '13 at 07:59
  • @jalf, I believe it was something to do with `data()` being the same as `c_str()` in C++11. If you're interested, there's already a [question](http://stackoverflow.com/questions/6077189/will-stdstring-always-be-null-terminated-in-c11) on the subject. – chris May 30 '13 at 08:02
  • @chris sure, they're the same, but neither of them is the same as `&rawStr[0]`. (And the answer to that question only states that the internal string will be null-terminated once you've called `c_str()` or `data()`. There is no guarantee that it will *always* be null-terminated.) – jalf May 30 '13 at 09:18
  • @jalf, Is it really a case of being *that* underhanded? It's probably a good thing to get cleared up then. If [this proposal](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2534.html) was adopted (which I presume it was), wouldn't that require always being null-terminated since adding one on could invalidate iterators? Or would it still be vague enough that you could say they left space for something at the end and it's not necessarily a null until one of those is called? – chris May 30 '13 at 09:23
  • @chris: Yeah, it was pretty underhanded. However, that proposal seems to indicate that the array must already be null-terminated (since `data()` isn't allowed to modify "any of the values stored in the character array"). You may be right then. – jalf May 30 '13 at 11:10

4 Answers

9

The instance of std::vector is not itself an array of characters - it points to an array. Rather than (char*)(&str) try &str[0].

Judging from your gdb output you'll also want to push a zero onto the end of the vector before passing it to your legacy API.

RobH
7

First, the std::string does not contain the null termination as an element within the range covered by [begin(), end()). Second, the address of the vector is not the address of the first element of the vector's data. For this you need &str[0] or str.data():

#include <vector>
#include <string>
#include <iostream>

int main()
{
  const std::string rawStr("My dog has no nose.");
  std::vector<char> str(rawStr.begin(), rawStr.end());
  str.push_back('\0');
  std::cout << "\"" << &str[0] << "\"" << std::endl;
  std::cout << "\"" << str.data() << "\"" << std::endl; // C++11
}
Marius Bancila
juanchopanza
2

Two things you need to do:

1) take the address of the first character in the vector using &str[0]. This is absolutely fine (if a little contrived) since the standard guarantees that the vector's memory is contiguous. You can't simply write &str, as that is the address of the vector object, which is not necessarily the address of the first data element.

2) append a null terminator to the end of your vector if you want to treat the characters as a string with the standard C-style functions. The range from rawStr.begin() to rawStr.end() covers only the 19 characters of "My dog has no nose." and not the string's terminator, so a vector built from that range has none.

Bathsheba
0

The &str gets you a pointer to the vector object, not to the contained string of characters.

If you wish to print it as a C string, you'll need to push a 0 onto the end and then output &str[0] (which gives you the address of the beginning of the contained array).

This is very ugly, though. You are much better off either creating your own string vector class which inherits std::vector or using a function crafted to iterate through a vector, printing each element literally.

Edit:

If you are privy to C++11, for_each with a lambda could be used here in a clean way:

std::for_each(str.begin(), str.end(), [](char i) -> void {std::cout << i;});
Taywee