51

If the array were null-terminated, this would be pretty straightforward:

unsigned char u_array[4] = { 'a', 's', 'd', '\0' };
std::string str = reinterpret_cast<char*>(u_array);
std::cout << "-> " << str << std::endl;

However, I wonder what the most appropriate way is to copy a non-null-terminated unsigned char array, like the following:

unsigned char u_array[4] = { 'a', 's', 'd', 'f' };

into a std::string.

Is there any way to do it without iterating over the unsigned char array?

Thank you all.

karlphillip

12 Answers

66

std::string has a constructor that takes a pair of iterators, and unsigned char can be converted (in an implementation-defined manner) to char, so this works. There is no need for a reinterpret_cast.

unsigned char u_array[4] = { 'a', 's', 'd', 'f' };

#include <string>
#include <iostream>
#include <ostream>

int main()
{
    std::string str( u_array, u_array + sizeof u_array / sizeof u_array[0] );
    std::cout << str << std::endl;
    return 0;
}

Of course an "array size" template function is more robust than the sizeof calculation.
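
As a rough illustration of what such an "array size" template might look like, here is a minimal sketch. The names `array_size` and `make_string` are made up for this example and do not come from the answer. The point of the template is that the element count is deduced from the array type, so passing a plain pointer by mistake fails to compile instead of silently producing the wrong length.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: deduces the element count N from the array type.
// It only accepts real arrays; a pointer argument is a compile error.
template <typename T, std::size_t N>
std::size_t array_size(T (&)[N])
{
    return N;
}

// Hypothetical helper: copies every element of an unsigned char array into
// a std::string using the iterator-range constructor (no cast required).
template <std::size_t N>
std::string make_string(const unsigned char (&arr)[N])
{
    return std::string(arr, arr + N);
}
```

With these in place, `make_string(u_array)` would build the string from all four elements of the array in the question, and `array_size(u_array)` could replace the `sizeof` division.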

CB Bailey
  • Converting `unsigned char *` into `char *` here, you have to do `reinterpret_cast`. –  Jan 14 '11 at 13:55
  • 1
    @VladLazarenko: But I don't want to do that conversion. – CB Bailey Jan 14 '11 at 13:56
  • @Charles: Then your code won't compile unless you change default type for char to unsigned in compiler's settings ;) –  Jan 14 '11 at 13:57
  • @Vlad Lazarenko: It does compile :) – cpx Jan 14 '11 at 13:58
  • @Vlad Lazarenko: `unsigned char*` satisfies the requirements for an input iterator. My code compiles and works just fine. – CB Bailey Jan 14 '11 at 13:58
  • 2
    @Charles, stop using buggy compilers. Constructor signature you need to call is `std::string (const char *, size_t)`, as `unsigned` is not converted to `signed` implicitly, passing `unsigned char *` will introduce ambiguity. Check with proper compiler, or see, for example, - http://stackoverflow.com/questions/804123/const-unsigned-char-to-stdstring –  Jan 14 '11 at 14:05
  • @VladLazarenko: I don't _need_ to call that constructor, I'm quite happy with this constructor: `template<class InputIterator> basic_string(InputIterator begin, InputIterator end, const Allocator& a = Allocator());` – CB Bailey Jan 14 '11 at 14:06
  • @Charles: Oh, I must be blind. Of course, you use two pointers here. Didn't notice `u_array + ` in second argument. My bad. +1 for your answer then. –  Jan 14 '11 at 14:12
  • 6
    FYI, the division by `sizeof u_char[0]` is completely redundant. This size is *guaranteed* by the standard to be equal to the size of `char`, which is 1 by definition. – Konrad Rudolph Jan 14 '11 at 16:18
  • @KonradRudolph. I was in two minds about taking it out and leaving it in. Some people are making the valid argument elsewhere that it's robust against a change in type of `u_char` but it's marginal either way IMHO. – CB Bailey Jan 14 '11 at 16:22
  • 6
    @Konrad: I believe Charles chose to show the general code, so as not to mislead readers into just doing a `sizeof` for e.g. `wchar_t`. – Cheers and hth. - Alf Jan 14 '11 at 16:22
  • 7
    Or you can simply replace the second parameter with `std::end(u_array)` (C++0x) – Blastfurnace Jan 14 '11 at 16:33
27

Well, apparently std::string has a constructor that could be used in this case:

std::string str(reinterpret_cast<char*>(u_array), 4);
karlphillip
  • 3
    More of an ideological thought, but it would be nicer not to cast away the constness of the array. Plus, take its size instead of hard-coding the error-prone `4`. –  Jan 14 '11 at 13:52
7

When constructing a string without specifying its size, the constructor iterates over the character array looking for the null terminator, which is the '\0' character. If the array doesn't have that character, you have to specify the length explicitly, for example:

// --*-- C++ --*--

#include <string>
#include <iostream>


int
main ()
{
    unsigned char u_array[4] = { 'a', 's', 'd', 'f' };
    std::string str (reinterpret_cast<const char *> (u_array),
                     sizeof (u_array) / sizeof (u_array[0]));
    std::cout << "-> " << str << std::endl;
}
4

This should do it:

std::string s(u_array, u_array+sizeof(u_array)/sizeof(u_array[0]));
cpx
  • u_array is of type unsigned char, and `std::string`'s constructor take `const char *`, so this won't even compile. –  Jan 14 '11 at 13:48
  • @Vlad Lazarenko: No, as i checked it should be just fine. – cpx Jan 14 '11 at 13:57
  • @Dave, default type for char is signed, not unsigned, and it cannot implicitly convert one into another. Your compiler either treating `char` as `unsigned` or it is buggy. In any case, generic solution should not rely on these specifics and use explicit conversion. You can check this with Comeau online or something, it doesn't work. –  Jan 14 '11 at 14:02
  • `/4` ? The array is an array of `unsigned char`. Why is it relevant that an `int` is 4 bytes (even if this assumption happens to be correct)? – CB Bailey Jan 14 '11 at 14:02
  • @VladLazarenko: An `unsigned char` _can_ be converted to a `char`, they are both integral types (e.g. `char x = (unsigned char)10;`) The result is implementation-defined if the value of the `unsigned char` is not expressible in a `char` but it is a valid conversion. – CB Bailey Jan 14 '11 at 14:04
  • @Charles: of course they can be converted. But they cannot be converted implicitly. Your compiler must have `char` as `unsigned` by default, that makes sense, but is non standard, I guess. Or it must be very smart to look into compile-time constant array and decide that it can be converted into array with signed values. –  Jan 14 '11 at 14:07
  • 1
    @VladLazarenko: Any integer type can be converted to any other integer type: 4.7 [conv.integral] . This includes `unsigned char` and `char`. – CB Bailey Jan 14 '11 at 14:11
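
The conversion argued over in these comments can be checked with a small sketch. `bytes_to_string` is a made-up helper name used only for illustration. It hands a pair of `unsigned char` pointers to the iterator-range constructor, which converts each element to `char` one at a time, so no pointer cast is involved. For values above `CHAR_MAX` the resulting `char` value is implementation-defined before C++20, although common two's-complement platforms preserve the bit pattern.

```cpp
#include <string>

// Hypothetical helper (not from the thread): builds a std::string from a
// range of unsigned char without any reinterpret_cast.
std::string bytes_to_string(const unsigned char* first, const unsigned char* last)
{
    // Selects the template constructor taking two input iterators;
    // each unsigned char is converted to char individually.
    return std::string(first, last);
}
```

Calling `bytes_to_string(u_array, u_array + sizeof u_array)` on the array from the question should give a four-character string.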
3

std::string has a member function named assign. You can pass it a char * and a size (an unsigned char array needs a cast to const char * first).

http://www.cplusplus.com/reference/string/string/assign/
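
A minimal sketch of how `assign` could be used here, assuming the caller already has a string to overwrite. The wrapper name `assign_bytes` is invented for this example. Because the pointer-and-length overload of `assign` expects a `const char*`, the `unsigned char` pointer is cast explicitly.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces the contents of `out` with exactly `len`
// bytes read from `data`; no null terminator is needed in the source.
void assign_bytes(std::string& out, const unsigned char* data, std::size_t len)
{
    // assign(const char*, size_type) copies len characters starting at the pointer.
    out.assign(reinterpret_cast<const char*>(data), len);
}
```

For the array in the question, `assign_bytes(str, u_array, sizeof u_array)` would leave `str` holding the four characters, discarding whatever it held before.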

foobar1234
  • 2
    It also has a constructor that takes a char pointer and a size. In cases where you don't have a string instance yet, it makes sense to use the constructor. –  Jan 14 '11 at 13:53
  • The problem with this situation is that you then don't know how many bytes your string takes up, or whether calling .c_str() will give you a valid C string. –  Nov 19 '12 at 02:13
3

You can use this std::string constructor:

string ( const char * s, size_t n );

so in your example:

std::string str(reinterpret_cast<const char*>(u_array), 4);
Benoit Thiery
  • 1
    You can make it better by doing `sizeof (u_array)`. Or even better - `sizeof (u_array) / sizeof(u_array[0])`, which will work for data types whose size is greater than 1 byte. –  Jan 14 '11 at 13:50
1

You can create a character pointer pointing to the first character, and another pointing to one-past-the-last, and construct using those two pointers as iterators. Thus:

std::string str(&u_array[0], &u_array[0] + 4);
Raedwald
  • 1
    This is error prone as size of array can change, and you might easily forget to replace your `4` with a new value. Plus, there is no point in doing `&u_array[0]`, it is equivalent of just `u_array`, which is much less typing. –  Jan 14 '11 at 13:49
1

Although the question was how to "copy a non null-terminated unsigned char array [...] into a std::string", I note that in the given example that string is only used as an input to std::cout.

In that case, of course you can avoid the string altogether and just do

std::cout.write(reinterpret_cast<const char*>(u_array), sizeof u_array);
std::cout << std::endl;

which I think may solve the problem the OP was trying to solve.

1

There is still a problem when the string itself contains a null character and you subsequently try to print the string:

char c_array[4] = { 'a', 's', 'd', 0 };

std::string toto(c_array, 4);
std::cout << toto << std::endl;  // outputs 3 chars and a null char

However....

std::cout << toto.c_str() << std::endl;  // will only print 3 chars

It's times like these when you just want to ditch cuteness and use bare C.
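
The difference described above can be made concrete by comparing the length the string stores with the length C-style code would see. This is only a sketch, and `c_length` is a made-up helper name. `size()` counts every stored character, embedded nulls included, while anything that walks the `c_str()` pointer stops at the first `'\0'`.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: the length seen by C-style code, which stops at the
// first '\0' even when the std::string holds more characters after it.
std::size_t c_length(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

For a string built from `{ 'a', 's', 'd', 0 }` with an explicit length of 4, `size()` reports 4 while `c_length` reports 3.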

plgDavid
0

Try:

std::string str;
str.resize(4);
std::copy(u_array, u_array+4, str.begin());
tibur
0

std::string has a constructor taking an array of char and a length.

unsigned char u_array[4] = { 'a', 's', 'd', 'f' };
std::string str(reinterpret_cast<char*>(u_array), sizeof(u_array));
johannes
0

Ew, why the cast?

 std::string str(u_array, u_array + sizeof(u_array));

Done.

Lightness Races in Orbit