
I want to convert a char buffer of known size (e.g. received from socket) into a string, but with the caveat that the char array is not necessarily null-terminated.

So I tried to use the string(InputIterator first, InputIterator last) constructor. However, I noticed that string::length() isn't always the same as strlen, at least in my case, where strings are built from buffers with many trailing zeros.

#include <string>
#include <iostream>

#include <string.h>

using namespace std;

int main()
{
    char a[20] {0};

    a[0] = 'a';
    string b(a, a + 20);
    cout << b.length() << endl;
    cout << strlen(b.c_str()) << endl;
}

Output is

20
1

Though this is a well-defined behavior for string::length (thanks to the comments and some initial answers for helping me to realize that), I'd like to find a better / more idiomatic solution.

qweruiop
  • I believe you are flirting with disaster if the array is not null terminated. – AndyG May 11 '17 at 20:06
  • You told it to start from `a+0` all the way to `a+20`, so it did that, nulls and all. – Mooing Duck May 11 '17 at 20:07
  • If the char array is not null-terminated, and you don't already know the length, that's a bug. There's nothing you can do, with or without `std::string`, to recover from that. You need one or the other. – Mooing Duck May 11 '17 at 20:08
  • @AndyG well I just want to find the cleanest way to convert a (say) 32-byte char array to a string. Of course I can manually add a `\0` to terminate it. But string constructor is cleaner, no? What am I doing wrong? – qweruiop May 11 '17 at 20:10
  • @MooingDuck To be precise, I know the length. That's why I can say `a+20`. – qweruiop May 11 '17 at 20:12
  • @qweruiop: No, that's not the length. The length is 1. – Lightness Races in Orbit May 11 '17 at 20:21
  • @BoundaryImposition well, define the length. `string::length` is a length but it's 20. – qweruiop May 11 '17 at 20:23
  • @qweruiop: The entire premise of this question is that you are personally defining the [desired] length as 1, and are displeased when the program tells you 20 instead. Or did I misunderstand something? – Lightness Races in Orbit May 11 '17 at 20:25
  • @BoundaryImposition I don't think I define the length anywhere, did I? To the contrary, I'm confused by what length means. That's why I asked. – qweruiop May 11 '17 at 20:27
  • Well then you are going to have to explain what you mean by _"transform a char[] into a string"_. If you don't want this example to result in a string of length 1, and you don't want this example to result in a string of length 20, then I have no idea what you're asking. – Lightness Races in Orbit May 11 '17 at 20:28
  • @Ðаn One of them is mine. I thought I understood the question. Indeed the OP calls my solution "perfect". Yet these comments are now casting doubt on all that. – Lightness Races in Orbit May 11 '17 at 20:33

2 Answers


The difference here is that a std::string does not treat null characters specially: it can hold them just fine, and each one counts toward its length like any other character.

However, for a C-style string, strlen will stop counting when it encounters the first null character, which for you is the second character, hence a length of 1.

Quentin
AndyG
  • @qweruiop: Yes. There's no requirement that `string::length` behave like `strlen` on the underlying array (in fact, it would probably be a bug if it did). – AndyG May 11 '17 at 20:10

You told it you wanted 20 bytes, so that's exactly what you got.

It sounds like you want to copy up to 20 bytes or until a null byte is encountered:

#include <string>
#include <iostream>
#include <cstring>

int main()
{
    char a[20]{};

    a[0] = 'a';
    std::string b(a, a + strnlen(a, sizeof(a)));
    std::cout << b.length() << '\n';
    std::cout << strlen(b.c_str()) << '\n';
}

You'll notice I've gone back to having the computer detect the number of input bytes for us, but with strnlen you can tell it to stop at 20, taking care of the problem that the array may not be null-terminated.

I've also changed {0} to {}, for style/sanity reasons.

(live demo)
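One portability note: strnlen comes from POSIX rather than from the C++ standard, so it may not be available on every toolchain. A possible alternative, sketched below using only the standard library, searches for the first null byte with std::find. The helper name `string_from_buffer` is an assumption made for illustration, not an existing API.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the standard library): copies bytes from
// buf up to, but not including, the first null byte. If none of the n bytes
// is null, all n bytes are copied, so the buffer need not be terminated.
std::string string_from_buffer(const char* buf, std::size_t n)
{
    const char* end = std::find(buf, buf + n, '\0');
    return std::string(buf, end);
}
```

For the array above, `string_from_buffer(a, sizeof(a))` should give a string of length 1, and a completely full, unterminated buffer gives a string of length n.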

Community
Lightness Races in Orbit
  • @qweruiop: Don't know why you were surprised at the outcome then :P – Lightness Races in Orbit May 11 '17 at 20:15
  • I was surprised because `string::length` has a different semantic than `strlen`. That's somewhat counter-intuitive to me but it also makes sense after I know it. – qweruiop May 11 '17 at 20:21
  • C and C++ are different things, so it stands to reason that their string types work differently. In fact, C++ _having_ a string type and logical semantics that make sense is one of the main points of C++ over C. C's vulnerability to null nonsense is a limitation with which we are no longer encumbered. – Lightness Races in Orbit May 11 '17 at 20:22
  • I think I was just hoping to find a cleaner constructor. – qweruiop May 11 '17 at 20:25
  • Is there a problem with this one? – Lightness Races in Orbit May 11 '17 at 20:26
  • (other than my haphazard application of `std::` :$) – Lightness Races in Orbit May 11 '17 at 20:26
  • No, this is a perfect solution. It would be nicer if the (non-existent) constructor could call `strnlen` and do all you wrote here in one line, no? – qweruiop May 11 '17 at 20:29
  • @qweruiop: I can see why you think that, but actually no. You're coming at this from the specific point of view of "I want C-like null-terminating behaviour", which is fine, but that is not the general case. In general, we do not concern ourselves with null-termination any more, in a string constructor or otherwise. A `std::string` is a sequence of bytes, nulls or otherwise. So it doesn't really make sense for that type to treat nulls specially by default. – Lightness Races in Orbit May 11 '17 at 20:32
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/144009/discussion-between-qweruiop-and-boundaryimposition). – qweruiop May 11 '17 at 20:32
  • Basically, forcing `std::string` to have your C mindset is bad, and encouraging you to lose your C mindset is good. :) And with that I'm off. – Lightness Races in Orbit May 11 '17 at 20:34