
I've been trying to understand std::strlen(), but without success:

AFAIK strlen() returns the length of a null-terminated byte string, i.e. the number of characters (bytes) before the terminating null character. If the array it points to is not null-terminated, the behavior is undefined. Apart from that, it is OK.

So: std::strlen(""); is 0.

But on www.cppreference.com I found this possible implementation:

// This is from: https://en.cppreference.com/w/cpp/string/byte/strlen

std::size_t strlen(const char* start) {
    const char* end = start;
    while (*++end != 0); // I think this causes UB
    return end - start;
}

But if I run it:

int main()
{
    const char cp1[] = "";
    const char cp2[] = "\0";
    const char cp3[] = "\0Hello";
    const char cp4[] = "H\0ello";
    const char cp5[1] = {};// UB?
    const char cp6[] = {'\0'};
    const char cp7[] = {'H', '\0'};

    cout << std::strlen(cp1) << " " << sizeof(cp1) << endl;// 0 1 OK
    cout << strlen(cp1) << " " << sizeof(cp1) << endl;// 1 1  is UB?

    cout << "\nDone!\n";
}

So what I see is that the version shown on the website triggers undefined behavior. The loop condition combines the pre-increment and dereference operators. Both are prefix unary operators with the same precedence and right-to-left associativity, so *++end is parsed as *(++end): the pointer is incremented first and then dereferenced. For an empty string, the pointer is therefore moved one past the terminating null character (the array's only element) and then dereferenced, which is UB as far as I know.
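To make the evaluation order concrete, here is a minimal sketch on a buffer that is long enough for both forms to be well-defined. The helper names read_pre_increment, read_post_increment and check_increment_order are hypothetical and only serve the illustration.

```cpp
#include <cassert>

// Hypothetical helper: *++p is parsed as *(++p).
// The pointer is advanced first, then the NEW position is read,
// so p must have at least one valid element after the one it points to.
char read_pre_increment(const char* p) {
    return *++p;
}

// Hypothetical helper: *p++ is parsed as *(p++).
// The CURRENT element is read; the increment only affects p afterwards.
char read_post_increment(const char* p) {
    return *p++;
}

void check_increment_order() {
    const char buf[] = "ab";  // three elements: 'a', 'b', '\0'
    assert(read_pre_increment(buf) == 'b');   // skipped 'a'
    assert(read_post_increment(buf) == 'a');  // read 'a' itself
}
```

With a one-element array such as "", the pre-increment form would read the element past the terminating null, which is exactly the case in question.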

Boann
Maestro

2 Answers


You are correct that the possible implementation has undefined behavior. *++end increments and then dereferences, which is UB on an empty string because you dereference the one-past-the-end element.

The possible implementation has since been changed to

std::size_t strlen(const char* start) {
   const char* end = start;
   while(*end++ != 0);
   return end - start - 1;
}

which is a correct implementation.
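As a quick sanity check, the sketch below compares the corrected loop against std::strlen on some of the arrays from the question. The names my_strlen and check_against_std_strlen are hypothetical; the function is renamed only so that it cannot clash with the library's strlen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same logic as the corrected possible implementation, under a
// hypothetical name to avoid colliding with the library function.
std::size_t my_strlen(const char* start) {
    const char* end = start;
    while (*end++ != 0) {}  // read the current char, then advance
    // end now points one past the terminating null, hence the -1
    return static_cast<std::size_t>(end - start - 1);
}

void check_against_std_strlen() {
    const char cp1[] = "";
    const char cp3[] = "\0Hello";
    const char cp4[] = "H\0ello";
    const char cp7[] = {'H', '\0'};

    assert(my_strlen(cp1) == std::strlen(cp1));
    assert(my_strlen(cp3) == std::strlen(cp3));
    assert(my_strlen(cp4) == std::strlen(cp4));
    assert(my_strlen(cp7) == std::strlen(cp7));

    assert(my_strlen(cp1) == 0);  // empty string: only the terminator
    assert(my_strlen(cp4) == 1);  // counting stops at the first '\0'
}
```

Calling check_against_std_strlen() from main should pass silently if the two functions agree on these inputs.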

NathanOliver

It should use post-increment instead, like this:

std::size_t strlen(const char* const start) {
    const char* end = start;
    while (*end++ != 0);   // post-increment: dereference first, then advance (avoids the UB)
    return --end - start;  // step back from one past the terminator
}