-3

I've done some research on strlen() and have a question.

Let's say I have an array of 50 elements and a pointer to the first element, meaning:

char A[50],*x;
gets(A);
x=&A[0];

From what I've understood, strlen(x) was supposed to give me the length of the string.

My question is, what exactly happens as I increment x?

randomuser
  • [`strlen()`](http://en.cppreference.com/w/c/string/byte/strlen) counts the bytes that differ from `0`, starting from the address you pass it as an argument. – axiac Mar 24 '18 at 21:32
  • 2
    Assume your string is `abcdef`. Then, `x` is pointing to the "a" and `strlen` will return 6. If you increment `x`, then `x` is pointing to `bcdef` and `strlen` will return 5 – Craig Estey Mar 24 '18 at 21:32
  • 1
    ...unless gets() reads more than the buffer holds and crashes the process. – Martin James Mar 24 '18 at 21:50
  • 1
    ..or gets loads an empty string and so strlen(x+1) UB's. – Martin James Mar 24 '18 at 21:53
  • What happened when you tried it? – Martin James Mar 24 '18 at 21:54
  • Yes, although I seek some understanding of what is actually happening. The answer below is something I'm more than grateful for since I couldn't find such a detailed explanation anywhere else. I truly hope someone someday will find it useful. – randomuser Mar 24 '18 at 22:17

4 Answers

7

First of all, and sorry for the digression, but please, never use the obsolete gets() function. Please use fgets instead.
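
A minimal sketch of what the fgets replacement could look like. The helper name read_line is hypothetical, not something from the question. Unlike gets, fgets is told the buffer size, and it keeps the trailing newline, so the sketch strips it:

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into buf (capacity `size`).
   Returns buf on success, NULL on end of file or read error. */
char *read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || size > INT_MAX || fgets(buf, (int)size, in) == NULL)
        return NULL;
    /* fgets stores the '\n' if it fit; strcspn finds it (or the end). */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

Called as read_line(A, sizeof A, stdin), it never stores more than sizeof A bytes, which gets cannot promise. A line longer than the buffer is simply split across successive calls.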

In answer to your question, if x is a pointer to a valid nonempty string, strlen(x+1) will always equal strlen(x) - 1.

Suppose we have this string, with x pointing to it:

     +---+---+---+---+---+---+
  a: | H | e | l | l | o | \0|
     +---+---+---+---+---+---+
       ^
       |
   +---|---+
x: |   *   |
   +-------+

That is, x points to the first character of the string. Now, what strlen does is simply start at the pointed-to character and count characters until it finds the terminating '\0'.
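
The counting described above can be sketched like this. It is only an illustration of the idea under the made-up name my_strlen; real library versions are usually more heavily optimized:

```c
#include <stddef.h>

/* Illustrative sketch only: count characters up to, but not
   including, the terminating '\0'. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Nothing in the loop knows where the array began; it only knows the address it was handed, which is why the result depends on where the pointer currently points.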

So if we increment x, now it points to the 'e' (that is, it points to the string "ello"), like this:

     +---+---+---+---+---+---+
  a: | H | e | l | l | o | \0|
     +---+---+---+---+---+---+
           ^
           |
          /
         /
        /
       |
   +---|---+
x: |   *   |
   +-------+

So strlen will get a length that's one less.
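
To check this in code, here is a small sketch that records strlen at every position as the pointer walks forward. The function name suffix_lengths and its parameters are invented for the illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store strlen(x), strlen(x+1), ... into out[]
   (at most max entries) and return how many values were stored. */
size_t suffix_lengths(const char *x, size_t *out, size_t max)
{
    size_t n = 0;
    while (*x != '\0' && n < max) {
        out[n++] = strlen(x);
        x++;   /* now points one character further into the same array */
    }
    return n;
}
```

For "Hello" this stores 5, 4, 3, 2, 1. The loop stops at the '\0' instead of stepping over it, because a pointer moved past the terminator no longer points into the string.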


Footnote: I'm reminded of an amusing bug I've come across more than once. When you use malloc to allocate space for a string, you always have to remember to include space for the terminating '\0'. Just don't do it this way:

char *p = malloc(strlen(str + 1));

My co-worker did this once (no, really, it was a co-worker, not me!), and it was stubborn to track down, because it was so easy to look at the buggy code and not see that it wasn't

char *p = malloc(strlen(str) + 1);

as it should have been.
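
A sketch of the correct pattern, wrapped in a hypothetical helper called copy_string (the name is made up here):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of str,
   or NULL if allocation fails. The caller frees the result. */
char *copy_string(const char *str)
{
    size_t size = strlen(str) + 1;   /* +1 outside the call: room for '\0' */
    char *p = malloc(size);
    if (p != NULL)
        memcpy(p, str, size);        /* copies the terminator too */
    return p;
}
```

With str pointing to "Hello", strlen(str) + 1 is 6 while strlen(str + 1) is 4, so the buggy form allocates two bytes too few. If str is an empty string, str + 1 already points past the terminator.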

Steve Summit
  • Perfect! It's interesting that while taking algorithms and data structures, we were never told to include space for the '\0'. – randomuser Mar 24 '18 at 22:04
  • How did you draw the little pictures? Not by hand, I assume? – zzxyz Mar 24 '18 at 22:17
  • 1
    @zzxyz Yes, by hand, it's not hard, especially if you're handy with copy and paste. – Steve Summit Mar 24 '18 at 22:19
  • 1
    @frostpad the terminating `\0` is an implementation detail. Other languages use other techniques to know where the string ends; some of them store the size of the string and they don't need a terminator after it. The algorithms and data structures are theoretical concepts, they apply the same way no matter how the strings (or other data structures) are implemented by a particular language. – axiac Mar 25 '18 at 08:17
  • @SteveSummit *"`strlen(x+1)` will always equal `strlen(x) - 1`"* -- this is not true when `strlen(x) == 0`. – axiac Mar 25 '18 at 08:18
  • @axiac Sure, but do note that I said "if `x` is a pointer to a valid *nonempty* string". – Steve Summit Mar 25 '18 at 10:10
  • 2
    @axiac "the terminating `\0` is an implementation detail" You were addressing frostpad's lament that he didn't learn this fact in his algorithms class, but of course if we confine our attention to C, null termination isn't just an "implementation detail"; it's part of the language. – Steve Summit Mar 25 '18 at 10:13
  • @axiac - An 'implementation detail' to me is something you can forget about at least most of the time when using the language properly. There is not a single experienced C programmer that couldn't tell you this detail in their sleep. Because you need to know it for any non-trivial use of strings in the language. Unlike, say, C# strings. I could guess how the length is stored, and I'd almost certainly be right...but I'd be guessing. Which I'm guessing is why null-terminated strings aren't very popular in modern languages. – zzxyz Mar 26 '18 at 18:43
1

It will return one less than it did before. In C, a string is a sequence of characters ending in '\0', and you refer to it by a pointer to its first character, so if your string was

"ABCDEF"

and you increment the pointer, then instead of pointing to 'A' it will point to 'B', so the new string is

"BCDEF"

And strlen("BCDEF") is 5 whereas strlen("ABCDEF") is 6.

nemequ
0

It gives you a value one smaller. x += i is roughly equivalent to the pseudocode x = substr(x, i). (Note that if i is more than the length of x, the pointer ends up past the terminating '\0' and strlen(x) no longer measures your string.)
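
One way to guard against that, as a sketch only (skip_chars is a hypothetical name, not a library function), is to clamp the step to the string's length:

```c
#include <string.h>

/* Hypothetical helper: advance s by i characters, but never past
   the terminating '\0'. */
const char *skip_chars(const char *s, size_t i)
{
    size_t len = strlen(s);
    return s + (i < len ? i : len);
}
```

With s pointing to "ABCDEF", skip_chars(s, 2) points to "CDEF", and any i of 6 or more yields a pointer to the terminator, i.e. an empty string.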

0

Your code:

char A[50],*x;
gets(A);
x=&A[0];

strlen(x) was supposed to give me the length of the string

what happens as I increment x? Does strlen(x) now give me the same value as before or a smaller one and if so, why does it happen?

Well, the answer is more complicated than one might think: it depends!

Declaring A[50] inside a function makes the compiler reserve 50 bytes (typically on the stack), and those bytes are not initialized to any value.

Let's say the content of A happens to be

A[50] = { '5', '1', '2', '3', 0 /*.............*/ };

Then consider two scenarios:

a) user enters:<enter>

b) user enters:'7'<enter>

The content of the array A will then differ:

a) { 0, '1', '2', '3', 0 /*.............*/ };

b) { '7', 0, '2', '3', 0 /*.............*/ };

and the results of strlen may surprise you.

Here is a test program and its results:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char  A[50] = { '5', '1', '2', '3', 0 };
    char *x;

    gets(A);
    x=&A[0];

    for (int i=0; i < 5; i++)
        printf("%d: %02X\n", i, (unsigned char)A[i]);

    printf("strlen(x)  = %zu\n", strlen(x));
    printf("strlen(x+1)= %zu\n", strlen(x+1));

    return 0;
}

Test:

<enter>
0: 00
1: 31
2: 32
3: 33
4: 00
strlen(x)  = 0
strlen(x+1)= 3

7<enter>
0: 37
1: 00
2: 32
3: 33
4: 00
strlen(x)  = 1
strlen(x+1)= 0

As you know, strlen counts the number of characters from the starting position until the first '\0' is encountered. If the starting byte is equal to '\0', then strlen(x) = 0.

For scenario a) strlen(x), strlen(x+1) will be 0 and 3
For scenario b) strlen(x), strlen(x+1) will be 1 and 0.
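
The two scenarios can also be reproduced without gets or keyboard input. The sketch below assumes that storing the text with strcpy leaves the same bytes in the array as gets did in the test run above; the function name lengths_after_input and its parameters are made up for the illustration:

```c
#include <string.h>

/* Simulate the test: start from the bytes '5','1','2','3',0, overwrite
   the front with `input`, then report strlen(x) and strlen(x + 1).
   `input` must be shorter than the 50-byte array. */
void lengths_after_input(const char *input, size_t *len_x, size_t *len_x1)
{
    char A[50] = { '5', '1', '2', '3', 0 };   /* the rest is zero-filled */
    char *x = A;

    strcpy(A, input);          /* writes input plus its '\0', nothing more */
    *len_x = strlen(x);
    *len_x1 = strlen(x + 1);   /* may see leftover bytes after the '\0' */
}
```

Passing "" gives 0 and 3, and passing "7" gives 1 and 0, matching the output above. Here the leftover bytes are known because the array was initialized; with the uninitialized array from the question they could be anything.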

Please do not use gets (see "Why is the gets function so dangerous that it should not be used?"). Also notice that I print the characters as hexadecimal ASCII codes, e.g. '2' = 0x32.

sg7