1

I have code for reversing a string. Let's say I type 'ABC'; the output will be 'CBA'. However, there are some lines of code I don't quite understand.

1    #include <stdio.h>
2    #include <string.h>
3
4    void print_reverse(char *s) {
5        size_t len = strlen(s);
6
7        char *t = s + len-1;
8        while(t >= s) {
9            printf("%c", *t);
10           t = t-1;
11       }
12       puts("");
13   }
14
15   int main()
16   {
17       char charinput[100];
18       printf("Enter character you want to reverse:");
19       fgets(charinput, 100, stdin);
20       print_reverse(charinput);
21       getchar();
22   }

What do lines 7 and 8 do? What would be the output for the pointer t?

Garrett Hyde
Chozo Qhai

6 Answers

8

The posted code uses the following algorithm:

  • Line 7: set the pointer t to the last character in the string (note: this will be a newline character if the user entered fewer than 99 characters, because fgets keeps the newline). The -1 moves one character back from the terminating null character.
  • Lines 8-10: This is the core of the reverse-printing loop. The condition checks whether the address held in t is greater than or equal to the address of the beginning of the string. As long as it is, the loop body is entered and the character at the address held in t is sent to stdout via printf(). The address in t is then decremented by one element (one byte, since sizeof(char) is 1 by definition) and the loop repeats. The loop breaks only when t holds an address before s (and note: this is not sanctioned by the standard; see below for why).
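Because of the newline noted for line 7, the reversed output of the posted program starts with a line break. A minimal sketch of one common way to drop it before reversing follows; `strip_newline` is a hypothetical helper name, not part of the posted code, and it relies only on the standard `strcspn`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in the posted code): cuts buf at the first
   '\n' left behind by fgets, if any, and returns the resulting length. */
size_t strip_newline(char *buf) {
    assert(buf != NULL);
    size_t n = strcspn(buf, "\n");  /* index of the first '\n', or strlen(buf) if none */
    buf[n] = '\0';                  /* overwrites the '\n' or the existing terminator */
    return n;
}
```

Calling `strip_newline(charinput);` right after the `fgets` call would be one place to use it, assuming the newline is not wanted in the output.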

Something you should know about this loop (and should point out to the author if it isn't you): the final pointer comparison is not standard-compliant. The standard specifies that relational comparison between non-null pointers of like type is valid from the base address of a valid sequence (charinput in this code, passed in through the parameter s) up to and including one element past the end of the allocated region. This code compares t against s, breaking the loop only when t is "less". But as soon as t is less than s, its value can no longer be legally compared against s, because t no longer holds a valid address in the range from charinput through one past the end of the charinput memory block.

My original suggestion for avoiding that comparison was the following (it still has a flaw; see the edit after it):

t = s + len;
while (t-- > s)
    printf("%c", *t);

Edit: after a journey into the standard, prompted by Paul Hankin, the prior code has been rewritten to remove an undefined-behavior condition I had not noticed: its final post-decrement still moves t to before s. The updated code is below:

t = s + len;
while (t != s)
    printf("%c", *--t);

This will also work for zero-length strings. How it works is as follows:

  • t is set to the address of the terminating null character of the string.
  • Enter the loop; the condition is to continue as long as the address in t is not equal to the base address in s.
    • Decrement t, then dereference the resulting address to obtain the current character, sending the result to printf.
    • Loop around for next iteration.
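As a sketch of how the updated loop could sit inside complete functions: `print_reverse` below mirrors the function from the question, while `reverse_copy` is a hypothetical variant added here only so the result can be checked without capturing stdout. Both assume `s` is a valid null-terminated string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant: writes the characters of s into out in reverse
   order. out must have room for strlen(s) + 1 bytes. */
void reverse_copy(const char *s, char *out) {
    assert(s != NULL && out != NULL);
    const char *t = s + strlen(s);  /* points at the terminating '\0' */
    while (t != s)                  /* stop once t is back at the start */
        *out++ = *--t;              /* step back first, then read */
    *out = '\0';
}

/* Same loop, printing each character instead of storing it. */
void print_reverse(const char *s) {
    assert(s != NULL);
    const char *t = s + strlen(s);
    while (t != s)
        putchar(*--t);
    putchar('\n');
}
```

In both functions t only ever holds addresses from s through s + strlen(s), so it never leaves the range in which pointer arithmetic is defined, and an empty string simply skips the loop.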
WhozCraig
  • @haccks ok. good. your analysis is accurate, that was the only thing I wanted to mention (and if you're like me, the first thing you started wondering after discovering this was "Gee, how often have I violated that rule in the past?" =) – WhozCraig Oct 12 '13 at 17:50
  • Just decrementing t beyond the start of the string is already undefined behavior, without the comparison. – Paul Hankin May 28 '15 at 07:58
  • @PaulHankin Step into the standard and cite where applying a post-decrement *without* subsequent eval invokes UB. At no point is a value outside the range of `s` evaluated. The post-decrement is intentional, and this was how I was taught to do this many, *many* moons ago, so if it is wrong I'd like to know why. – WhozCraig May 28 '15 at 08:04
  • Your code also decrements t past s too, so it also exhibits undefined behavior. – Paul Hankin May 28 '15 at 08:04
  • 6.5.6.8 in the standard. Describing the result when you add a pointer and an integer: "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." – Paul Hankin May 28 '15 at 08:08
  • And 6.5.2.4.2-3 describes that `p--` has the same effect as subtracting one from `p` (and references 6.5.6 for discussion, effects and constraints). – Paul Hankin May 28 '15 at 08:14
  • Thanks. Now, what is the "result" in that context applied to the code posted? You submit the result is the value stored in `t`, and to that I concur its contained value is indeed undefined. Were I to utilize that value in any way (such as the original code of the question), it would indeed invoke UB. But that isn't the result being evaluated. I'm quite certain we won't agree on this, and perhaps it shall make for an interesting question if not already posted. I would find it incredibly hard to believe it isn't already up here somewhere. – WhozCraig May 28 '15 at 08:14
  • In that paragraph, "result" is the result of an expression that adds an integer to a pointer. The standard makes no distinction about where the result is stored (or whether it's stored at all). – Paul Hankin May 28 '15 at 08:24
  • @PaulHankin you da man. I finally succumbed to this [as a posted question](http://stackoverflow.com/questions/30512669/does-applying-post-decrement-on-a-pointer-already-addressing-the-base-of-an-arra) and your assessment seems certainly valid. I shall edit the answer to include an addendum with a non-UB response. Thank you SO much for taking the time to peruse answers like this. It is *much* appreciated. – WhozCraig May 28 '15 at 17:33
3

Let's understand it step by step:

  1. len = strlen(s) assigns to len the length, in bytes, of the string s points to, not counting the terminating null character (say len is 10).

  2. s points to the first character of the string. Let's assume the address of the first element of this string is 100; then s contains 100.

  3. Adding len-1 to s gives 109.

Now, line 7

   char *t = s + len-1;

makes t point to the element at address 109, i.e., the last element of the string.

Line 8

   while(t >= s) {

tells the compiler that the loop will continue until t points to something before the first element of the string.
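To see which positions t visits, the sketch below records the offset t - s at each step. With the numbers assumed above (s holding 100 and len equal to 10), the first offset is 9, which corresponds to address 109, and the last is 0, which corresponds to address 100. `trace_offsets` is a hypothetical helper, and it uses a `t != s` test with a pre-decrement rather than the posted `t >= s`, so t never moves before s.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: walks s from its last character to its first and
   stores the offset (t - s) of each visited character in offsets, up to
   cap entries. Returns the number of characters visited. */
size_t trace_offsets(const char *s, ptrdiff_t *offsets, size_t cap) {
    assert(s != NULL && offsets != NULL);
    size_t n = 0;
    const char *t = s + strlen(s);  /* one past the last character */
    while (t != s) {
        --t;                        /* now points at a real character */
        if (n < cap)
            offsets[n] = t - s;     /* distance from the start of the string */
        ++n;
    }
    return n;
}
```

Adding each recorded offset to the address in s gives the address t held on that iteration.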

haccks
  • Similar to Matthias' answer, the break-condition in this description is not accurate. `while(t >= s)` tells the compiler to loop until `t` points to something *before* `s`; not until it points to the first element. This is important, as it is what makes that loop non-compliant with the standard. – WhozCraig Oct 12 '13 at 09:55
  • @WhozCraig; I do not understand what you are saying. (Sorry. Please elaborate.) – haccks Oct 12 '13 at 09:57
  • See the second part of my answer. It pretty much tells it. – WhozCraig Oct 12 '13 at 10:00
  • @haccks I think @WhozCraig's point is that the pointer `t` should only be assumed valid from `&s[0]` to `&s[strlen(s)+1]`. The routine attempts to give `t` the value `&s[-1]`. Although this may work in many situations, you have created a pointer that is not _guaranteed_ to compare less than `s`: `t` is before the range over which `s` is known to be good. C11dr 6.5.6 8 "... point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined" – chux - Reinstate Monica Oct 12 '13 at 15:28
  • @chux; I do not understand what is wrong in saying: `loop will continue until t points to first element`? – haccks Oct 12 '13 at 16:04
  • @haccks There is nothing wrong with "until t points to first element". This is not the issue. It is "until t points to _something before_ first element". – chux - Reinstate Monica Oct 12 '13 at 16:07
  • @chux; I understand, @WhozCraig is right. `until t points to first element` is not a condition to break the loop. – haccks Oct 12 '13 at 16:12
  • @haccks Example: Pointer `s` has the value 0x123400000000. Now on some machine, `size_t` is 4 bytes. Strings can range in _size_ from 1 ("") to power(2,32) - 1. The OS takes advantage of the 32-bit limit and performs only 32-bit math on pointers, knowing the 0x1234 is left alone. Trouble occurs when trying to get before 0x123400000000. A decrement that operates on only 32 bits would produce 0x1234FFFFFFFF, rather than the mathematically expected 0x1233FFFFFFFF. `while(t >= s)` would loop endlessly. This is why C specifies that pointers derived from an array are only valid from the array beginning to one past the end. – chux - Reinstate Monica Oct 12 '13 at 16:30
  • @chux; This is really a good example. But two more questions: 1. Why would the OS consider only 32 bits in pointer arithmetic? 2. What is the correct way to exit the loop in this case? – haccks Oct 12 '13 at 16:58
  • @haccks The sample chux gave is an implementation-detail left to it; not you. By following the standard you avoid the problem he described. And if you want to see a loop that will still work in such an implementation, again, see my answer (the bottom), and think about how it works, writing it with some sample data on a piece of paper and manually running the algorithm if needed. – WhozCraig Oct 12 '13 at 22:36
  • @haccks The hypothetical OS would use only 32 bits for speed. It might be a 32-bit processor but with a 48-bit address space. Other examples: segment:offset, page tables, and http://en.wikipedia.org/wiki/Physical_address – chux - Reinstate Monica Oct 13 '13 at 14:50
1

Line 7: pointer t points to the last character (s+len-1).
Line 8: repeat the steps while the address in t is greater than or equal to the address in s. Suppose s points to the first character of the input string at address 1101; the address of the next character is 1101+1=1102, the third is 1102+1=1103, and so on. So in line 7, t points to 1101 + len-1, which is 1101+10-1 (1110) if your input is 10 characters long.
Line 9: print the character at the address t points to.
Line 10: t is decremented by 1 and now points to the character immediately to the left.
Lines 9 and 10 repeat while the address in t is greater than or equal to the address in s (1101 in my illustration; t starts at 1110).

haccks
Miller
0

t starts out pointing at the last character of the string s, and in the following loop it is decremented until it points to the first character. On each loop iteration the current character is printed.

Matthias247
  • You may want to reconsider the "until" clause of this answer. Pointing to the first char isn't what breaks the loop. And the mechanism it *does* employ isn't standard-compliant. – WhozCraig Oct 12 '13 at 09:28
0

Line 7 sets the pointer t to point to the last character of the string s. Line 8 is a while loop (which goes backward through the string, until the beginning). The pointer t is the current position in the string, and the character it points to is output on line 9.

cyphar
0

char *t = s + len-1; : To point to the last character of string s.

while(t >= s) : To scan all the characters of string s in reverse order (as s points to the first character and we have made t point to the last character in line 7).

Hope this helps.

exAres