10

I recently wrote a simple program to reverse a string, my program simply accepts an input string from user and then reverses it. This is all done with 2 pointers. Concerning the pointer safety in my program I wrote a function to duplicate the give string, This function allocates memory for a new (same length) string, and then copy the character 1-by-1.

My problem is when I run this program, Although It does what I need, It prints some mysterious extra output. It does this every time, before accepting input from the user. This is what happens when I run the program.

C:\Users\0xEDD1E\Desktop\revstr>revstr.exe
[C]
[.]
[revstr.exe]
hello
[hello]
olleh

here the last three lines are Input and Output, It's OK, The problem is in the first 3 lines

[C]
[.]
[revstr]

What are those? any way, here is my program

#include <stdio.h>
#include <stdlib.h>

#define swap(a, b) (((a) ^ (b)) && ((a) ^= (b), (b) ^= (a), (a) ^= (b)))

unsigned long strlen_(char *);
char *strdup(char *s);
char *reverseString(char *, int);

int main(void)
{
    //fflush(stdout);
    char *str = (char *) malloc(1024 * sizeof (char));
    scanf("%[^\n]s", str);
    int slen = strlen_(str);
    printf("%s\n", reverseString(str, slen));
    return 0;
}

unsigned long strlen_(char *s)
{
    char *p = s;
    while (*p) p++;
    return p - s;
}

char *strdup(char *s)
{
    char *p = (char *) malloc((size_t) (strlen_(s) + 1) * sizeof (char));

    if (p) {
        char *pp = p;
        while ((*pp++ = *s++)) ;
    }
    printf("[%s]\n", p);
    return p;
}


char* reverseString(char* s, int n) 
{

    char *str = strdup(s);
    char *p = str - 1;
    char *q = str + n;

    while (++p < --q) {
        swap(*p, *q);
    }

    return str;

}

you can see, looking at, strdup() function, those lines are generated by the printf() in the strdup() function (because of printf("[%s]\n, str);).

So I think I found the bug-point. But I can't figure out the reason for this error, and a way to fix it. Especially This happens only on Windows (mine is Windows 10 Pro). I tested this on Ubuntu(64bit), there is no bug like this in Ubuntu.

My instinct says me this error has to do something with pointers used in the functions

UPDATE: I tried fflush(stdout) but that didn't help

What is the reason behind this Bug (or misbehave), could someone explain the reason?

0xEDD1E
  • 1,172
  • 1
  • 13
  • 28
  • 3
    [Don't cast the result of `malloc` in C](http://stackoverflow.com/q/605845/995714) – phuclv Apr 23 '16 at 13:48
  • C11 draft standard n1570, *6.5.6 Additive operators 8 [...]If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.* `str - 1` is undefined. – EOF Apr 23 '16 at 13:50
  • thanks for the advice – 0xEDD1E Apr 23 '16 at 13:51
  • 3
    the scan specifier should be `"%1024[^\n]"`, not `"%[^\n]s"`. And `sizeof(char)` is redundant because it's always 1 – phuclv Apr 23 '16 at 13:51
  • Are you by any means using an implementation of `malloc()` different from the one provided by the compiler? – alk Apr 23 '16 at 13:53
  • @alk I have several compilers installed but I use only MingW GCC 4.9.3 for compiling. – 0xEDD1E Apr 23 '16 at 13:57
  • 1
    one more thing, [xor swap is not recommended](https://en.wikipedia.org/wiki/XOR_swap_algorithm#Reasons_for_avoidance_in_practice) because it's slower, less readable and introduces a lot of dependencies. Use simple swaps instead because modern compilers will all recognize and optimize them away – phuclv Apr 23 '16 at 13:57
  • 8
    It looks to me like the C-library startup/initialization uses `strdup()`. You provide a `strdup()` in your code, which `printf()`s the strings the init-code duplicates. – EOF Apr 23 '16 at 14:00
  • @LưuVĩnhPhúc, Oh that I did that for the educational purpose, that's the idea behind this program, any way thank you for pointing out the facts – 0xEDD1E Apr 23 '16 at 14:01
  • Good catch! @EOF. After the 2nd faux-pas today, I feel it's time to leave for the day ...:} – alk Apr 23 '16 at 14:03
  • 1
    @EOF, It worked!, I renamed the `strdup()` and the error is gone! thank you very much. – 0xEDD1E Apr 23 '16 at 14:04
  • 4
    Since you know that the extra output is coming from the `printf` in `strdup`, why not set a breakpoint on the `printf` and see who is calling `strdup`? (As noted, [redefining standard library functions is illegal](http://stackoverflow.com/questions/17631490/implementing-a-new-strcpy-function-redefines-the-library-function-strcpy).) – Raymond Chen Apr 23 '16 at 14:04
  • `scanf("%[^\n]s", str);` you don't need `[^\n]` anyway.. – sjsam Apr 23 '16 at 14:06
  • 2
    @sjsam the `s` is redundant also (it can never be matched - if `]` completes then `\n` must be the next character) – M.M Apr 23 '16 at 14:11
  • 2
    @sjsam No. `scanf("%[^\n]", str);` and `scanf("%s", str);` aren't the same. The former scans everything until a newline character while the latter scans everything (ignoring leading whitespaces) until a whitespace. – Spikatrix Apr 23 '16 at 14:12
  • @LưuVĩnhPhúc what do you mean by "introduces dependencies" in OP's XOR swap? – M.M Apr 23 '16 at 14:15
  • @CoolGuy: Sorry pal !! That was me a bit whiter than usual.. I just wanted to say what M.M just said. – sjsam Apr 23 '16 at 14:16
  • @M.M it means xor swap introduces dependencies between reads and writes so that the instructions must be done in-order. You can see that in the wikipedia article – phuclv Apr 23 '16 at 14:16
  • @LưuVĩnhPhúc, could you please provide the wikipedia page? – 0xEDD1E Apr 23 '16 at 14:23
  • 1
    @0xEDD1E it's in my previous comment https://en.wikipedia.org/wiki/XOR_swap_algorithm#Reasons_for_avoidance_in_practice – phuclv Apr 23 '16 at 14:26
  • Also, a lot of problems can arise due to UB if your `malloc` failed. And don't forget to `free` the allocated memory. – Spikatrix Apr 23 '16 at 14:31
  • 1
    do not write functions that are the same name as the system functions. I.E. give your functions: `strlen()` and `strdup()` unique names: for instance: `myStrlen()` and `myStrdup()` – user3629249 Apr 23 '16 at 18:31
  • 1
    when compiling, always enable all the warnings, then fix those warnings. – user3629249 Apr 23 '16 at 18:32
  • 1
    in your function: `strdup()`, this line: `printf("[%s]\n", p);` will be trying to print the characters at address 0 when the call to `malloc()` fails. Place the call to `printf()` inside the 'if()` code block – user3629249 Apr 23 '16 at 18:37
  • 1
    in the function: `reverseString()`, if the call to `strdup()` returns NULL, then the code following the call to `strdup()` will result in undefined behaviour and can lead to a seg fault event. – user3629249 Apr 23 '16 at 18:40
  • here is the algorithm for swapping two variables: `X := X XOR Y Y := Y XOR X X := X XOR Y` Notice there are only 3 elements, not 4 elements – user3629249 Apr 23 '16 at 18:43
  • 1
    in the `main()` function, the returned value from `reverseString()` can be NULL, so must not pass the returned value to `printf()` without first testing for NULL. – user3629249 Apr 23 '16 at 18:46
  • 1
    regarding this line: `scanf("%[^\n]s", str);` there are a couple of problems. 1) the format string has a spurious trailing 's'. 2) the input should be limited to 1 less than the length of the input buffer. Suggest: `scanf("%1023[^\n]", str);` And be sure to check the returned value (not the parameter value) to assure the operation was successful. – user3629249 Apr 23 '16 at 18:53
  • 1
    when calling the function: `malloc()` the returned value has type `void*`. A `void*` can be assigned to any pointer. Casting the returned value just clutters the code, making it much more difficult to understand, debug, maintain. (the expression `sizeof(char)` has already been discussed in a prior comment.) so the line be: Note: `strlen()` should return a `size_t`, not a `unsigned long` `char *p = malloc(strlen_(s) + 1);` – user3629249 Apr 23 '16 at 18:57
  • the `swap` macro: 1) should be a 'inline' function 2) should be written like: `#define swap(a, b) ( \ (a) ^= (b); \ (b) ^= (a); \` where each '\' is at the end of a line in the code (a) ^= (b); )` – user3629249 Apr 23 '16 at 19:03
  • 1
    @user3629249 that would be an error, you can't have semicolon-separated statements in side a parenthesized expression. The existing macro works – M.M Apr 24 '16 at 02:00
  • @M.M, Your right, My mistake, the wrapper should be braces not parens. The macro should be written like this: (where the '\' is at the end of each line in the macro) `#define swap(a, b) { \ (a) ^= (b); \ (b) ^= (a); \ (a) ^= (b); }` Still, writing a function like macro is discouraged, Much better to write a function with the 'inline' modifier. Note the posted macro `swap` results in the compiler outputting a warning about a computed value not being used, so the posted macro is not correct. – user3629249 Apr 24 '16 at 14:35

1 Answers1

16

Probably, your system's C library also defines a function strdup, and some code in the program startup calls that function. Then the linker links those calls to your strdup function instead of the system function.

To avoid this, don't name your function strdup, or str anything for that matter: names starting str and a lowercase letter are reserved (C11 7.31.12, 7.31.13); using them as function names in your own code causes undefined behaviour.

NB. char *p = str - 1; also causes undefined behaviour by doing pointer arithmetic outside the bounds of any object. (For example, imagine if str happens to be the first address in the address space). It would be good to restructure your algorithm to not point out of bounds.

M.M
  • 138,810
  • 21
  • 208
  • 365
  • yes, I understand matter with `str - 1` I had to do that as a quick adjustment for the `while` loop, I forgot to correct it. thanks any way! – 0xEDD1E Apr 23 '16 at 14:20
  • Doesn't `char *p = str - 1;` always invoke UB and not only when `str` happens to be the first address in the address space due to the fact that `p` doesn't point to a valid location (such as `str`) or one past the `malloc`ed memory segment? – Spikatrix Apr 23 '16 at 14:43
  • @CoolGuy yes, my example was some rationale as to why out-of-bounds pointer arithmetic is not defined. Some readers may be surprised to learn this fact and wonder why. – M.M Apr 23 '16 at 14:45