0

I am new to C and facing an odd behavior when using malloc.

I read a input text from stdin (fgets) and pass it to a function myfunction.

void myfunction(char* src) {

     printf("src: |%s|\n", src);

     int srcLength = strlen(src);
     printf("src length: %d\n", srcLength);

     // CAUSES ODD BEHAVIOR IN MY SITUATION
     // char* output = malloc(200);
     //
     // if (output == NULL) {
     //   exit(EXIT_FAILURE);
     // }

     for (int i=0; i < srcLength; ++i) {
       char currChar = src[i];
       printf("|%c| ", currChar);
     }
}

When executing the function without malloc (see comment), I am getting this:

src: |asdf|
src length: 4
|a| |s| |d| |f|

But with malloc, I am getting this awkward behaviour. As if there were no chars in the char*:

src: |asdf|
src length: 4
|| || || || 

There might be an issue with the char* src (from stdin). But I am unsure because the input string is printed correctly (src: |asdf|).

Can anybody support me, how to analyze the source of the problem?

UPDATE 1:

Here is the code for reading from stdin and invoking myfunction.

int main(int argc, char **argv) {

  char *input = NULL;
  input = readStdin();
  myfunction(input);
  return EXIT_SUCCESS;
}

char* readStdin(void) {
    char buffer[400];
    char *text = fgets(buffer, sizeof(buffer), stdin);
    return text;
}

The myfunction and readStdin are in different files, but I hope it does not matter.

UPDATE 2:

As proposed by the supporters in the comments, I did a resolution of the scope issue.

I changed the function prototype of readStdin to:

 char* readStdin(char* input);

And I invoke readStdin with the allocated input.

 char* input = malloc(400);

In readStdin I replaced buffer with the function parameter.

Bernhard
  • 703
  • 5
  • 25
  • 2
    Can you show the code where you are reading the string from stdin? What happens if you move that print of the source string to after the malloc? Is the only difference between those outputs that you uncomment that line of code or are you changing source of your print statement? – pstrjds Oct 29 '17 at 17:46
  • 7
    You need to post a [mcve]. The `malloc()` has no apparent effect in your code but if you've corrupted the heap elsewhere even apparently harmless code can cause an underlying problem to exhibit. – Persixty Oct 29 '17 at 17:47
  • 1
    What happens if the `malloc()` is done before the `printf()` of the src? I am wondering if that still prints `src` correctly. – donjuedo Oct 29 '17 at 17:48
  • @pstrjds Moving the `printf("src:` below the malloc leads to `src: ||`. I do not change anything else except for the comment. Does not sound easy to solve... I will see if I can build a minimal, full reproducible example. – Bernhard Oct 29 '17 at 17:50
  • 2
    So, obviously you have corrupted the heap elsewhere in your code. Please post the code where you are reading the string from stdin and where you are calling this method. As has been stated, please post an [MCVE](https://stackoverflow.com/help/mcve) – pstrjds Oct 29 '17 at 17:52
  • How is src allocated? – Coderino Javarino Oct 29 '17 at 17:52
  • 1
    @Bernhard_S - I would guess that if you post your `main` and this function, that would be an MCVE. If you are new to C than I would not expect your program is very long. You could probably just post that. – pstrjds Oct 29 '17 at 17:58
  • @pstrjds I will post it in a second. Unfortunately the program has other functionalities too, so there is a little need to minify. – Bernhard Oct 29 '17 at 18:02
  • 1
    `text` : It is invalid outside function scope. – BLUEPIXY Oct 29 '17 at 18:13
  • You return `text` from `readStdin`. `text` is an alias to `buffer`, which is a local variable that goes out of scope after you return from `readStdin`. – M Oehm Oct 29 '17 at 18:14
  • So sometimes the print does work, but it causes odd behavior? – Bernhard Oct 29 '17 at 18:15
  • 2
    Once you've left `readStdin`, `buffer` becomes invalid. Accessing it is _undefined behaviour_, which you should avoid. (I guess that the problem is not the `malloc`, but the additional variable `output`, which happens to occupy part of the space that was previously occupied by `buffer`, thus corrupting it.) – M Oehm Oct 29 '17 at 18:18
  • Is my proposed solution for the scope-issue ok (see Update 2)? It seems to work. – Bernhard Oct 29 '17 at 18:32
  • Yes, that will work. The only thing I would say is that you are allocating the full `400` `bytes`. This is a non-issue in your example, but consider processing numerous strings of unknown lengths (i.e. when reading from a file). Instead of allocating 400 bytes for each line, you can have an automatic buffer of size 400 that you keep re-using, and then you only allocate the memory that you need when the size of each line is known at runtime. See the example in my answer. This is totally splitting hairs, but good to think about these things while you're learning. – Nik Oct 29 '17 at 18:55

3 Answers3

1

Odd behaviour when using malloc

Yes, it's odd.... or perhaps not. You code has undefined behavior so everything may happen.

The problem is that text will end up as a pointer to buffer if fgets is succesful. But buffer is a local variable in the function so as soon as readStdin returns the variable buffer doesn't exist anymore. Therefore you pass myfunction an invalid pointer and when you use it (i.e. for read/write), you have undefined behavior.

Once you have undefined behavior it makes no sense to reason about what is going on.... but if we try to do that anyway, a likely explanation for most systems is:

buffer was located on the stack. When readStdin returns, the stack pointer is decrement (or incremented) so that buffer is now on the unused part of the stack. When you call a new function, the new function will also need some stack space. How much depends on the number of variables used by the function. In other words - the more variables, the more stack space will be needed. Since the new variables will overwrite some part of the stack - and thereby overwrite parts of the memory holding the obsolete buffer variable - the amount of destruction of buffer may change with the number of variables in the function call. That is probably what you see.

But notice that the above explanation is system specific. It is not something specified by the C standard. Still, that is how most systems work.

What to do?

Instead of

char buffer[400];

do

char* buffer = malloc(400);
Support Ukraine
  • 42,271
  • 4
  • 38
  • 63
  • Thank you that works too! :) So, the `char* buffer` will exist after the `readStdin` returns, because it is a pointer to the heap and `char buffer[400]` is only local and on the stack? – Bernhard Oct 29 '17 at 18:45
  • @Bernhard_S Exactly.... but notice.... This is not specified by the C standard but it is how most systems works. The C standard does not mention stack and heap at all :-) – Support Ukraine Oct 29 '17 at 18:47
0

As many have stated in the comments, you have scope issues. To avoid this, you need to allocate memory inside readStdin(). Quoting @MOehm from comments

Once you've left readStdin, buffer becomes invalid. Accessing it is undefined behaviour, which you should avoid. (I guess that the problem is not the malloc, but the additional variable output, which happens to occupy part of the space that was previously occupied by buffer, thus corrupting it.)

#define SIZE 400

char* readStdin(void) {
    char buffer[SIZE];
    char *text = NULL

    fgets(buffer, sizeof(buffer), stdin);
    text = malloc(sizeof(char) * (strlen(buffer) + 1));//allocate memory
    strcpy(text, buffer);//and copy the buffer into it.

    int length = strlen(text);
    if (text[length - 1] == '\n') {
        text[length - 1] = '\0';
    }
    return text;
}

And your main function should now look like this:

int main(int argc, char *argv[]) 
{
    char *input = NULL;
    input = readStdin();
    myfunction(input);
    free(input);//must now free it.
    input = NULL;
    return EXIT_SUCCESS;
}

Try making these changes and you will see that all your problems will go away. Now MyFunction will work as expected.

Nik
  • 1,780
  • 1
  • 14
  • 23
-1

I don't understand why you don't get compile errors when doing char* output = malloc(200);: you MUST cast into char*, like char* output = (char*) malloc(200);

ico2k2
  • 61
  • 5
  • [Read This](https://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc?rq=1). – Nik Oct 29 '17 at 19:05