0

I know this has been asked many many times, but it seems like the occurrence of this issue is slightly different every time.

I have the following C application, that receives a string from a function called word_generator(), then every incoming string is filtered and once the condition is met, the current string is stored in a char array called output_char_buffer.

Since the incoming data has a variable length, the char arrays involved in the process needs to be dynamically resized.

The application seems to be working, but the static analysis tool is complaining with the following message:

warning: V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'output_char_buffer' is lost.
Consider assigning realloc() to a temporary pointer.

Here is the code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *word_generator(int selector)
{
    switch(selector)
    {
        case 0:
            return "gpu-log-02_05_2022_12_37_56_784";
            break;
        case 1:
            return "glsl-debug.txt";
            break;
        case 2:
            return "compiler.log";
            break;
        case 3:
            return "shader.pub";
            break;
        case 4:
            return "fluid-sim-variantA.cache";
            break;
        default:
            printf("ERROR: Request out of range!\n");
    }
    return "";
}


int main() {

    char *output_char_buffer;
    output_char_buffer = NULL;

    // Simulate incoming data.
    for (int i = 0; i < 5; i++)
    {
        printf("Test string[%d]: %s\n", i, word_generator(i));
        
        unsigned long local_buffer_length = strlen(word_generator(i));
        unsigned long input_buffer_length = 0;

        char *input_char_buffer = (char*)malloc(local_buffer_length + 1);
        
        if (input_char_buffer != NULL)
        {
            strcpy(input_char_buffer, word_generator(i));
            input_buffer_length = strlen(input_char_buffer);
        }
        else
        {
            // Clean-up.
            free(input_char_buffer);
            input_char_buffer = NULL;

            printf("ERROR: Failed to allocate char buffer memory!\n");

            // Exit with an error state.
            return 1;
        }
        
        // Verbose debug.
        printf("\tCurrent input buffer (value: %s, length: %lu)\n", input_char_buffer, input_buffer_length);
        
        char key[] = "compiler.log";
        
        // Verbose debug.
        printf("\tCurrent key (value: %s, length: %lu)\n", key, strlen(key));

        if (strcmp(input_char_buffer, key) == 0)
        {
            printf("\t\t__MATCH__\n");
            
            output_char_buffer = (char*)realloc(output_char_buffer, (local_buffer_length + 1));
            
            if (output_char_buffer != NULL)
            {
                strcpy(output_char_buffer, input_char_buffer);
            }
            else
            {
                // Clean-up.
                free(output_char_buffer);
                output_char_buffer = NULL;
                
                printf("ERROR: Failed to fetch char buffer memory!\n");
                
                // Exit with an error state.
                return 1;
            }
        }

        // Clean-up.
        free(input_char_buffer);
        input_char_buffer = NULL;
    }

    // Check the final value of the string container.
    printf("Result: %s\n", output_char_buffer);
    
    // Clean-up and finish.
    free(output_char_buffer);
    output_char_buffer = NULL;
    
    return 0;
}

Output:

Test string[0]: gpu-log-02_05_2022_12_37_56_784
        Current input buffer (value: gpu-log-02_05_2022_12_37_56_784, length: 31)
        Current key (value: compiler.log, length: 12)
Test string[1]: glsl-debug.txt
        Current input buffer (value: glsl-debug.txt, length: 14)
        Current key (value: compiler.log, length: 12)
Test string[2]: compiler.log
        Current input buffer (value: compiler.log, length: 12)
        Current key (value: compiler.log, length: 12)
                __MATCH__
Test string[3]: shader.pub
        Current input buffer (value: shader.pub, length: 10)
        Current key (value: compiler.log, length: 12)
Test string[4]: fluid-sim-variantA.cache
        Current input buffer (value: fluid-sim-variantA.cache, length: 24)
        Current key (value: compiler.log, length: 12)
Result: compiler.log

The line flagged for the issue is this one:

output_char_buffer = (char*)realloc(output_char_buffer, (local_buffer_length + 1));

What would be the safe way to handle this problem?

chqrlie
  • 131,814
  • 10
  • 121
  • 189
mbilyanov
  • 2,315
  • 4
  • 29
  • 49
  • Don't "lose" the original pointer until you know that `realloc` has succeeded. – Scott Hunter May 29 '22 at 18:25
  • I can't see where I'm loosing the original pointer. Probably have been staring at this thing for too long. – mbilyanov May 29 '22 at 18:28
  • 1
    The original pointer was the value of `output_char_buffer`. When `realloc` returns NULL, the NULL is stored in `output_char_buffer`. Where is the original pointer now? I think the error message and its suggestion are pretty clear. – rici May 29 '22 at 18:30
  • @rici: Whether or not the abandonment of the original pointer is a problem would depend upon whether the application is expected to continue operation after a realloc() failure. If a program is going to be terminated as soon as practical after a realloc() failure, the fact that the original allocation won't get freed until the program terminates won't be an issue. Fundamentally, the design of the Standard allocation library focused more on avoiding offering any features that wouldn't be universally supportable, and avoid breaking any existing programs that relied upon internal details of... – supercat May 29 '22 at 18:39
  • The original `output_char_buffer` is not pointing to anything at the beginning. If it fails to `realloc`, then there is nothing to free? – mbilyanov May 29 '22 at 18:39
  • ...existing implementation, than on ensuring that it provided the best blend of semantics and cost for most purposes. Having a realloc() function accept a size_t* and make available the new size of the allocation would have been more useful than having it return null in case of failure, especially for scenarios were code might be able to adjust its behavior according to the amount of usable storage. – supercat May 29 '22 at 18:41
  • 1
    @mbilyanov: yes, but it's in a loop. So there is a second and a third call. As written, the call to `free(output_char_buffer)` in the `else` clause is completely pointless because that clause is only executed if `output_char_buffer` is NULL. But that does not mean there is nothing to free – rici May 29 '22 at 21:28
  • @supercat: certainly there are use cases in which the `malloc` (& friends) interface is not optimal, something which could equally well be said for many features of the standard C library. On the other hand, millions of programmers have successfully used it, and many continue to do so. Looking at other languages' standard libraries, designed afterwards, does not actually give much support to the value of hindsight. You or I might have come up with different designs, but who knows what people would be saying about them 40 years later :-) – rici May 29 '22 at 21:38
  • You are certainly correct that meticulous bookkeeping of allocations is a waste of programmer and execution time if the application has no plans to continue running after allocation failure. But it's necessary if you're writing, say, a multithreaded high-uptime server. (My approach to that environment is to use pool allocators, rather than malloc, but that's just me.) – rici May 29 '22 at 21:42
  • I understand that this might be a duplicate, but I would like to thank you all for the help and the constructive comments :) The whole thread has been very educative. – mbilyanov May 29 '22 at 22:10
  • @rici: Non-Unix programmers in the 1990s generally avoided malloc() family functions in favor of OS functions that would offer better semantics or performance, e.g. allowing programs to query the actual sizes of allocations, or having lower overhead in exchange for requiring that programmers pass free() the size of the allocation being released. The C Standard opts for a "worse of both worlds" approach, requiring that implementations expend the overhead of tracking allocation sizes, but requiring that programs needing that information *also* track it for themselves. – supercat May 30 '22 at 00:06
  • @rici: Code which needs to be robust when hitting memory limits shouldn't be using malloc() and realloc() during allocation, but should instead allocate storage via some mechanism that allows programs to know how much memory they will be able to allocate safely, indicate that allocations may be considered "purgeable" when not locked, or use other such mechanisms the standard library lacks. Code which is using `malloc()` and `realloc()` during operation will in many cases have no way of handling low-memory sitautions robustly, no matter how hard programmers try. – supercat May 30 '22 at 23:14
  • @supercat: it's possible that we could have an interesting chat about programming, despite everything, but this is not the place to do it. – rici May 31 '22 at 00:41
  • @rici: My point is that the amount of effort required to recover gracefully from a realloc failure would generally be better spent avoiding the use of `realloc` in any situations where graceful recovery would be needed. – supercat May 31 '22 at 15:29
  • @supercat: In my (not totally hypothetical) example, for graceful recovery it's sufficient for the thread to release all its resources. Forked servers are easier, of course, because the process can simply die. Following Joe Armstrong's famous dictum, I prefer fork despite the possible inefficiency. But Apache (and others) like to thread. The Apache experience suggests that pool allocation (which can be built on top of malloc) is easier for programmers to manage. – rici May 31 '22 at 15:38
  • Apache adds a layer to memory management which omits `free()`; you can only free an entire pool. I've used that model extensively, and am reasonably convinced that it's workable, although it can lead to excessive string duplication. To avoid that, I tend to use immutable strings, reserving realloc for string builders. So that's probably along the same lines as your comment. But I repeat, this is really not a useful forum to host this discussion. I'm sure a moderator will eventually come by and delete it – rici May 31 '22 at 15:46

1 Answers1

1

There are 2 instances of realloc in the code. In the second one, the original pointer is indeed overwritten:

output_char_buffer = (char*)realloc(output_char_buffer, (local_buffer_length + 1));

You should instead store the return value into a temporary variable so you can free the original pointer in case of failure, thus avoiding a memory leak.

Also note that the cleanup in your code is incomplete: you should free both the input_char_buffer and the output_char_buffer in case it has been allocated during a previous iteration.

Here is a modified version:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

const char *word_generator(int selector) {
    switch (selector) {
    case 0:  return "gpu-log-02_05_2022_12_37_56_784";
    case 1:  return "glsl-debug.txt";
    case 2:  return "compiler.log";
    case 3:  return "shader.pub";
    case 4:  return "fluid-sim-variantA.cache";
    default: printf("ERROR: Request out of range!\n");
             return "";
    }
}

int main() {
    char *output_char_buffer = NULL;

    // Simulate incoming data.
    for (int i = 0; i < 5; i++) {
        const char *word = word_generator(i);
        printf("Test string[%d]: %s\n", i, word);

        size_t local_buffer_length = strlen(word);

        char *input_char_buffer = (char*)malloc(local_buffer_length + 1);
        if (input_char_buffer == NULL) {
            // Clean-up.
            free(output_char_buffer);
            output_char_buffer = NULL;

            printf("ERROR: Failed to allocate char buffer memory!\n");
            // Exit with an error state.
            return 1;
        }

        strcpy(input_char_buffer, word);
        size_t input_buffer_length = strlen(input_char_buffer);

        // Verbose debug.
        printf("\tCurrent input buffer (value: %s, length: %zu)\n",
               input_char_buffer, input_buffer_length);

        char key[] = "compiler.log";

        // Verbose debug.
        printf("\tCurrent key (value: %s, length: %zu)\n", key, strlen(key));

        if (strcmp(input_char_buffer, key) == 0) {
            printf("\t\t__MATCH__\n");

            char *temp = (char *)realloc(output_char_buffer, local_buffer_length + 1);
            if (temp == NULL) {
                // Clean-up.
                free(output_char_buffer);
                output_char_buffer = NULL;
                free(input_char_buffer);
                input_char_buffer = NULL;

                printf("ERROR: Failed to fetch char buffer memory!\n");

                // Exit with an error state.
                return 1;
            }
            output_char_buffer = temp;
            strcpy(output_char_buffer, input_char_buffer);
        }

        // Clean-up.
        free(input_char_buffer);
        input_char_buffer = NULL;
    }

    // Check the final value of the string container.
    printf("Result: %s\n", output_char_buffer);

    // Clean-up and finish.
    free(output_char_buffer);
    output_char_buffer = NULL;
    return 0;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • Yes. You are right. Your solution is working. I was also trying to find a fix and came up with the following solution shared in the gist below, which is similar to yours by the way: https://gist.github.com/mbilyanov/206e4cb4f590ef65ca216f9c6a438d87 In my solution, however, I was still not sure about some of the aspects as it is indicated in the comments. Freeing tmp pointers etc. – mbilyanov May 29 '22 at 20:34
  • 1
    @mbilyanov: you write: *Probably `output_char_buffer` and `tmp_output_char_buffer` are pointing to the same memory location*... Yes they do since you just assigned `output_char_buffer = tmp_output_char_buffer`. Same for the other question: no further clean up is needed. Note that yo can pass a null pointer to `free` without a problem. You might also look at `strdup()` that combines allocation and copying the contents of a C string. – chqrlie May 29 '22 at 20:42