0

I know there is a common strategy in C to handle the situation that the input data, for example, text, exceed the original allocated space. That is reallocating with more space.

#include <stdlib.h>
#include <stdio.h>

#define BUF_SIZE 1024

void check_buffer(char *buffer) 
{
    if (!buffer) {
        fprintf(stderr, "lsh: allocation error\n");
        exit(EXIT_FAILURE);
    }
}

char *read_line() 
{
    int bufsize = BUF_SIZE;
    int position = 0;
    char *buffer = malloc(sizeof(char) * bufsize);
    int c;

    check_buffer(buffer);

    while (1) {
        c = getchar();

        if (c == EOF || c == '\n') {
            buffer[position] = '\0';
            return buffer;
        } else {
            buffer[position] = c;
        }
        position++;

        if (position >= bufsize) {
            bufsize += BUF_SIZE; // Or `bufsize *= 2;`?
            buffer = realloc(buffer, bufsize);

            check_buffer(buffer);
        }
    }
}

So what is the better way to expand the original space? bufsize += BUF_SIZE or bufsize *= 2? Which way is more effective?

  • I do just `bufsize += 1`. This _depens_. On system, environment, on usage, on program, on developer. Many libraries handle that differently, ex. see [cs50](https://github.com/cs50/libcs50/blob/develop/src/cs50.c#L174). Ex. in C++ they just `new bufsize+1` -> copy all the data -> `delete old_storage`. – KamilCuk Dec 19 '18 at 09:20
  • 3
    Don't `realloc` with the pointer itself, use a temporary pointer, e.g. `void *tmp = realloc(buffer, bufsize * 2); if (!tmp) { perror ("realloc-buffer"); buffer[position] = '\0'; break; } buffer = tmp; bufsize *= 2;` and keep going. – David C. Rankin Dec 19 '18 at 09:29

2 Answers2

1

Which way is more effective?

This is a VERY ambiguous question. Effective with regard to what? Note that performance is not a good answer, since that is almost as ambiguous as "effective".

You cannot say that one method is better than the other. Both have pros and cons and it depends to a large degree on the actual circumstances and the value of BUF_SIZE.

Go with one or the other. If you experience performance issues, try the other method and try tweaking the value of BUF_SIZE. But before you do that, profile the code to see if the reallocation really is the issue.

A third alternative is somewhere in between. You could do this:

bufsize = bufsize*log(bufsize);
realloc(buffer, bufsize);

A fourth is to use exponential growth until a threshold and then switch:

if(bufsize > THRESHOLD)
    bufsize += LINEAR_INCREASE;
else
    bufsize *= 2;

One situation where pure exponential growth can cause problems is if you already have a very large buffer. Then the next reallocation may fail, or cause other problems.

Also, remember to check that re reallocation (and the initial allocation too) actually worked by checking if the return pointer is NULL. And never assign the buffer to the return value. If the reallocation fails, you lose the current allocation too. You can do like this:

void * ptr = realloc(buffer, bufsize);
if(ptr) buffer = ptr;
else { // Handle the fact that reallocation failed }

If you have very limited amount of memory, it could be a good idea to do something like this:

int increase = LINEAR_INCREASE;
void * ptr;
do {
    ptr = realloc(buffer, increase);
    increase /= 2; // If realloc failed, try with half the increase
} while(!ptr && increase > 0)
if(ptr) buffer = ptr;
else { /* Handle the fact that reallocation failed */ }
klutt
  • 30,332
  • 17
  • 55
  • 95
-1

What happens when you are reallocating 3rdth time and you have already allocated 2048 bytes memory at 2ndth time?

bufsize += BUF_SIZE; //this will be 2048 + 1024 =3072
bufsize *= 2; // this will be 2048*2 = 4096

Where as you clearly want to allocate 3072 bytes of memory.

kiran Biradar
  • 12,700
  • 3
  • 19
  • 44
  • Did I misunderstood the question? – kiran Biradar Dec 19 '18 at 09:24
  • Clearly he wants to multiply the old result and get `1024 * 2 * 2 = 4096` and that's how it's done, ex. [cs50](https://github.com/cs50/libcs50/blob/develop/src/cs50.c#L181). We can find many examples on github for such approach, see [google results](https://www.google.pl/search?q=%22capacity+*%3D+2%22+realloc+github). The question is `What is the better way to expand the original space? bufsize += BUF_SIZE or bufsize *= 2?`. – KamilCuk Dec 19 '18 at 09:24
  • @KamilCuk I always believe memory re-allocation should be additional not exponential. – kiran Biradar Dec 19 '18 at 09:27
  • @kiranBiradar What's your reason? – Shang-Chih Raphael Huang Dec 19 '18 at 09:29
  • Me too ; ) As I usually work in embedded systems with very little memory, allocating 16 or 32 bytes makes _much_ difference to me (if I have `malloc` at all). I do `bufsize += 1`. But that's an opinion. But on some systems `realloc` is so very expensive, that's it's better to allocate very big chunks once in a while. – KamilCuk Dec 19 '18 at 09:30
  • @Shang-ChihRaphaelHuang Consider you are short of memory fourth time and there is only one character to be read, with exponential you will be allocating 8192 bytes where as with additional you will be allocating 4096 bytes. That is again depends on your requirement. What do you want to care memory or performance. – kiran Biradar Dec 19 '18 at 09:35
  • @kiranBiradar So it's a trade-off. I want to know whether there are some rules to guide programmers to make decision. @KamilCuk said right, that depends on the implementation of `realloc`. – Shang-Chih Raphael Huang Dec 19 '18 at 09:46