0

I am experiencing some odd behavior when doing some sockets programming that is confusing me. I get a successful response from the server, and eventually the program fully executes. However it takes approximately a minute for one portion of the logic to execute. This is what it looks like:

int j = 2;
char buffer[BUFFER_SIZE];
char *response;

response = (char*)malloc(BUFFER_SIZE);

while(read(fd, buffer, BUFFER_SIZE - 1) != 0) {
    if(j == 2)
        response = strcpy(response, buffer);
    else
        response = strcat(response, buffer);

    response = (char*)realloc(response, BUFFER_SIZE * j);
    j++;
    bzero(buffer, BUFFER_SIZE);
}

fprintf(stderr, "%s", response);

Why would it take so long for this to execute? Thanks for the look!

  • 2
    Does the result of `read` include a trailing `\0`? Why `BUFFER_SIZE - 1`? – David Ranieri Apr 06 '20 at 18:25
  • 2
    The intermediate buffer here seems... pointless. You can just as easily read directly into `response`, provided the target offset is appropriately used, utilizing an accumulated counter and retaining the results of the `read` call rather than just disposing of it after testing against zero. How much data are we talking about anyway? – WhozCraig Apr 06 '20 at 18:29
  • @David Ranieri That is a good question. I know that the response will end in \r\n but I'm not sure if the raw bytes contain a null terminator. It is worth note that I do the same call to read in an alternate http call that has a deterministic response size (so no need for dynamic allocation) and the read op is very quick (and response comes from same server). Hence I thought something may be off in how I was handling the dynamic allocation. – VirusModulePointer Apr 06 '20 at 18:29
  • 3
    HTTP is built on TCP which means you must completely and correctly handle the results returned by system calls,like read(). If the read is successful, you MUST NOT ignore the result since its the only way to know how many bytes have been loaded into the buffer. @WhozCraig has good advice, use tbe result and don't misuse library calls that require NUL-TERMINATED char arrays. – Martin James Apr 06 '20 at 18:39
  • @WhozCraig that is my intention in the end. I thought it prudent to figure out why I was seeing this behavior at this stage prior to going forward. – VirusModulePointer Apr 06 '20 at 18:39
  • 1
    If on Linux, `strace -tt` can help you find out what is taking so long. – Nate Eldredge Apr 06 '20 at 19:00

2 Answers2

2

I seriously doubt the strcpy/strcat logic is what is tossing your performance, but it isn't needed, nor warranted. The follow continuously appends data directly to the response buffer, which is ever increasing with each iteration. Ideally this would grow geometric (e.g. the buffer space provided for the next read is compounded by the size already-read), but for this example a frame-by-frame solution will be sufficient.

Above all, this utilizes the return value of read for what it is: a trio of error-announcement, end-of-data, or actual count of consumed octets. That is critical to remember when using read. Anything greater than zero means we read something and anything less or equal to zero means no more data (0), or an error state (< 0) on a synchronous socket.

Anyway:

char *response = malloc(BUFFER_SIZE);
if (response)
{
    ssize_t n = 0, size = 0;

    while ((n = read(fd, response + size, BUFFER_SIZE-1)) > 0)
    {
        size += n; // accumulate data just-read

        // make space for the next frame
        void *tmp = realloc(response, size + BUFFER_SIZE);
        if (tmp == NULL)
        {
            // failed to realloc. break now.
            perror("Failed to realloc response buffer");
            break;
        }

        // otherwise, retain new pointer and march on.
        response = tmp;
    }

    // terminate the buffer. there will always be room
    response[size] = 0;

    // do with the response whatever you will.
    fprintf(stderr, "%s", response);
    free(response);
}

Something like that, anyway.

If that doesn't work, then the problem is in the IP stack on the client, the server, or the conduit in between.

WhozCraig
  • 65,258
  • 11
  • 75
  • 141
  • I ended up using something similar in the end. Unfortunately it is still slow and hangs up on the last `read` call but at least it is returning the valid response in its entirety. Thanks WhozCraig – VirusModulePointer Apr 07 '20 at 16:02
0

First, you should make sure that the first strcpy() has a proper string to work with. This can be accomplished by either initializing the buffer to zero first (like you did with the bzero() for subsequent iterations) or let

buffer[return_val_of_read] = '\0';

As it stands, you have possible UB.

If that doesn't fix the problem, I would check that BUFFER_SIZE is appropriately large; if it's something strange like 3 bytes, you might just have too many reallocs()

Side notes:
1) The comments make great points. I would eliminate the intermediate buffer and dump the dependency on C string requirements by allowing read() to read BUFFER_SIZE bytes and checking its return value to know when to stop. This would also let you the rid of the bzero() call.
2) At the moment, if the server sends less than BUFFER_SIZE bytes, you have an unnecessary realloc()
3) Don't cast the result of malloc()

Hello World
  • 305
  • 1
  • 9