-2

In the code below, I am trying to read from a socket and store the results in a file.

What actually happens, is that my client sends a GET request to my server for a file.html. My server finds the file and writes the contents of it to the socket. Lastly my client reads the content from thread_fd and recreates the file.

For some reason the recreated file has less content than the original. I have located the problem to be some lines in the end, that are missing. When I use printf("%s", buffer) inside the while loop everything seems fine in STDOUT but my fprintf misses somewhat 3.000 bytes for a file of 81.000 bytes size.

#define MAXSIZE 1000

int bytes_read, thread_fd;
char buffer[MAXSIZE];
FILE* new_file;

memset(buffer, 0, MAXSIZE);
if((new_file = fopen(path, "wb+")) == NULL)
{
    printf("can not open file \n");
    exit(EXIT_FAILURE);
}
while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
{

    fprintf(new_file, "%s", buffer);
    if(bytes_read < MAXSIZE)
        break;
    memset(buffer, 0, MAXSIZE);
}
Darren Col
  • 119
  • 2
  • 9
  • 5
    You should treat what you get from the socket as binary data, not as strings. – Paul Ogilvie May 29 '18 at 12:18
  • ...and don't forget to close the file after the loop. – Paul Ogilvie May 29 '18 at 12:20
  • ...and also only write the number of bytes received – CannedMoose May 29 '18 at 12:26
  • you don't add the `\0` to the end of your string after reading it in – Chris Turner May 29 '18 at 12:27
  • When I am writing to a socket though, it's not necessary to treat my data as binary right? – Darren Col May 29 '18 at 12:27
  • @DarrenCol Paul's point is that `fprintf` will stop writing as soon as it finds a `'\0'` byte, which could be in the middle of binary data. – aschepler May 29 '18 at 12:29
  • Which operating system? Sockets are not part of the C11 standard (but of POSIX)? – Basile Starynkevitch May 29 '18 at 12:50
  • 1
    Use `printf("%.*s", bytes_read, buffer);` to print only the bytes that were read, but note that it will stop printing at a null byte if there is one before the length. Better to use `fwrite()` or even use file descriptor I/O for the output file. The `memset()` calls aren't necessary if you only write what you read. They give you somewhat spurious protection; if the read call returns `MAXSIZE` bytes, the data is still not null-terminated so you would be reading out of bounds of the buffer array and possibly writing spurious data to the file. – Jonathan Leffler May 29 '18 at 13:12

4 Answers4

1

You read binary data from the socket that may or may not contain a \0 byte. When you then fprintf that data the fprintf will stop at the first \0 it encounters. In your case that is 3000 bytes short of the full file. If your file contains no \0 byte the fprintf will simply continue printing the ram contents until it segfaults.

Use write() to write the data back to the file and check for errors. Don't forget to close() the file and check that for errors too.

Goswin von Brederlow
  • 11,875
  • 2
  • 24
  • 42
0

Your code should/could look like:

int readfile(int thread_fd, char *path)
{
    unsigned int bytes_read;
    char buffer[MAXSIZE];
    int new_file;

    if ((new_file = open(path, _O_CREAT|_O_BINARY,_S_IWRITE)) == -1) return -1;

    while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
    {
        if (write(new_file, buffer, bytes_read)!= bytes_read) {
            close(new_file);
            return -2;
        }
    }
    close(new_file);
    return 0;
}
Paul Ogilvie
  • 25,048
  • 4
  • 23
  • 41
  • `_open`, `_read`, `_write`, `_close` are not part of the POSIX standard or of the BSD socket one – Basile Starynkevitch May 29 '18 at 12:51
  • @BasileStarynkevitch, I'll replace it with those without underscore, however, the enviroment was not mentioned. See also https://stackoverflow.com/questions/44792526/differenence-among-open-open-and-fopen-with-regard-to-msvc-compiler – Paul Ogilvie May 29 '18 at 13:03
0

There are a few issues with your code that can cause this.

The most likely cause is this :

if(bytes_read < MAXSIZE)
    break;

This ends the loop when read returns less than the requested amount of bytes. This is however perfectly normal behavior, and can happen eg. when not enough bytes are available at the time of the read call (it's reading from a network socket after all). Just let the loop continue as long as read returns a value > 0 (assuming the socket is a blocking socket - if not, you'll also have to check for EAGAIN and EWOULDBLOCK).

Additionally, if the file you're receiving contains binary data, then it's not a good idea to use fprintf with "%s" to write to the target file. This will stop writing as soon as it finds a '\0' byte (which is not uncommon in binary data). Use fwrite instead.

Even if you're receiving text (suggested by the html file extension), it's still not a good idea to use fprintf with "%s", since the received data won't be '\0' terminated.

Sander De Dycker
  • 16,053
  • 1
  • 35
  • 40
  • The `if` and `break` are just redundant. Hardly worth mentioning. – Paul Ogilvie May 29 '18 at 13:06
  • @PaulOgilvie : they're not redundant : the `while` checks for `> 0` (which is correct), while the `if` checks for `< MAXSIZE` (which is incorrect - as explained, `read` is allowed to return a value `< MAXSIZE`) – Sander De Dycker May 29 '18 at 13:09
  • @PaulOgilvie : a blocking read will always return something, sure (which is why the `> 0` check is correct), but it won't necessarily return a full buffer (because there might not yet be a full buffer worth of data available). Remember that the OP is reading from a socket. – Sander De Dycker May 29 '18 at 13:18
0

This worked!

ps: I don't know if I should be doing this, since I am new here, but really there is no reason for negativity. Any question is a good question. Just answer it if you know it. Do not judge it.

#define MAXSIZE 1000

int bytes_read, thread_fd, new_file;
char buffer[MAXSIZE];

memset(buffer, 0, MAXSIZE);

if((new_file = open(path, O_RDONLY | O_WRONLY | O_CREAT)) < 0)
{
    printf("can not open file \n");
    exit(EXIT_FAILURE);
}
while ((bytes_read = read(thread_fd, buffer, MAXSIZE)) > 0)
    write(new_file, buffer, bytes_read);
close(new_file);
Darren Col
  • 119
  • 2
  • 9