1

I am trying to read from a PDF file and write its contents into another file, and this is where I run into a problem.

In the while loop, the first call to fread reads only 589 bytes, where I expected 1024. On the second iteration, fread reads 0 bytes.

I am sure that the PDF file is larger than 1024 bytes.

Here is a similar problem. The symptom is the same, but I do not use strlen(), which is what caused that problem.

So how can I resolve the problem?

My code is here:

#include <stdio.h>

#define MAXLINE 1024

int main() {
    FILE *fp;
    int read_len;
    char buf2[MAXLINE];
    FILE *fp2;
    fp2 = fopen("test.pdf", "w");
    if ((fp = fopen("LearningSpark.pdf", "r")) == NULL) {
        printf("Open file failed\n");
    }
    while ((read_len = fread(buf2, sizeof(char), MAXLINE, fp)) > 0) {
        int write_length = fwrite(buf2, sizeof(char), read_len, fp2);
        if (write_length < read_len) {
            printf("File write failed\n");
            break;
        }
    }
    return 0;
}
chqrlie
LeeDerson
  • 4
    You may try to open in binary mode `rb` instead of `r`. – Damien Mar 02 '22 at 08:13
  • binary mode shouldn't matter when all you're doing is fread/fwrite. The only things I see broken in this code: (1) both `read_len` and `write_length` should be `size_t`, not `int`. (2) the open-failure on `fp` should exit the program, not just march into oblivion with the upcoming loop on an invalid pointer, and (3) neither of the file pointers is closed before program termination. Fixing all three of those, I cannot reproduce your problem. – WhozCraig Mar 02 '22 at 08:39
  • 2
    @WhozCraig Note that cppreference [fread](https://en.cppreference.com/w/c/io/fread) mentions that binary mode must be used. Of course, they may implicitly consider that operations other than `fread/fwrite` can be used. Did you reproduce the problem before performing the three corrections? I agree with your corrections, but I don't understand why these errors could cause the problem. – Damien Mar 02 '22 at 09:03
  • It works when I open the file in rb mode. – LeeDerson Mar 02 '22 at 09:59

2 Answers

4

The behavior of fopen(filename, "r") is system dependent. See this post on what may happen to the data you read if you are on Windows, for example. Basically, in text mode certain characters are translated differently on different systems, i.e., "End-of-Line" is \n on Unix-type systems, but \r\n on Windows.

Important: On Windows, ASCII char 26 (0x1A, Ctrl-Z) is treated as End-Of-File when reading in text mode, "r", causing fread() to stop prematurely.

To read a binary file, use the "rb" specifier. Similarly for "w", as mentioned here, you should use "wb" to write binary data.
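A minimal sketch to check this on your own system, assuming the premature EOF comes from a 0x1A byte in the data. The helper names `write_sample` and `count_bytes` and the file name used with them are made up for illustration; they are not part of the original code.

```c
#include <stdio.h>

/* Hypothetical helper: write n bytes in binary mode, placing a single
   0x1A (Ctrl-Z) byte at offset ctrl_z_at. Returns 0 on success, -1 on failure. */
int write_sample(const char *path, size_t n, size_t ctrl_z_at)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        int byte = (i == ctrl_z_at) ? 0x1A : 'A';
        if (fputc(byte, fp) == EOF) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: open path with the given fopen() mode and return the
   total number of bytes fread() delivers before it returns 0, or -1 if the
   file cannot be opened. */
long count_bytes(const char *path, const char *mode)
{
    char buf[256];
    size_t n;
    long total = 0;
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)n;
    fclose(fp);
    return total;
}
```

After write_sample("sample.bin", 1024, 589), count_bytes("sample.bin", "rb") should report 1024 on any platform. With "r", I would expect a smaller count on Windows, where text mode stops at the Ctrl-Z byte, and 1024 on Unix-type systems, where the two modes behave the same.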

mmixLinus
2

Binary files such as PDF files must be opened in binary mode to prevent end-of-line translation and other text mode handling on legacy systems such as Windows.

Also note that you should abort when fopen() fails and you should close the files.

Here is a modified version:

#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAXLINE 1024

int main(void) {
    char buf2[MAXLINE];
    size_t read_len;    /* fread() returns size_t, not int */
    FILE *fp;
    FILE *fp2;
    /* open the source in binary mode so no bytes are translated */
    if ((fp = fopen("LearningSpark.pdf", "rb")) == NULL) {
        fprintf(stderr, "Open file failed for %s: %s\n", "LearningSpark.pdf", strerror(errno));
        return 1;
    }
    /* open the destination in binary mode as well */
    if ((fp2 = fopen("test.pdf", "wb")) == NULL) {
        fprintf(stderr, "Open file failed for %s: %s\n", "test.pdf", strerror(errno));
        fclose(fp);
        return 1;
    }

    /* copy one buffer at a time until fread() returns 0 */
    while ((read_len = fread(buf2, 1, MAXLINE, fp)) > 0) {
        size_t write_length = fwrite(buf2, 1, read_len, fp2);
        if (write_length < read_len) {
            fprintf(stderr, "File write failed: %s\n", strerror(errno));
            break;
        }
    }
    fclose(fp);
    fclose(fp2);
    return 0;
}
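One thing the loop above does not distinguish is whether fread() returned 0 because of end of file or because of a read error. A possible sketch of a reusable copy routine that reports the difference with ferror() follows; the name `copy_stream` and its return codes are hypothetical, not something from the question.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from in to out, both already opened
   in binary mode by the caller.
   Returns 0 on success, -1 on a read error, -2 on a write error. */
int copy_stream(FILE *in, FILE *out)
{
    char buf[1024];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* a short write means the output stream failed */
        if (fwrite(buf, 1, n, out) != n)
            return -2;
    }
    /* fread() returned 0: end of file unless the error flag is set */
    return ferror(in) ? -1 : 0;
}
```

The caller still owns both streams, so it should check the return value of fclose() on the output file too, since buffered data may only be written out at that point.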
chqrlie