I'm currently trying to read the full contents of a file on Windows, using C's fread function. This function requires the size of the buffer that is being read into to be passed as an argument. And because I want the whole file to be read, I need to pass in the size of the file in bytes.

I've tried getting the size of a file on Windows through the Win32 API, more specifically using GetFileSizeEx. The snippet below is from an existing Stack Overflow answer.

__int64 GetFileSize(const char* name)
{
    HANDLE hFile = CreateFile(name, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if(hFile == INVALID_HANDLE_VALUE)
        return -1; // error condition, could call GetLastError to find out more

    LARGE_INTEGER size;
    if(!GetFileSizeEx(hFile, &size))
    {
        CloseHandle(hFile);
        return -1; // error condition, could call GetLastError to find out more
    }

    CloseHandle(hFile);
    return size.QuadPart;
}

The returned size from this function is bigger than the actual file size. After executing the following code block

FILE* file = fopen(path, "r");
long size = (long)GetFileSize(path);
char* buffer = new char[size + 1];
fread(buffer, 1, size, file);
buffer[size] = '\0';

the buffer contains garbage bytes at the end of it. I've checked by hand, and the returned size is surely bigger than the actual size in bytes.

I've tried the other methods described in the same Stack Overflow answer linked above, but they all result in garbage bytes at the end of the buffer.

  • generic C you seek to the end and get the file position yes? – old_timer Jul 26 '20 at 14:16
  • @old_timer I've tried that also but that seems to be non standard, and it also returns a bigger size than the actual size. Even with "r" and "rb" reading modes. –  Jul 26 '20 at 14:18
  • Both GetFileSize as well as seeking to the end are supposed to give you an accurate file size. How did you determine your "actual file size"? Maybe that method is wrong. – Paul Ogilvie Jul 26 '20 at 14:21
  • You say "contains garbage bytes at the end". Is that garbage read or garbage from the buffer? Did you check that? Besides `new char[size]` is C++, not C. – Paul Ogilvie Jul 26 '20 at 14:24
  • What is your baseline for file size? – Bayleef Jul 26 '20 at 14:27
  • @PaulOgilvie Yes I'm using C++ but this function is from C's standard library, didn't think it was important to mention. As for the "actual file size" I just saw that the text that was read was the contents of the file + a few random characters at the end. As if they were appended. –  Jul 26 '20 at 14:29
  • I can see you are destroying your last byte (maybe that is intended, or maybe you need a buffer one byte greater). Also you should check how many bytes you actually read with the return value of fread, and check ferror and feof if it doesn't match the expected count. – Pablo Yaggi Jul 26 '20 at 14:30
  • @PabloYaggi yes this was an issue and it was fixed in the snippet. –  Jul 26 '20 at 14:32
  • Stefan, it is definitely _not_ a good idea to edit your code in the question with suggestions from the comments. It may make that your question can no longer be understood. – Paul Ogilvie Jul 26 '20 at 14:36
  • @PaulOgilvie Oh I apologize. It was a simple error though and definitely not intentional as my code actually contains the +1 in the buffer creation step. –  Jul 26 '20 at 14:40
  • You can find a file size in MS VC without opening it. See [Filename Search Functions](https://learn.microsoft.com/en-us/cpp/c-runtime-library/filename-search-functions?view=vs-2019). – Weather Vane Jul 26 '20 at 14:42

3 Answers

FILE* file = fopen(path, "r"); should be FILE* file = fopen(path, "rb"); If you want the number of bytes read to match the file size, open the file in binary mode.

On Windows reading a file in text mode causes "\r\n" sequences to be converted to "\n", resulting in the appearance of fewer bytes being read than expected.
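This also accounts for the "garbage" in the question: the reported size N counts every raw byte, a text-mode fread stores one byte fewer for each "\r\n" pair, and the terminator is still written at buffer[N], so the bytes in between are never initialized. As a rough illustration of that arithmetic (a sketch that is not part of the original answer; the helper names count_crlf_pairs and expected_text_mode_length are made up here, and it assumes the only text-mode effect is the "\r\n" to "\n" translation):

```c
#include <stddef.h>

/* Count "\r\n" pairs in a buffer holding the raw (binary-mode) bytes of a file.
   Assumption: on Windows, a text-mode read delivers one byte fewer per pair. */
size_t count_crlf_pairs(const char *data, size_t len)
{
    size_t pairs = 0;
    for (size_t i = 0; i + 1 < len; ++i) {
        if (data[i] == '\r' && data[i + 1] == '\n') {
            ++pairs;
            ++i; /* skip the '\n' that belongs to this pair */
        }
    }
    return pairs;
}

/* Number of bytes a text-mode fread would be expected to store for these raw
   bytes, under the assumption stated above. */
size_t expected_text_mode_length(const char *data, size_t len)
{
    return len - count_crlf_pairs(data, len);
}
```

For the 15 raw bytes of "one\r\ntwo\r\nthree", count_crlf_pairs returns 2 and expected_text_mode_length returns 13, so a buffer terminated at index 15 would hold two uninitialized bytes before the terminator.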

john
  • For some reason using binary mode wasn't working before but now it is. I'm hoping it was an issue on my end and not with the function. Thanks! –  Jul 26 '20 at 14:32

A common, portable way to get the file size using only C standard library functions is to use fseek() and ftell():

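#include <stdio.h>
long get_file_len(const char *filename)
{
    long size = 0;
    FILE *fp = fopen(filename, "rb");    // binary mode: no newline translation
    if (!fp)
        return 0;                        // could not open the file
    if (fseek(fp, 0, SEEK_END) == 0)     // move file pointer to end of file
        size = ftell(fp);                // position at the end is the size in bytes
    fclose(fp);
    return size;
}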

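As a variant, you can also use lseek(). Note that it is a POSIX function, not part of standard C, and that it operates on a file descriptor rather than a FILE*:

#include <stdio.h>
#include <unistd.h>   // POSIX lseek(); with MSVC use _lseek() from <io.h>
long get_file_len(const char *filename)
{
    long size = 0;
    FILE *fp = fopen(filename, "rb");
    if (!fp)
        return 0;
    // lseek() takes a file descriptor, so convert the FILE* with fileno()
    size = (long)lseek(fileno(fp), 0, SEEK_END);    // offset of the end of the file
    fclose(fp);
    return size;
}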
Frankie_C

You should open the file in binary mode and use fseek and ftell to get the file size; that is the portable way. Doing so also avoids the Windows text-mode conversions.

FILE* file = fopen(path, "rb");
fseek(file, 0, SEEK_END); // move to 0 bytes from the end
long size = ftell(file); // get the size (position at the end)
rewind(file); // like fseek(file, 0, SEEK_SET): move the position back to the beginning

char* buffer = new char[size + 1];
size_t bytes_read = fread(buffer, 1, size, file); // fread returns the number of items read
buffer[bytes_read] = 0; // terminate after the bytes actually read
if (bytes_read != (size_t)size)
{
    // short read: check ferror(file) and feof(file)
}
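For completeness, the fragment above could be wrapped into a self-contained function with error checks. This is only a sketch and not part of the original answer: the name read_whole_file and its out_len parameter are hypothetical, and it uses malloc/free instead of new so that it compiles as plain C.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file into a NUL-terminated heap buffer.
   Returns NULL on failure. On success it stores the number of bytes read in
   *out_len, and the caller must free() the returned buffer. */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *file = fopen(path, "rb"); /* binary mode: no "\r\n" translation */
    if (!file)
        return NULL;

    long size = -1;
    if (fseek(file, 0, SEEK_END) == 0)
        size = ftell(file);
    if (size < 0 || fseek(file, 0, SEEK_SET) != 0) {
        fclose(file);
        return NULL;
    }

    char *buffer = malloc((size_t)size + 1);
    if (!buffer) {
        fclose(file);
        return NULL;
    }

    /* Terminate at the number of bytes actually read, not at the reported size. */
    size_t bytes_read = fread(buffer, 1, (size_t)size, file);
    int failed = ferror(file);
    fclose(file);
    if (failed) {
        free(buffer);
        return NULL;
    }
    buffer[bytes_read] = '\0';
    *out_len = bytes_read;
    return buffer;
}
```

A caller would write something like `size_t len; char *text = read_whole_file(path, &len);`, check the result for NULL, and call free(text) when done. Because the size passes through a long, this sketch is only suitable for files whose size fits in that type.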
Pablo Yaggi