169

How can I figure out the size of a file, in bytes?

#include <stdio.h>

unsigned int fsize(char* file){
  //what goes here?
}
hippietrail
andrewrk

15 Answers

176

On Unix-like systems, you can use POSIX system calls: stat on a path, or fstat on an already-open file descriptor (POSIX man page, Linux man page).
(Get a file descriptor from open(2), or fileno(FILE*) on a stdio stream).
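For the already-open case, a minimal sketch might look like the following. It assumes a POSIX system; `fsize_fd` and `fsize_stream` are hypothetical helper names, not part of any library.

```c
#include <stdio.h>      /* FILE, fileno */
#include <sys/stat.h>   /* fstat, struct stat */
#include <sys/types.h>  /* off_t */

/* Hypothetical helper: size of the file behind an open descriptor, or -1 on error. */
off_t fsize_fd(int fd) {
    struct stat st;

    if (fstat(fd, &st) == 0)
        return st.st_size;

    return -1;
}

/* Hypothetical helper: the same for a stdio stream.
   Data still sitting in the stream's write buffer is not counted,
   so call fflush() first if the stream has pending output. */
off_t fsize_stream(FILE *fp) {
    return fsize_fd(fileno(fp));
}
```

Using the descriptor avoids a second path lookup, so the size you get belongs to the file you actually opened even if the path has since been renamed or replaced.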

Based on NilObject's code:

#include <sys/stat.h>
#include <sys/types.h>

off_t fsize(const char *filename) {
    struct stat st; 

    if (stat(filename, &st) == 0)
        return st.st_size;

    return -1; 
}

Changes:

  • Made the filename argument a const char *.
  • Corrected the struct stat definition, which was missing the variable name.
  • Returns -1 on error instead of 0, which would be ambiguous for an empty file. off_t is a signed type so this is possible.
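To illustrate why the -1 return matters, here is a small sketch of a caller that tells an empty file apart from a failure. `report_size` is a hypothetical name introduced only for this example, and `fsize()` is repeated so the block compiles on its own.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Same fsize() as above, repeated so this sketch is self-contained. */
off_t fsize(const char *filename) {
    struct stat st;

    if (stat(filename, &st) == 0)
        return st.st_size;

    return -1;
}

/* Hypothetical caller: returns 0 if the size could be printed, 1 otherwise. */
int report_size(const char *filename) {
    off_t size = fsize(filename);

    if (size == -1) {
        fprintf(stderr, "%s: cannot determine size\n", filename);
        return 1;
    }

    /* off_t has no printf conversion of its own, so cast to a known type. */
    printf("%s: %lld bytes\n", filename, (long long)size);
    return 0;
}
```

An empty file yields 0 and is reported normally; only a failed stat() takes the error path.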

If you want fsize() to print a message on error, you can use this:

#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

off_t fsize(const char *filename) {
    struct stat st;

    if (stat(filename, &st) == 0)
        return st.st_size;

    fprintf(stderr, "Cannot determine size of %s: %s\n",
            filename, strerror(errno));

    return -1;
}

On 32-bit systems you should compile this with the option -D_FILE_OFFSET_BITS=64, otherwise off_t will only hold values up to 2 GB. See the "Using LFS" section of Large File Support in Linux for details.
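If you would rather set the macro in the source than on the command line, one possible sketch is below. The compile-time check and the `off_t_bits` helper are additions for illustration, not something the LFS document prescribes; the check needs a C11 compiler.

```c
/* Must appear before any system header for it to take effect. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>     /* static_assert (C11) */
#include <limits.h>     /* CHAR_BIT */
#include <sys/types.h>  /* off_t */

/* Fail the build, rather than misreport sizes at run time, if off_t is still narrow. */
static_assert(sizeof(off_t) >= 8,
              "off_t is narrower than 64 bits; large files will not work");

/* Hypothetical helper for logging: width of off_t in bits on this build. */
int off_t_bits(void) {
    return (int)(sizeof(off_t) * CHAR_BIT);
}
```

On a 64-bit Linux build the macro changes nothing, since off_t is already 64 bits wide there.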

Peter Cordes
T Percival
  • 22
    This is Linux/Unix specific--probably worth pointing that out since the question didn't specify an OS. – Drew Hall Aug 02 '10 at 21:54
  • 1
    You could probably change the return type to ssize_t and cast the size from an off_t without any trouble. It would seem to make more sense to use a ssize_t :-) (Not to be confused with size_t which is unsigned and cannot be used to indicate error.) – T Percival Aug 06 '10 at 17:03
  • 1
    For more portable code, use `fseek` + `ftell` as proposed by Derek. – Ciro Santilli OurBigBook.com Mar 02 '15 at 07:57
  • 14
    *For more portable code, use `fseek` + `ftell` as proposed by Derek.* No. The [C Standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) specifically states that `fseek()` to `SEEK_END` on a binary file is undefined behavior. **7.19.9.2 The `fseek` function** *... A binary stream need not meaningfully support `fseek` calls with a whence value of `SEEK_END`*, and as noted below, which is from footnote 234 on p. 267 of the linked C Standard, and which specifically labels `fseek` to `SEEK_END` in a binary stream as undefined behavior. . – Andrew Henle Apr 06 '16 at 10:54
  • 1
    From [gnu libc manual](https://www.gnu.org/software/libc/manual/html_node/Binary-Streams.html): ... [non-POSIX] systems make a distinction between files containing text and files containing binary data, and the input and output facilities of ISO C provide for this distinction. ... In the GNU C Library, and on all POSIX systems, there is no difference between text streams and binary streams. When you open a stream, you get the same kind of stream regardless of whether you ask for binary. This stream can handle any file content, and has none of the restrictions that text streams sometimes have. – Small Boy Aug 14 '20 at 08:02
82

Don't use int. Files over 2 gigabytes in size are common as dirt these days

Don't use unsigned int. Files over 4 gigabytes in size are common as some slightly-less-common dirt

POSIX defines off_t as a signed integer type, and on most modern systems (or with -D_FILE_OFFSET_BITS=64 on 32-bit ones) it is 64 bits wide, which is what everyone should be using. We can redefine that to be 128 bits in a few years when we start having 8 exabyte files hanging around.

If you're on Windows, you should use GetFileSizeEx - it also uses a signed 64 bit integer, so they'll start hitting problems with 8 exabyte files at about the same time. :-)

Orion Edwards
  • 3
    I've used compilers where off_t is 32 bits. Granted, this is on embedded systems where 4GB files are less common. Anyways, POSIX also defines off64_t and corresponding methods to add to the confusion. – Aaron Campbell Jul 07 '16 at 20:12
  • 1
    I always love answers that assume Windows and do nothing else but criticize the question. Could you please add something that's POSIX-compliant? – S.S. Anne Apr 27 '19 at 18:52
  • 2
    @JL2210 the accepted answer from Ted Percival shows a posix compliant solution, so I see no sense in repeating the obvious. I (and 70 others) thought that adding the note about windows and not to use signed 32 bit integers to represent file sizes was a value-add on top of that. Cheers – Orion Edwards Apr 28 '19 at 04:32
34

Matt's solution should work, except that it's C++ instead of C, and the initial tell shouldn't be necessary.

unsigned long fsize(char* file)
{
    FILE * f = fopen(file, "r");
    fseek(f, 0, SEEK_END);
    unsigned long len = (unsigned long)ftell(f);
    fclose(f);
    return len;
}

Fixed your brace for you, too. ;)

Update: This isn't really the best solution. It's limited to 4GB files on Windows and it's likely slower than just using a platform-specific call like GetFileSizeEx or stat64.
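If you still want the seek-and-tell approach, a hedged variant that avoids the `long` limit is sketched below. It assumes a POSIX system, where `fseeko` and `ftello` work with `off_t`; ISO C itself does not promise that seeking to SEEK_END works on a binary stream, so this relies on the platform defining that behaviour. `fsize_seek` is a hypothetical name.

```c
#include <stdio.h>      /* fopen, fclose; fseeko and ftello are POSIX, not ISO C */
#include <sys/types.h>  /* off_t */

/* Hypothetical variant of fsize(): returns the size, or -1 on any error. */
off_t fsize_seek(const char *file) {
    FILE *f = fopen(file, "rb");   /* binary mode, so offsets are byte counts */
    off_t len = -1;

    if (f == NULL)
        return -1;

    if (fseeko(f, 0, SEEK_END) == 0)
        len = ftello(f);           /* stays -1 on failure; no unsigned cast */

    fclose(f);
    return len;
}
```

Returning a signed `off_t` keeps the -1 error value visible to the caller instead of hiding it behind an unsigned cast.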

Derek Park
  • Yes, you should. However, unless there's a really compelling reason not to write platform-specific code, you should probably just use a platform-specific call rather than the open/seek-end/tell/close pattern. – Derek Park Apr 18 '12 at 04:10
  • 1
    Sorry about the late reply, but I am having a major issue here. It causes the app to hang when accessing restricted files (like password protected or system files). Is there a way to ask the user for a password when needed? – Justin Mar 29 '13 at 03:34
  • @Justin, you should probably open a new question specifically about the issue you're running into, and provide details about the platform you're on, how you're accessing the files, and what the behavior is. – Derek Park Apr 02 '13 at 15:04
  • 5
    Both C99 and C11 return `long int` from `ftell()`. `(unsigned long)` casting does not improve the range as already limited by the function. `ftell()` return -1 on error and that get obfuscated with the cast. Suggest `fsize()` return the same type as `ftell()`. – chux - Reinstate Monica Jan 12 '14 at 22:03
  • I agree. The cast was to match the original prototype in the question. I can't recall why I turned it into unsigned long instead of unsigned int, though. – Derek Park Jan 27 '14 at 22:09
  • 1
    Obviously you wouldn't want to use `int`, that would fail to handle large files even on a 64-bit system where `long` was a 64-bit type. (e.g. most non-Windows 64-bit systems use [an LP64 ABI](https://unix.org/version2/whatsnew/lp64_wp.html)). But really you should use `ftello` which returns an `off_t`, which is 64-bit on every system with large file support. – Peter Cordes Nov 25 '21 at 02:59
16

Don't do this (why?):

Quoting the C99 standard document that I found online: "Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state."

Change the return type to int so that an error value (-1) can be returned, and then use fseek() and ftell() to determine the file size.

int fsize(char* file) {
  int size;
  FILE* fh;

  fh = fopen(file, "rb"); //binary mode
  if(fh != NULL){
    if( fseek(fh, 0, SEEK_END) ){
      fclose(fh);
      return -1;
    }

    size = ftell(fh);
    fclose(fh);
    return size;
  }

  return -1; //error
}
EsmaeelE
andrewrk
  • 6
    @mezhaka: That CERT report is simply wrong. `fseeko` and `ftello` (or `fseek` and `ftell` if you're stuck without the former and happy with limits on the file sizes you can work with) are the correct way to determine the length of a file. `stat`-based solutions **do not work** on many "files" (such as block devices) and are not portable to non-POSIX-ish systems. – R.. GitHub STOP HELPING ICE Oct 24 '10 at 04:30
  • 2
    This is the only way to get the file size on many non-posix compliant systems (such as my very minimalistic mbed) – Earlz Mar 02 '12 at 23:36
  • 1
    You absolutely do not want to use `int` here. `ftell` returns a signed `long`, which is a 64-bit type on many (but not all) 64-bit systems. It's still only 32-bit on most 32-bit systems, so you need `ftello` with `off_t` to be able to handle large files portably. Despite ISO C choosing not to define the behaviour, most implementations do, so this does work in practice on most systems. – Peter Cordes Nov 25 '21 at 02:53
12

POSIX

The POSIX standard has its own method to get file size.
Include the sys/stat.h header to use the function.

Synopsis

  • Get file statistics using stat(2).
  • Read the st_size member of the resulting struct stat.

Examples

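Note: On systems where off_t is 32 bits, the first version is limited to files smaller than 2 GB. For larger files, use the 64-bit version (the second example) or compile with -D_FILE_OFFSET_BITS=64.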

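#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char** argv)
{
    struct stat info;

    // Bail out if no path was given or the file cannot be examined
    if (argc < 2 || stat(argv[1], &info) != 0) {
        fprintf(stderr, "no file given, or stat failed\n");
        return 1;
    }

    // The 'st_' prefix is short for 'stat'
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
    return 0;
}

// Separate program: the 64-bit version.
// On glibc, stat64 is only declared when _LARGEFILE64_SOURCE is defined.
#define _LARGEFILE64_SOURCE
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char** argv)
{
    struct stat64 info;

    // Bail out if no path was given or the file cannot be examined
    if (argc < 2 || stat64(argv[1], &info) != 0) {
        fprintf(stderr, "no file given, or stat64 failed\n");
        return 1;
    }

    // The 'st_' prefix is short for 'stat'
    printf("%s: size=%lld\n", argv[1], (long long)info.st_size);
    return 0;
}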

ANSI C (standard)

ANSI C doesn't directly provide a way to determine the length of a file.
We'll have to be a little creative. For now, we'll use the seek approach!

Synopsis

  • Seek the file to the end using fseek(3).
  • Get the current position using ftell(3).

Example

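#include <stdio.h>

int main(int argc, char** argv)
{
    FILE* fp;
    long f_size;

    if (argc < 2) {
        fprintf(stderr, "no file given\n");
        return 1;
    }

    // fopen takes a path and a mode; "rb" opens in binary mode
    fp = fopen(argv[1], "rb");
    if (fp == NULL) {
        perror(argv[1]);
        return 1;
    }

    fseek(fp, 0, SEEK_END);
    f_size = ftell(fp); // ftell returns a long, or -1L on error
    rewind(fp); // go back to the start again

    printf("%s: size=%ld\n", argv[1], f_size);
    fclose(fp);
    return 0;
}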

If the file is stdin or a pipe, neither the POSIX nor the ANSI C approach will work:
a pipe has no meaningful size, so stat typically reports 0 and the seek fails.
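One way to guard against this on POSIX systems is to check the file type before trusting st_size. The following is only a sketch; `regular_file_size` is a hypothetical helper name.

```c
#include <sys/stat.h>   /* fstat, struct stat, S_ISREG */
#include <sys/types.h>  /* off_t */

/* Hypothetical helper: size of a regular file, or -1 if fd refers to a pipe,
   terminal, socket and so on, or if fstat fails. */
off_t regular_file_size(int fd) {
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;

    if (!S_ISREG(st.st_mode))
        return -1;   /* st_size is not meaningful for this kind of file */

    return st.st_size;
}
```

Calling it with file descriptor 0 tells you whether stdin was redirected from a regular file or is something else, such as a pipe or a terminal.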

Opinion: You should use the POSIX approach instead, because it supports 64-bit file sizes.

  • 2
    `struct _stat64` and `__stat64()` for _Windows. – Bob Stein Apr 17 '19 at 13:23
  • 1
    The last example is incorrect, `fopen` takes two arguments – M.M Feb 17 '21 at 03:58
  • 2
    In ISO C, the function [`ftell`](https://en.cppreference.com/w/c/io/ftell) is only guaranteed to give you the number of bytes from the beginning of the file when the file is open in binary mode. However, in text mode, the value returned by `ftell` is unspecified and is only meaningful to `fseek`. – Andreas Wenzel Apr 23 '22 at 03:41
4

If you're fine with using POSIX functions (stat is not part of the standard C library):

#include <sys/stat.h>
off_t fsize(char *file) {
    struct stat filestat;
    if (stat(file, &filestat) == 0) {
        return filestat.st_size;
    }
    return 0;
}
pmttavara
Ecton
4

And if you're building a Windows app, use the GetFileSizeEx API as CRT file I/O is messy, especially for determining file length, due to peculiarities in file representations on different systems ;)
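As a rough sketch of how that could sit next to the POSIX call, the wrapper below picks an implementation at compile time. `file_size_bytes` is a hypothetical name; the Windows branch assumes an ANSI path and opens the file with `CreateFileA` only to obtain the handle that `GetFileSizeEx` requires.

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical wrapper: returns the size in bytes, or -1 on error. */
long long file_size_bytes(const char *path) {
    LARGE_INTEGER size;
    HANDLE h = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

    if (h == INVALID_HANDLE_VALUE)
        return -1;

    if (!GetFileSizeEx(h, &size)) {
        CloseHandle(h);
        return -1;
    }

    CloseHandle(h);
    return size.QuadPart;   /* signed 64-bit value */
}
#else
#include <sys/stat.h>

/* POSIX branch: same contract, implemented with stat(). */
long long file_size_bytes(const char *path) {
    struct stat st;

    return stat(path, &st) == 0 ? (long long)st.st_size : -1;
}
#endif
```

Both branches return a signed 64-bit value, so callers can treat -1 as an error on either platform.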

3

I used this set of code to find the file length.

//opens a stdio stream on the file
FILE * i_file;
i_file = fopen(source, "r");

//gets the underlying file descriptor (an int) from the stream for fstat
int f_d = fileno(i_file);
struct stat buffer;
fstat(f_d, &buffer);

//stores file size (st_size is an off_t, which may be wider than long)
off_t file_length = buffer.st_size;
fclose(i_file);
rco16
  • This solution is using platform-specific functions. It will likely not work on non-POSIX platforms. If you provide a platform-specific answer to a platform-agnostic question, then I suggest that you clearly mark it as such. – Andreas Wenzel Jul 05 '23 at 00:36
3

I found a method using fseek and ftell, and a thread on this same question whose answers say it can't be done any other way in plain C.

You could use a portability library like NSPR (the library that powers Firefox).

double-beep
Nickolay
1

In plain ISO C, there is only one way to determine the size of a file which is guaranteed to work: To read the entire file from the start, until you encounter end-of-file.
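A minimal sketch of that read-everything approach is shown below. `count_bytes` is a hypothetical helper name; the stream is assumed to be opened in binary mode ("rb") and positioned at the start of the file.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes from the current position to end-of-file.
   Returns 0 and stores the count in *out on success, or -1 on a read error. */
int count_bytes(FILE *fp, unsigned long long *out) {
    unsigned char buf[4096];
    unsigned long long total = 0;
    size_t n;

    /* fread returns 0 at end-of-file or on error; ferror tells the two apart. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += n;

    if (ferror(fp))
        return -1;

    *out = total;
    return 0;
}
```

This uses only `fread` and `ferror`, so it avoids SEEK_END and `ftell` altogether, at the cost of reading every byte.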

However, this is highly inefficient. If you want a more efficient solution, then you will have to either

  • rely on platform-specific behavior, or
  • resort to platform-specific functions, such as stat on Linux or GetFileSize on Microsoft Windows.

In contrast to what other answers have suggested, the following code is not guaranteed to work:

fseek( fp, 0, SEEK_END );
long size = ftell( fp );

Even if we assume that the data type long is large enough to represent the file size (which is questionable on some platforms, most notably Microsoft Windows), the posted code has the following problems:

The posted code is not guaranteed to work on text streams, because according to §7.21.9.4 ¶2 of the ISO C11 standard, the value of the file position indicator returned by ftell contains unspecified information. Only for binary streams is this value guaranteed to be the number of characters from the beginning of the file. There is no such guarantee for text streams.

The posted code is also not guaranteed to work on binary streams, because according to §7.21.9.2 ¶3 of the ISO C11 standard, binary streams are not required to meaningfully support SEEK_END.

That being said, on most common platforms, the posted code will work, if we assume that the data type long is large enough to represent the size of the file.

However, on Microsoft Windows, the characters \r\n (carriage return followed by line feed) will be translated to \n for text streams (but not for binary streams), so that the file size you get will count \r\n as two bytes, although you are only reading a single character (\n) in text mode. Therefore, the results you get will not be consistent.

On POSIX-based platforms (e.g. Linux), this is not an issue, because on those platforms, there is no difference between text mode and binary mode.

Andreas Wenzel
  • 1
    Another Windows problem: `long` is only 4 bytes on Windows, meaning `ftell()` will fail on Windows for files larger than 2 GB. – Andrew Henle Jan 06 '23 at 17:56
  • @AndrewHenle: Yes, that is an important point. Meanwhile, I have edited my answer. I believe that I have now addressed your point in my answer. – Andreas Wenzel Jan 07 '23 at 01:41
0

This comes from a C++/MFC project and uses the Win32 API to read the size from the file's attributes. I'm not sure whether it performs better than seeking, but since the size is taken from file metadata, I think it is faster because it doesn't need to read the file.

ULONGLONG GetFileSizeAtt(const wchar_t *wFile)
{
    WIN32_FILE_ATTRIBUTE_DATA fileInfo;
    ULONGLONG FileSize = 0ULL;
    //https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/nf-fileapi-getfileattributesexa?redirectedfrom=MSDN
    //https://learn.microsoft.com/nl-nl/windows/win32/api/fileapi/ns-fileapi-win32_file_attribute_data?redirectedfrom=MSDN
    if (GetFileAttributesEx(wFile, GetFileExInfoStandard, &fileInfo))
    {
        ULARGE_INTEGER ul;
        ul.HighPart = fileInfo.nFileSizeHigh;
        ul.LowPart = fileInfo.nFileSizeLow;
        FileSize = ul.QuadPart;
    }
    return FileSize;
}
BigChief
-1

Try this --

fseek(fp, 0, SEEK_END);
unsigned long int file_size = ftell(fp);
rewind(fp);

What this does is first, seek to the end of the file; then, report where the file pointer is. Lastly (this is optional) it rewinds back to the beginning of the file. Note that fp should be a binary stream.

file_size contains the number of bytes the file contains. Note that on platforms where (according to limits.h) the unsigned long type is limited to 4294967295 (4 gigabytes), such as Windows and most 32-bit systems, ftell itself returns a signed long and tops out at 2 GB, so you'll need a different function and variable type if you're likely to deal with files larger than that.

adrian
  • 3
    How's this different from [Derek's answer](http://stackoverflow.com/a/8247/1275169) from 8 years ago? – P.P Dec 29 '16 at 21:51
  • That's undefined behavior for a binary stream, and for a text stream `ftell` does not return a value representative of the number of bytes that can be read from the file. – Andrew Henle Dec 30 '16 at 01:58
  • *'unsigned long type is limited to 4294967295 bytes (4 gigabytes)'* But when I try unsigned long is 18446744073709551615 bytes (18 exabytes) – user16217248 Oct 16 '22 at 03:09
-2

Here's a simple and clean function that returns the file size.

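long get_file_size(char *path)
{
    FILE *fp;
    long size = -1;
    /* Open file for reading; return -1 if it cannot be opened */
    fp = fopen(path, "r");
    if (fp == NULL)
        return size;
    /* Seek to the end and read the position; size stays -1 if the seek fails */
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}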
Abdessamad Doughri
  • No, I dislike functions that expect a path. Instead, please make it expect a file pointer –  Oct 13 '19 at 01:05
  • `ftell` might not be a byte offset, for text files (you open the file in text mode) – M.M Feb 17 '21 at 03:56
  • 1
    And what happens if you're running on Windows and the file size is 14 GB? – Andrew Henle Mar 04 '21 at 15:31
  • 1
    @AndrewHenle: In that case you'd need to use `ftello` which returns an `off_t`, which can be a 64-bit type even when `long` isn't. I assume `ftello` still has the same problem of in theory being undefined behaviour seeking to the end of a binary stream as you described [in an answer](https://stackoverflow.com/questions/55826796/ftell-fseek-is-different-from-the-actual-readable-data-length-in-a-sys-class-fi/55828019#55828019), but ISO C doesn't provide anything better AFAIK, so for a lot of programs the least-bad thing is to rely on implementations to define this behaviour. – Peter Cordes Nov 25 '21 at 02:56
  • 3
    @PeterCordes [Windows uses `_ftelli64()`](https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/ftell-ftelli64?view=msvc-170) (What?!? Microsoft uses a non-portable function? In a way resulting in vendor lock-in?!!? Say it ain't so!) But if you're relying on implementation-defined behavior, you might as well use an implementation's method to get the file size. Both `fileno()` and `stat()` are supported on Windows, albeit in vendor-lock-in mode as `_fileno()` and `_fstat()`. `#ifdef _WIN32 #define fstat _fstat #define fileno _fileno #endif` is actually the most portable solution. – Andrew Henle Nov 25 '21 at 11:06
  • (cont) Of course it's not quite that easy to write portable code that works on Windows - see the 32/64-bit manuscript at https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fstat-fstat32-fstat64-fstati64-fstat32i64-fstat64i32?view=msvc-170 – Andrew Henle Nov 25 '21 at 11:10
-2

I have a function that works well with only stdio.h. I like it a lot and it works very well and is pretty concise:

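size_t fsize(FILE *File) {
    size_t FSZ;
    fseek(File, 0, SEEK_END); /* SEEK_END instead of the magic number 2 */
    FSZ = ftell(File);        /* note: ftell's -1L error value wraps around in size_t */
    rewind(File);
    return FSZ;
}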
-3

You can open the file, go to offset 0 relative to the end of the file with

fseek(handle, 0, SEEK_END);

and then call ftell(handle). fseek itself only returns 0 on success; the value returned from ftell is the size of the file.

I didn't code in C for a long time, but I think it should work.

PabloG