How do I get the actual file size on disk in Python? (the actual size it takes up on the hard drive).
-
You mean rounded up by cluster size? – ruslik Nov 25 '10 at 08:18
-
Take a look at this question: http://stackoverflow.com/questions/2493172/determine-cluster-size-of-file-system-in-python – Ruel Nov 25 '10 at 08:24
-
@ruslik: It's not that simple. Consider e.g. sparse or compressed files, which can take less space than their size indicates. – Philipp Nov 25 '10 at 09:13
7 Answers
UNIX only:
    import os
    from collections import namedtuple

    _ntuple_diskusage = namedtuple('usage', 'total used free')

    def disk_usage(path):
        """Return disk usage statistics about the given path.

        Returned value is a named tuple with attributes 'total', 'used' and
        'free', which are the amount of total, used and free space, in bytes.
        """
        st = os.statvfs(path)
        free = st.f_bavail * st.f_frsize
        total = st.f_blocks * st.f_frsize
        used = (st.f_blocks - st.f_bfree) * st.f_frsize
        return _ntuple_diskusage(total, used, free)
Usage:
>>> disk_usage('/')
usage(total=21378641920, used=7650934784, free=12641718272)
>>>
Edit 1 - also for Windows: https://code.activestate.com/recipes/577972-disk-usage/?in=user-4178764
Edit 2 - this is also available in Python 3.3+: https://docs.python.org/3/library/shutil.html#shutil.disk_usage
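For Python 3.3+, the stdlib version linked above returns the same named-tuple shape; a minimal sketch:

```python
import shutil

# shutil.disk_usage returns a named tuple (total, used, free), in bytes,
# for the filesystem containing the given path (Python 3.3+).
usage = shutil.disk_usage('/')
print(usage.total, usage.used, usage.free)
```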

Here is the correct way to get a file's size on disk, on platforms where st_blocks
is set:
    import os

    def size_on_disk(path):
        st = os.stat(path)
        return st.st_blocks * 512
Other answers that indicate to multiply by os.stat(path).st_blksize
or os.statvfs(path).f_bsize
are simply incorrect.
The Python documentation for os.stat_result.st_blocks
very clearly states:
st_blocks
    Number of 512-byte blocks allocated for file. This may be smaller than st_size/512 when the file has holes.
Furthermore, the stat(2)
man page says the same thing:
blkcnt_t st_blocks; /* Number of 512B blocks allocated */
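To see the difference this makes, you can create a sparse file and compare its logical size with the allocated size. A sketch, assuming a filesystem that supports holes (e.g. ext4 or tmpfs; on other filesystems the hole may be materialized):

```python
import os
import tempfile

def size_on_disk(path):
    # st_blocks is counted in 512-byte units per the Linux stat(2) man page
    return os.stat(path).st_blocks * 512

# Create a file consisting of a 1 MiB hole followed by a single byte.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(1024 * 1024)  # seeking past the end leaves a hole
    f.write(b'x')
    name = f.name

logical = os.path.getsize(name)   # 1048577 bytes
allocated = size_on_disk(name)    # typically far less, e.g. one 4096-byte block
print(logical, allocated)
os.remove(name)
```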

Update 2021-03-26: Previously, my answer rounded the logical size of the file up to an integer multiple of the block size. This approach only works if the file is stored in a contiguous sequence of blocks on disk (or if all the blocks are full except for one). Since this is a special case (though common for small files), I have updated my answer to make it more generally correct. However, note that unfortunately the statvfs
method and the st_blocks
value may not be available on some systems (e.g., Windows 10).
Call os.stat(filename).st_blocks to get the number of blocks in the file.
Call os.statvfs(filename).f_bsize to get the filesystem block size.
Then compute the correct size on disk, as follows:
    num_blocks = os.stat(filename).st_blocks
    block_size = os.statvfs(filename).f_bsize
    size_on_disk = num_blocks * block_size

-
`((lSize-1)/bSize+1)*bSize` might be slightly more accurate. Thanks for correcting my ancient and wrong answer. – ephemient Jan 20 '15 at 16:19
-
`Deprecated since version 2.6: The statvfs module has been removed in Python 3.` :-( https://docs.python.org/2/library/statvfs.html – danodonovan Aug 12 '15 at 12:36
-
@danodonovan It looks like the `statvfs` module has been removed in Python 3, but the answer uses the `os` module. As you can see, the [documentation for Python 3](https://docs.python.org/3/library/os.html#os.statvfs) reveals that `os.statvfs` is still around and has even been updated to include new functionality as recently as Python 3.6. – bytesized Jan 10 '17 at 22:11
-
I am having a situation with larger files where both of your formulae are giving me a value that is 1 block (4,096 bytes) smaller than what du gives me. For example, if you create a file using the command `dd if=/dev/zero of=testsize bs=1 count=419472426`. Said another way, the difference between du's results using the --apparent-size option is off by 7,126 instead of 4,096. Note: the value from du's --apparent-size option does match the value obtained using `os.stat(filename).st_size`. – user1748155 Jul 12 '18 at 05:05
-
According to POSIX – https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_stat.h.html – "There is no correlation between values of the st_blocks and st_blksize, and the f_bsize (from `<sys/statvfs.h>`) structure members". So, unless Python is making some stronger guarantee than POSIX does here, the assumption that f_bsize returned by statvfs is the correct units for st_blocks may not always be accurate. – Simon Kissane Jan 28 '23 at 03:30
    st = os.stat(…)
    du = st.st_blocks * st.st_blksize

-
+1, didn't realise this was in `os.stat`! I was about to refer the questioner to [`win32file.DeviceIoControl`](http://docs.activestate.com/activepython/2.5/pywin32/win32file__DeviceIoControl_meth.html). Don't know why I assumed the OP was on Windows :P – fmark Nov 25 '10 at 08:30
-
"On some Unix systems (such as Linux), the following attributes may also be available: st_blocks (number of blocks allocated for file), st_blksize (filesystem blocksize)..." – i.e. that's not portable, and you should at least catch the exception that is raised when these members aren't available. – Philipp Nov 25 '10 at 09:15
-
12Careful, this is wrong! On Linux, `st.st_blocks` is *always* in units of 512 bytes, while `st.st_blksize` is a filesystem blocksize (typically 4096 bytes). The real usage is `st.st_blocks * 512`. See http://linux.die.net/man/2/stat for details. – Jim Paris Aug 05 '13 at 16:24
-
No, you're both wrong: st.st_blocks is NOT ALWAYS in units of 512 bytes. On my machine it is in units of 1024 (which is strange indeed). Additionally, the answer is wrong because st_blksize does not return 1024, it returns the FILE I/O block size, e.g., st_blksize returns 65536 on my machine. For example, on my dell laptop running python 2.7.8 on cygwin on Windows 7, I created a 3000-byte file ("dd if=/dev/zero bs=3000 count=1 of=./testfile.txt") and: os.stat("testfile.txt").st_blocks=4; os.stat("./testfile.txt").st_blksize=65536; the logical size is 3000, on disk is 4096. I will answer below – hft Jan 17 '15 at 23:06
-
Can you please update your answer to refer to @hft's answer below? – Miserable Variable Apr 25 '18 at 18:37
Practically 12 years and no answer on how to do this in Windows...
Here's how to find the 'Size on disk' in Windows via ctypes:
    import ctypes
    import ctypes.wintypes

    def GetSizeOnDisk(path):
        '''https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getcompressedfilesizew'''
        kernel32 = ctypes.windll.kernel32
        kernel32.GetCompressedFileSizeW.restype = ctypes.wintypes.DWORD
        high = ctypes.wintypes.DWORD(0)  # receives the high-order DWORD for files > 4 GB
        low = kernel32.GetCompressedFileSizeW(ctypes.c_wchar_p(path), ctypes.byref(high))
        return (high.value << 32) + low
'''
>>> os.stat(somecompressedorofflinefile).st_size
943141
>>> GetSizeOnDisk(somecompressedorofflinefile)
671744
>>>
'''

-
Thankyou! I was looking all over for this. Curiously, when OneDrive shows the status of a file as "Available when online" your function almost always returns a size of zero, which is what I want. But for some strange reason it sometimes shows the full size, even when the file is available only when online. No idea why. – Michael Aug 10 '22 at 03:43
I'm not certain if this is size on disk, or the logical size:
import os
filename = "/home/tzhx/stuff.wev"
size = os.path.getsize(filename)
If it's not the droid you're looking for, you can round it up by dividing by the cluster size (as a float), then using ceil, then multiplying.
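That rounding step can be sketched as follows. The 4096-byte default cluster size is an assumption; the real value varies by filesystem (on Unix you could query it via os.statvfs):

```python
import math
import os

def rounded_size(path, cluster_size=4096):
    # Round the logical size up to the next multiple of the cluster size.
    # cluster_size=4096 is an assumed default, not a queried value.
    logical = os.path.getsize(path)
    return math.ceil(logical / cluster_size) * cluster_size
```

Note this is only an approximation: as the comments above point out, sparse or compressed files can occupy less space than this estimate.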

-
When I used getsize() in Windows 7, Python 2.2, I did get the actual space the file occupies. In my case, I want just the "file size", not the "file space". I wonder how you can get just the file size – Allan Ruin Aug 02 '12 at 17:43
To get the disk usage for a given file/folder, you can do the following:
    import os

    def disk_usage(path):
        """Return cumulative number of bytes for a given path."""
        # get total usage of current path
        total = os.path.getsize(path)
        # if path is dir, collect children
        if os.path.isdir(path):
            for file_name in os.listdir(path):
                child = os.path.join(path, file_name)
                # recursively get byte use for children
                total += disk_usage(child)
        return total
The function recursively collects byte usage for files nested within a given path, and returns the cumulative use for the entire path.
You could also add a print("{}: {}".format(path, total))
in there if you want the information for each file to print.
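As a sanity check, the same traversal can be written iteratively with os.walk. Like the recursive version, it sums logical sizes via getsize, so this is apparent size rather than size on disk, and it will raise on broken symlinks or unreadable entries:

```python
import os

def disk_usage_walk(path):
    """Sum logical sizes of path and everything beneath it, iteratively."""
    total = os.path.getsize(path)
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            total += os.path.getsize(os.path.join(root, name))
    return total
```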

-
After running multiple tests, on Windows 7 this returns the real size, not the size on disk. – Steve Byrne Nov 13 '17 at 17:50