Before I re-invent this particular wheel, has anybody got a nice routine for calculating the size of a directory using Python? It would be very nice if the routine would format the size nicely in Mb/Gb etc.
-
16It would NOT be very nice. You should have one function to calculate the size and a quite independent function (that could be used also with memory sizes, for example) to "format the size nicely in Mb/Gb etc". – John Machin Feb 15 '10 at 02:37
-
20Yes i know but this saves asking two question. – Gary Willoughby Feb 15 '10 at 20:06
-
3The `tree` command on *nix systems does all of this for free. `tree -h -d --du /path/to/dir`. – meh Jul 25 '17 at 18:07
-
2@meh `du -sh /path/to/dir/*` – mrgloom Jun 21 '19 at 11:24
33 Answers
This walks all sub-directories; summing file sizes:
import os
def get_size(start_path = '.'):
total_size = 0
for dirpath, dirnames, filenames in os.walk(start_path):
for f in filenames:
fp = os.path.join(dirpath, f)
# skip if it is symbolic link
if not os.path.islink(fp):
total_size += os.path.getsize(fp)
return total_size
print(get_size(), 'bytes')
And a oneliner for fun using os.listdir (Does not include sub-directories):
import os
sum(os.path.getsize(f) for f in os.listdir('.') if os.path.isfile(f))
Reference:
- os.path.getsize - Gives the size in bytes
- os.walk
- os.path.islink
Updated To use os.path.getsize, this is clearer than using the os.stat().st_size method.
Thanks to ghostdog74 for pointing this out!
os.stat - st_size Gives the size in bytes. Can also be used to get file size and other file related information.
import os
nbytes = sum(d.stat().st_size for d in os.scandir('.') if d.is_file())
Update 2018
If you use Python 3.4 or previous then you may consider using the more efficient walk
method provided by the third-party scandir
package. In Python 3.5 and later, this package has been incorporated into the standard library and os.walk
has received the corresponding increase in performance.
Update 2019
Recently I've been using pathlib
more and more, here's a pathlib
solution:
from pathlib import Path
root_directory = Path('.')
sum(f.stat().st_size for f in root_directory.glob('**/*') if f.is_file())

- 42,176
- 24
- 124
- 155
-
19+1 but the oneliner doesn't return a valid result because it is not recursive – luc Sep 08 '09 at 10:19
-
2
-
43For real fun you can do a recursive size in one line: sum( os.path.getsize(os.path.join(dirpath,filename)) for dirpath, dirnames, filenames in os.walk( PATH ) for filename in filenames ) – driax Aug 29 '10 at 20:02
-
2But you have to use `st_size` if you want to not follow symlinks, as you should then use `lstat`. – asmeurer Mar 18 '14 at 20:46
-
4Warning! this is not the same as 'du -sb'. See the answer by Samuel Lampa! Your code ignores the size of the folder used to store FAT. – Yauhen Yakimovich Jan 23 '15 at 13:09
-
2The code will give incorrect or at least unexpected results in many situations. E.g `os.path.getsize('/proc/kcore')`. It returns 128Tb on my system. So is the size of my `/proc` directory >128Tb? It depends. :) – Björn Lindqvist Oct 12 '15 at 23:43
-
The values calculated are slightly different to the values calculated by `du`. As it seems the size of directories is not included. But just adding `total_size += os.path.getsize(dirpath)` doesn't bring the same result as `du` … For the same directory `du -s .` gives `143428`, while `get_size` returns `146849731` (`146849731/1024 = 143407.94`). – white_gecko Jul 26 '17 at 11:06
-
This will not give same number as du does. And it is too slow, when there are many files. – Wang Aug 09 '17 at 09:14
-
This method will include files that are symbolically linked (`ln -s`). If you want to exclude them, as they don't really exist in the directory, then you can add an `if not os.path.islink(fp)` check before adding to `total_size`. – freethebees Feb 27 '18 at 09:41
-
1One liner than excludes symlinks `sum([os.path.getsize(fp) for fp in (os.path.join(dirpath, f) for dirpath, dirnames, filenames in os.walk(START_PATH) for f in filenames) if not os.path.islink(fp)])` ... pretty hard to read! – freethebees Feb 27 '18 at 09:52
-
-
You can include it if you want, but I would say that a _link_ to a file is not necessarily contained in the target directory. – monkut Sep 26 '19 at 05:20
-
@LMB No. That gives you the stats for the whole disk corresponding to that directory, not of that directory alone. – Asclepius May 23 '20 at 16:08
-
This solution is incomplete and will not work for hard links! For example if two files file1 and file2 are hard linked to same inode, this code will double count. To avoid double counting hard link, you'll need to keep track of file inodes and skip files with inodes already passed. – Splaty Jun 15 '20 at 13:56
Some of the approaches suggested so far implement a recursion, others employ a shell or will not produce neatly formatted results. When your code is one-off for Linux platforms, you can get formatting as usual, recursion included, as a one-liner. Except for the print
in the last line, it will work for current versions of python2
and python3
:
du.py
-----
#!/usr/bin/python3
import subprocess
def du(path):
"""disk usage in human readable format (e.g. '2,1GB')"""
return subprocess.check_output(['du','-sh', path]).split()[0].decode('utf-8')
if __name__ == "__main__":
print(du('.'))
is simple, efficient and will work for files and multilevel directories:
$ chmod 750 du.py
$ ./du.py
2,9M

- 57,944
- 17
- 167
- 143

- 3,910
- 2
- 24
- 36
-
24Python, being cross-platform in nature, should probably shy away from this – Jonathan Sep 11 '15 at 23:48
-
15Thanks for these remarks. I added some caveat regarding platform dependency to the answer. However, much of Python code if one-off scripting. Such code should not come with functional limitations, lengthy and error-prone passages, or uncommon results in edge cases, just for the sake of a portability _beyond any need_. It's, as always, a trade-off, and it's in the responsibility of the developer to choose wisely ;) – flaschbier Sep 18 '15 at 04:51
-
11
-
5It is probably wise to add the '-x' option to the du command in order to confine the search to the filesystem. In other words, use ['du', '-shx', path] instead. – Keith Hanlan Sep 27 '17 at 20:35
Using pathlib
I came up with this one-liner to get the size of a folder:
sum(file.stat().st_size for file in Path(folder).rglob('*'))
And this is what I came up with for a nicely formatted output:
from pathlib import Path
def get_folder_size(folder):
return ByteSize(sum(file.stat().st_size for file in Path(folder).rglob('*')))
class ByteSize(int):
_KB = 1024
_suffixes = 'B', 'KB', 'MB', 'GB', 'PB'
def __new__(cls, *args, **kwargs):
return super().__new__(cls, *args, **kwargs)
def __init__(self, *args, **kwargs):
self.bytes = self.B = int(self)
self.kilobytes = self.KB = self / self._KB**1
self.megabytes = self.MB = self / self._KB**2
self.gigabytes = self.GB = self / self._KB**3
self.petabytes = self.PB = self / self._KB**4
*suffixes, last = self._suffixes
suffix = next((
suffix
for suffix in suffixes
if 1 < getattr(self, suffix) < self._KB
), last)
self.readable = suffix, getattr(self, suffix)
super().__init__()
def __str__(self):
return self.__format__('.2f')
def __repr__(self):
return '{}({})'.format(self.__class__.__name__, super().__repr__())
def __format__(self, format_spec):
suffix, val = self.readable
return '{val:{fmt}} {suf}'.format(val=val, fmt=format_spec, suf=suffix)
def __sub__(self, other):
return self.__class__(super().__sub__(other))
def __add__(self, other):
return self.__class__(super().__add__(other))
def __mul__(self, other):
return self.__class__(super().__mul__(other))
def __rsub__(self, other):
return self.__class__(super().__sub__(other))
def __radd__(self, other):
return self.__class__(super().__add__(other))
def __rmul__(self, other):
return self.__class__(super().__rmul__(other))
Usage:
>>> size = get_folder_size("c:/users/tdavis/downloads")
>>> print(size)
5.81 GB
>>> size.GB
5.810891855508089
>>> size.gigabytes
5.810891855508089
>>> size.PB
0.005674699077644618
>>> size.MB
5950.353260040283
>>> size
ByteSize(6239397620)
I also came across this question, which has some more compact and probably more performant strategies for printing file sizes.

- 511
- 5
- 8
-
will fail if there are invalid symlinks: FileNotFoundError: [Errno 2] No such file or directory – Felipe Valdes Mar 07 '21 at 03:38
-
Would adding `if file.exists()` to the comprehension fix that? I guess `Path.lstat` would work too, but I think it would inflate the size of the result by double counting symlinks. – Terry Davis Mar 08 '21 at 21:26
-
I'd suggest using `KiB`, `MiB`, etc. to make clearer the fact that these are base-2 units and not base-10 (i.e. 1024bytes = 1KiB, 1000bytes = 1KB) – TheTechRobo the Nerd Aug 09 '23 at 02:35
Here is a recursive function (it recursively sums up the size of all subfolders and their respective files) which returns exactly the same bytes as when running "du -sb ." in linux (where the "." means "the current folder"):
import os
def getFolderSize(folder):
total_size = os.path.getsize(folder)
for item in os.listdir(folder):
itempath = os.path.join(folder, item)
if os.path.isfile(itempath):
total_size += os.path.getsize(itempath)
elif os.path.isdir(itempath):
total_size += getFolderSize(itempath)
return total_size
print "Size: " + str(getFolderSize("."))

- 4,336
- 5
- 42
- 63
-
3This function calculates the symlink's size too - if you want to skip the symlinks, you have to check that's not that: if os.path.isfile(itempath) and os.path.islink(itempath) and elif os.path.isdir(itempath) and os.path.islink(itempath). – airween Aug 23 '15 at 19:02
-
It doesn't take cluster size into account (I used it on the SD card formatted in FAT32 with cluster size 32K). – Alexandr Zarubkin Apr 05 '23 at 13:11
Python 3.5 recursive folder size using os.scandir
def folder_size(path='.'):
total = 0
for entry in os.scandir(path):
if entry.is_file():
total += entry.stat().st_size
elif entry.is_dir():
total += folder_size(entry.path)
return total

- 4,154
- 2
- 32
- 52
-
3Python 3 one-liner method if not worried about recursive-ness `sum([entry.stat().st_size for entry in os.scandir(file)])`. Note output is in bytes, /1024 to get KB and /(1024*1024) to get MB. – weiji14 Oct 31 '17 at 21:43
-
7@weiji14 Lose the brackets, i.e., `sum(entry.stat().st_size for entry in os.scandir(file))`. No reason to make a list, because `sum` takes iterators as well. – Vedran Šego Nov 17 '17 at 10:16
-
This answer is by far the most efficient on Windows, because `DirEntry.stat()` [does not require an additional system call per file](https://docs.python.org/3/library/os.html#os.DirEntry.stat). – Bart Robinson Nov 24 '20 at 00:21
for python3.5+
from pathlib import Path
def get_size(folder: str) -> int:
return sum(p.stat().st_size for p in Path(folder).rglob('*'))
Usage::
In [6]: get_size('/etc/not-exist-path')
Out[6]: 0
In [7]: get_size('.')
Out[7]: 12038689
In [8]: def filesize(size: int) -> str:
...: for unit in ("B", "K", "M", "G", "T"):
...: if size < 1024:
...: break
...: size /= 1024
...: return f"{size:.1f}{unit}"
...:
In [9]: filesize(get_size('.'))
Out[9]: '11.5M'

- 5,065
- 2
- 17
- 30
monknut answer is good but it fails on broken symlink, so you also have to check if this path really exists
if os.path.exists(fp):
total_size += os.stat(fp).st_size

- 1,090
- 1
- 12
- 21
The accepted answer doesn't take into account hard or soft links, and would count those files twice. You'd want to keep track of which inodes you've seen, and not add the size for those files.
import os
def get_size(start_path='.'):
total_size = 0
seen = {}
for dirpath, dirnames, filenames in os.walk(start_path):
for f in filenames:
fp = os.path.join(dirpath, f)
try:
stat = os.stat(fp)
except OSError:
continue
try:
seen[stat.st_ino]
except KeyError:
seen[stat.st_ino] = True
else:
continue
total_size += stat.st_size
return total_size
print get_size()

- 1,657
- 1
- 13
- 20
-
5Consider using `os.lstat` (rather than `os.stat`), which avoids following symbolic links: [docs.python.org/2/library/os.html#os.lstat](http://docs.python.org/2/library/os.html#os.lstat) – Peter Briggs Jan 30 '14 at 11:22
a recursive one-liner:
def getFolderSize(p):
from functools import partial
prepend = partial(os.path.join, p)
return sum([(os.path.getsize(f) if os.path.isfile(f) else getFolderSize(f)) for f in map(prepend, os.listdir(p))])
-
1It's not one liner though. However, it calculates recursively folder size(even if folder has multiple folders inside) in bytes and give correct value. – Venkatesh Dec 18 '14 at 19:57
-
-
Chris' answer is good but could be made more idiomatic by using a set to check for seen directories, which also avoids using an exception for control flow:
def directory_size(path):
total_size = 0
seen = set()
for dirpath, dirnames, filenames in os.walk(path):
for f in filenames:
fp = os.path.join(dirpath, f)
try:
stat = os.stat(fp)
except OSError:
continue
if stat.st_ino in seen:
continue
seen.add(stat.st_ino)
total_size += stat.st_size
return total_size # size in bytes

- 324
- 4
- 3
-
2Chris' answer also doesn't take into account symlinks nor the sizes of directories themselves. I've edited your answer accordingly, the output of the fixed function is now identical to `df -sb`. – Creshal Dec 11 '13 at 13:21
-
A little late to the party but in one line provided that you have glob2 and humanize installed. Note that in Python 3, the default iglob
has a recursive mode. How to modify the code for Python 3 is left as a trivial exercise for the reader.
>>> import os
>>> from humanize import naturalsize
>>> from glob2 import iglob
>>> naturalsize(sum(os.path.getsize(x) for x in iglob('/var/**'))))
'546.2 MB'

- 17,269
- 27
- 101
- 156
-
3Starting with Python 3.5, the built-in `glob` supports recursion. You can use: `glob.glob('/var/**', recursive=True)` – adzenith Nov 08 '18 at 14:52
For the second part of the question
def human(size):
B = "B"
KB = "KB"
MB = "MB"
GB = "GB"
TB = "TB"
UNITS = [B, KB, MB, GB, TB]
HUMANFMT = "%f %s"
HUMANRADIX = 1024.
for u in UNITS[:-1]:
if size < HUMANRADIX : return HUMANFMT % (size, u)
size /= HUMANRADIX
return HUMANFMT % (size, UNITS[-1])

- 5,334
- 3
- 21
- 29
-
Just use the package humanize. this code has been re-written far too many times. see other StackOverflow questions for all of the nuances around implementing this yourself. – ejkitchen Oct 29 '22 at 18:22
Get directory size
Properties of the solution:
- returns both: the apparent size (number of bytes in the file) and the actual disk space the files uses.
- counts hard linked files only once
- counts symlinks the same way
du
does - does not use recursion
- uses
st.st_blocks
for disk space used, thus works only on Unix-like systems
The code:
import os
def du(path):
if os.path.islink(path):
return (os.lstat(path).st_size, 0)
if os.path.isfile(path):
st = os.lstat(path)
return (st.st_size, st.st_blocks * 512)
apparent_total_bytes = 0
total_bytes = 0
have = []
for dirpath, dirnames, filenames in os.walk(path):
apparent_total_bytes += os.lstat(dirpath).st_size
total_bytes += os.lstat(dirpath).st_blocks * 512
for f in filenames:
fp = os.path.join(dirpath, f)
if os.path.islink(fp):
apparent_total_bytes += os.lstat(fp).st_size
continue
st = os.lstat(fp)
if st.st_ino in have:
continue # skip hardlinks which were already counted
have.append(st.st_ino)
apparent_total_bytes += st.st_size
total_bytes += st.st_blocks * 512
for d in dirnames:
dp = os.path.join(dirpath, d)
if os.path.islink(dp):
apparent_total_bytes += os.lstat(dp).st_size
return (apparent_total_bytes, total_bytes)
Example usage:
>>> du('/lib')
(236425839, 244363264)
$ du -sb /lib
236425839 /lib
$ du -sB1 /lib
244363264 /lib
Human readable file size
Properties of the solution:
- Supports up to Yottabytes
- Supports SI Units or IEC Units
- Support custom suffixes
The code:
def humanized_size(num, suffix='B', si=False):
if si:
units = ['','K','M','G','T','P','E','Z']
last_unit = 'Y'
div = 1000.0
else:
units = ['','Ki','Mi','Gi','Ti','Pi','Ei','Zi']
last_unit = 'Yi'
div = 1024.0
for unit in units:
if abs(num) < div:
return "%3.1f%s%s" % (num, unit, suffix)
num /= div
return "%.1f%s%s" % (num, last_unit, suffix)
Example usage:
>>> humanized_size(236425839)
'225.5MiB'
>>> humanized_size(236425839, si=True)
'236.4MB'
>>> humanized_size(236425839, si=True, suffix='')
'236.4M'

- 1,597
- 1
- 6
- 15
-
1The total_bytes calculation was what I was looking for! No other solution here calculates that. The du -sh
command infact gives me total bytes and not the apparent_total_bytes. Thanks! – Ankit Arora Mar 29 '22 at 11:01
You can do something like this :
import commands
size = commands.getoutput('du -sh /path/').split()[0]
in this case I have not tested the result before returning it, if you want you can check it with commands.getstatusoutput.

- 6,873
- 3
- 35
- 45

- 6,404
- 8
- 39
- 56
-
how's the performance compared to use `os.walk` to check sub folder size recursively? – TomSawyer Sep 06 '19 at 21:31
One-liner you say... Here is a one liner:
sum([sum(map(lambda fname: os.path.getsize(os.path.join(directory, fname)), files)) for directory, folders, files in os.walk(path)])
Although I would probably split it out and it performs no checks.
To convert to kb see Reusable library to get human readable version of file size? and work it in

- 1
- 1

- 106
- 6
for getting the size of one file, there is os.path.getsize()
>>> import os
>>> os.path.getsize("/path/file")
35L
its reported in bytes.

- 327,991
- 56
- 259
- 343
The following script prints directory size of all sub-directories for the specified directory. It also tries to benefit (if possible) from caching the calls of a recursive functions. If an argument is omitted, the script will work in the current directory. The output is sorted by the directory size from biggest to smallest ones. So you can adapt it for your needs.
PS i've used recipe 578019 for showing directory size in human-friendly format (http://code.activestate.com/recipes/578019/)
from __future__ import print_function
import os
import sys
import operator
def null_decorator(ob):
return ob
if sys.version_info >= (3,2,0):
import functools
my_cache_decorator = functools.lru_cache(maxsize=4096)
else:
my_cache_decorator = null_decorator
start_dir = os.path.normpath(os.path.abspath(sys.argv[1])) if len(sys.argv) > 1 else '.'
@my_cache_decorator
def get_dir_size(start_path = '.'):
total_size = 0
if 'scandir' in dir(os):
# using fast 'os.scandir' method (new in version 3.5)
for entry in os.scandir(start_path):
if entry.is_dir(follow_symlinks = False):
total_size += get_dir_size(entry.path)
elif entry.is_file(follow_symlinks = False):
total_size += entry.stat().st_size
else:
# using slow, but compatible 'os.listdir' method
for entry in os.listdir(start_path):
full_path = os.path.abspath(os.path.join(start_path, entry))
if os.path.isdir(full_path):
total_size += get_dir_size(full_path)
elif os.path.isfile(full_path):
total_size += os.path.getsize(full_path)
return total_size
def get_dir_size_walk(start_path = '.'):
total_size = 0
for dirpath, dirnames, filenames in os.walk(start_path):
for f in filenames:
fp = os.path.join(dirpath, f)
total_size += os.path.getsize(fp)
return total_size
def bytes2human(n, format='%(value).0f%(symbol)s', symbols='customary'):
"""
(c) http://code.activestate.com/recipes/578019/
Convert n bytes into a human readable string based on format.
symbols can be either "customary", "customary_ext", "iec" or "iec_ext",
see: http://goo.gl/kTQMs
>>> bytes2human(0)
'0.0 B'
>>> bytes2human(0.9)
'0.0 B'
>>> bytes2human(1)
'1.0 B'
>>> bytes2human(1.9)
'1.0 B'
>>> bytes2human(1024)
'1.0 K'
>>> bytes2human(1048576)
'1.0 M'
>>> bytes2human(1099511627776127398123789121)
'909.5 Y'
>>> bytes2human(9856, symbols="customary")
'9.6 K'
>>> bytes2human(9856, symbols="customary_ext")
'9.6 kilo'
>>> bytes2human(9856, symbols="iec")
'9.6 Ki'
>>> bytes2human(9856, symbols="iec_ext")
'9.6 kibi'
>>> bytes2human(10000, "%(value).1f %(symbol)s/sec")
'9.8 K/sec'
>>> # precision can be adjusted by playing with %f operator
>>> bytes2human(10000, format="%(value).5f %(symbol)s")
'9.76562 K'
"""
SYMBOLS = {
'customary' : ('B', 'K', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y'),
'customary_ext' : ('byte', 'kilo', 'mega', 'giga', 'tera', 'peta', 'exa',
'zetta', 'iotta'),
'iec' : ('Bi', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi', 'Yi'),
'iec_ext' : ('byte', 'kibi', 'mebi', 'gibi', 'tebi', 'pebi', 'exbi',
'zebi', 'yobi'),
}
n = int(n)
if n < 0:
raise ValueError("n < 0")
symbols = SYMBOLS[symbols]
prefix = {}
for i, s in enumerate(symbols[1:]):
prefix[s] = 1 << (i+1)*10
for symbol in reversed(symbols[1:]):
if n >= prefix[symbol]:
value = float(n) / prefix[symbol]
return format % locals()
return format % dict(symbol=symbols[0], value=n)
############################################################
###
### main ()
###
############################################################
if __name__ == '__main__':
dir_tree = {}
### version, that uses 'slow' [os.walk method]
#get_size = get_dir_size_walk
### this recursive version can benefit from caching the function calls (functools.lru_cache)
get_size = get_dir_size
for root, dirs, files in os.walk(start_dir):
for d in dirs:
dir_path = os.path.join(root, d)
if os.path.isdir(dir_path):
dir_tree[dir_path] = get_size(dir_path)
for d, size in sorted(dir_tree.items(), key=operator.itemgetter(1), reverse=True):
print('%s\t%s' %(bytes2human(size, format='%(value).2f%(symbol)s'), d))
print('-' * 80)
if sys.version_info >= (3,2,0):
print(get_dir_size.cache_info())
Sample output:
37.61M .\subdir_b
2.18M .\subdir_a
2.17M .\subdir_a\subdir_a_2
4.41K .\subdir_a\subdir_a_1
----------------------------------------------------------
CacheInfo(hits=2, misses=4, maxsize=4096, currsize=4)
EDIT: moved null_decorator above, as user2233949 recommended

- 205,989
- 36
- 386
- 419
-
Your script works well, but you need to move the null_decorator function above the 'if sys.version_info >= ...' line. Otherwise you'll get a 'null_decorator' is not defined exception. Works great after that though. – user2233949 Feb 02 '16 at 23:26
-
@user2233949, thank you! I'me modified the code correspondingly. – MaxU - stand with Ukraine Feb 02 '16 at 23:51
use library sh: the module du
does it:
pip install sh
import sh
print( sh.du("-s", ".") )
91154728 .
if you want to pass asterix, use glob
as described here.
to convert the values in human readables, use humanize:
pip install humanize
import humanize
print( humanize.naturalsize( 91157384 ) )
91.2 MB
For what it's worth... the tree command does all of this for free:
tree -h --du /path/to/dir # files and dirs
tree -h -d --du /path/to/dir # dirs only
I love Python, but by far the simplest solution to the problem requires no new code.

- 2,591
- 4
- 20
- 33
It is handy:
import os
import stat
size = 0
path_ = ""
def calculate(path=os.environ["SYSTEMROOT"]):
global size, path_
size = 0
path_ = path
for x, y, z in os.walk(path):
for i in z:
size += os.path.getsize(x + os.sep + i)
def cevir(x):
global path_
print(path_, x, "Byte")
print(path_, x/1024, "Kilobyte")
print(path_, x/1048576, "Megabyte")
print(path_, x/1073741824, "Gigabyte")
calculate("C:\Users\Jundullah\Desktop")
cevir(size)
Output:
C:\Users\Jundullah\Desktop 87874712211 Byte
C:\Users\Jundullah\Desktop 85815148.64355469 Kilobyte
C:\Users\Jundullah\Desktop 83803.85609722137 Megabyte
C:\Users\Jundullah\Desktop 81.83970321994275 Gigabyte
Here is a one liner that does it recursively (recursive option available as of Python 3.5):
import os
import glob
print(sum(os.path.getsize(f) for f in glob.glob('**', recursive=True) if os.path.isfile(f))/(1024*1024))

- 1,647
- 13
- 17
def recursive_dir_size(path):
size = 0
for x in os.listdir(path):
if not os.path.isdir(os.path.join(path,x)):
size += os.stat(os.path.join(path,x)).st_size
else:
size += recursive_dir_size(os.path.join(path,x))
return size
I wrote this function which gives me accurate overall size of a directory, i tried other for loop solutions with os.walk but i don't know why the end result was always less than the actual size (on ubuntu 18 env). I must have done something wrong but who cares wrote this one works perfectly fine.

- 53
- 13

- 21
- 3
This script tells you which file is the biggest in the CWD and also tells you in which folder the file is. This script works for me on win8 and python 3.3.3 shell
import os
folder = os.getcwd()
number = 0
string = ""
for root, dirs, files in os.walk(folder):
for file in files:
pathname = os.path.join(root,file)
## print (pathname)
## print (os.path.getsize(pathname)/1024/1024)
if number < os.path.getsize(pathname):
number = os.path.getsize(pathname)
string = pathname
print(string)
print()
print(number)
print("Number in bytes")

- 2,676
- 1
- 19
- 34

- 21
- 3
I'm using python 2.7.13 with scandir and here's my one-liner recursive function to get the total size of a folder:
from scandir import scandir
def getTotFldrSize(path):
return sum([s.stat(follow_symlinks=False).st_size for s in scandir(path) if s.is_file(follow_symlinks=False)]) + \
+ sum([getTotFldrSize(s.path) for s in scandir(path) if s.is_dir(follow_symlinks=False)])
>>> print getTotFldrSize('.')
1203245680

- 388
- 1
- 7
When size of the sub-directories is computed, it should update its parent's folder size and this will go on till it reaches the root parent.
The following function computes the size of the folder and all its sub-folders.
import os
def folder_size(path):
parent = {} # path to parent path mapper
folder_size = {} # storing the size of directories
folder = os.path.realpath(path)
for root, _, filenames in os.walk(folder):
if root == folder:
parent[root] = -1 # the root folder will not have any parent
folder_size[root] = 0.0 # intializing the size to 0
elif root not in parent:
immediate_parent_path = os.path.dirname(root) # extract the immediate parent of the subdirectory
parent[root] = immediate_parent_path # store the parent of the subdirectory
folder_size[root] = 0.0 # initialize the size to 0
total_size = 0
for filename in filenames:
filepath = os.path.join(root, filename)
total_size += os.stat(filepath).st_size # computing the size of the files under the directory
folder_size[root] = total_size # store the updated size
temp_path = root # for subdirectories, we need to update the size of the parent till the root parent
while parent[temp_path] != -1:
folder_size[parent[temp_path]] += total_size
temp_path = parent[temp_path]
return folder_size[folder]/1000000.0

- 41
- 4
A solution that works on Python 3.6 using pathlib.
from pathlib import Path
sum([f.stat().st_size for f in Path("path").glob("**/*")])

- 2,544
- 3
- 25
- 31

- 11
- 4
Python 3.6+ recursive folder/file size using os.scandir
. As powerful as in the answer by @blakev, but shorter and in EAFP python style.
import os
def size(path, *, follow_symlinks=False):
try:
with os.scandir(path) as it:
return sum(size(entry, follow_symlinks=follow_symlinks) for entry in it)
except NotADirectoryError:
return os.stat(path, follow_symlinks=follow_symlinks).st_size

- 1,280
- 13
- 13
du
does not follow symlinks by default. No answer here make use of follow_symlinks=False
.
Here is an implementation which follows default behavior of du:
def du(path) -> int:
total = 0
for entry in os.scandir(path):
if entry.is_file(follow_symlinks=False):
total += entry.stat().st_size
elif entry.is_dir(follow_symlinks=False):
total += du(entry.path)
return total
Test:
class Test(unittest.TestCase):
def test_du(self):
root = '/tmp/du_test'
subprocess.run(['rm', '-rf', root])
test_utils.mkdir(root)
test_utils.create_file(root, 'A', '1M')
test_utils.create_file(root, 'B', '1M')
sub = '/'.join([root, 'sub'])
test_utils.mkdir(sub)
test_utils.create_file(sub, 'C', '1M')
test_utils.create_file(sub, 'D', '1M')
subprocess.run(['ln', '-s', '/tmp', '/'.join([root, 'link']), ])
self.assertEqual(4 << 20, util.du(root))

- 3,607
- 1
- 31
- 36
import os
def get_size(path = os.getcwd()):
print("Calculating Size: ",path)
total_size = 0
#if path is directory--
if os.path.isdir(path):
print("Path type : Directory/Folder")
for dirpath, dirnames, filenames in os.walk(path):
for f in filenames:
fp = os.path.join(dirpath, f)
# skip if it is symbolic link
if not os.path.islink(fp):
total_size += os.path.getsize(fp)
#if path is a file---
elif os.path.isfile(path):
print("Path type : File")
total_size=os.path.getsize(path)
else:
print("Path Type : Special File (Socket, FIFO, Device File)" )
total_size=0
bytesize=total_size
print(bytesize, 'bytes')
print(bytesize/(1024), 'kilobytes')
print(bytesize/(1024*1024), 'megabytes')
print(bytesize/(1024*1024*1024), 'gegabytes')
return total_size
x=get_size("/content/examples")
I'm sure this helps! For folders and files as well!

- 583
- 7
- 10
Admittedly, this is kind of hackish and only works on Unix/Linux.
It matches du -sb .
because in effect this is a Python bash wrapper that runs the du -sb .
command.
import subprocess
def system_command(cmd):
""""Function executes cmd parameter as a bash command."""
p = subprocess.Popen(cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
shell=True)
stdout, stderr = p.communicate()
return stdout, stderr
size = int(system_command('du -sb . ')[0].split()[0])

- 884
- 17
- 31
I'm a little late (and new) here but I chose to use the subprocess module and the 'du' command line with Linux to retrieve an accurate value for folder size in MB. I had to use if and elif for root folder because otherwise subprocess raises error due to non-zero value returned.
import subprocess
import os
#
# get folder size
#
def get_size(self, path):
if os.path.exists(path) and path != '/':
cmd = str(subprocess.check_output(['sudo', 'du', '-s', path])).\
replace('b\'', '').replace('\'', '').split('\\t')[0]
return float(cmd) / 1000000
elif os.path.exists(path) and path == '/':
cmd = str(subprocess.getoutput(['sudo du -s /'])). \
replace('b\'', '').replace('\'', '').split('\n')
val = cmd[len(cmd) - 1].replace('/', '').replace(' ', '')
return float(val) / 1000000
else: raise ValueError
If you are in Windows OS you can do:
install the module pywin32 by launching:
pip install pywin32
and then coding the following:
import win32com.client as com
def get_folder_size(path):
try:
fso = com.Dispatch("Scripting.FileSystemObject")
folder = fso.GetFolder(path)
size = str(round(folder.Size / 1048576))
print("Size: " + size + " MB")
except Exception as e:
print("Error --> " + str(e))

- 321
- 2
- 5
import os
def get_size(path):
total_size = 0
for dirpath, dirnames, filenames in os.walk(path):
for f in filenames:
if os.path.exists(fp):
fp = os.path.join(dirpath, f)
total_size += os.path.getsize(fp)
return total_size # in megabytes
Thanks monkut & troex!
-
This code won't run (typos in the f/fp variables) and is not recursive. – oferlivny Sep 16 '15 at 15:06
-
1This won't run. You refer to `fp` before it is assigned. It also would return `total_size` in bytes, not megabytes. – freethebees Feb 27 '18 at 10:20