104

When I use ls or du, I get the amount of disk space each file is occupying.

I need the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes. Bonus points if I can get this without opening each file and counting.

Jonas
Arthur Ulfeldt
  • `ls` actually shows the number of bytes in each file, not the amount of disk space. Is this sufficient for your needs? – Greg Hewgill Aug 06 '09 at 22:21
  • Note that `du` can't be used to answer this question. It shows the amount of disk space the directory occupies on the disk (the files' data plus the size of auxiliary file system meta-information). The `du` output can even be smaller than the total size of all files. This may happen if the file system can store data compressed on the disk or if hard links are used. Correct answers are based on `ls` and `find`. See the answers by **Nelson** and by **bytepan** here, or this answer: https://unix.stackexchange.com/a/471061/152606 – anton_rh Sep 24 '18 at 13:35

12 Answers

108

If you want the 'apparent size' (that is, the number of bytes in each file), not the size taken up by files on the disk, use the -b or --bytes option (if you have a Linux system with GNU coreutils):

% du -sbh <directory>
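A quick way to see the difference between disk usage and apparent size is with a sparse file (a minimal sketch; the temporary directory is illustrative, and GNU coreutils is assumed):

```shell
# Create a 1 MiB sparse file: large apparent size, almost no disk blocks
dir=$(mktemp -d)
truncate -s 1M "$dir/sparse"

du -sh "$dir"    # disk usage: typically only a few KiB
du -sbh "$dir"   # apparent size: about 1.0M (file bytes plus directory entry)

rm -r "$dir"
```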
Arkady
  • works on my newer red hat boxes, unfortunately not on my embedded Dev box. – Arthur Ulfeldt Aug 06 '09 at 22:34
  • Is there an easy way to show the “apparent size” in human-readable format? When using `du -shb` (as suggested by this answer), the `-b` setting seems to override the `-h` setting. – Mathias Bynens Aug 01 '12 at 10:59
  • @MathiasBynens Reverse the order of the flags (i.e. `du -sbh`). Works for me. – Luis E. May 30 '13 at 07:52
  • @MathiasBynens `du -sh --apparent-size /dir/` – Jongosi Mar 15 '15 at 14:41
  • @Arkady I have tried your solution on CentOS and Ubuntu, and there is a small error. You want "du -sbh". The "-h" flag must come last. – john_science Oct 16 '15 at 22:49
  • Thanks, @theJollySin! Fixed it now. – Arkady Oct 18 '15 at 01:10
  • This does not work on my Arch Linux, at least. It still prints the space taken up by files on the disk, not the total size of the content. Create a directory, copy a big enough file into it, and then create a hard link to the file. `du` will output a value close to the size of a single file, not to the size of both files in sum. That's because both files use the same data on the disk. – anton_rh Sep 24 '18 at 11:51
46

Use du -sb:

du -sb DIR

Optionally, add the h option for more user-friendly output:

du -sbh DIR
Peter Mortensen
rob
  • -b seems to be an illegal option for MacOS' du – lynxoid Jan 23 '14 at 18:33
  • @lynxoid: You can install the GNU version with brew: `brew install coreutils`. It will be available as the command `gdu`. – neu242 Apr 15 '15 at 12:01
  • Does not work. `ls` -> `file.gz hardlink-to-file.gz`. `stat -c %s file.gz` -> `9657212`. `stat -c %s hardlink-to-file.gz` -> `9657212`. `du -sb` -> `9661308`. It's definitely not the total size of the content but the size the directory takes up on the disk. – anton_rh Sep 24 '18 at 11:59
25

cd to the directory, then:

du -sh

ftw!

Originally wrote about it here: https://ataiva.com/get-the-total-size-of-all-the-files-in-a-directory/

AO_
18

Just an alternative:

ls -lAR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'

grep -v '^d' will exclude the directories.

anton_rh
Barun
  • Perfect, also add the -a param to get "hidden files" (anything starting with a period) – Nicholi Apr 20 '11 at 20:02
  • Isolated to a specific file type (in this case, PNG) and expressed in MB for more readability: `ls -lR | grep '.png$' | awk '{total += $5} END {print "Total:", total/1024/1024, "MB"}'` – MusikPolice Sep 09 '14 at 15:11
  • It is a correct answer. Unlike `du`, this solution really counts the total size of all the data in files as if they were opened one by one and their bytes were counted. But yes, adding the `-A` parameter is required to count hidden files as well. – anton_rh Sep 24 '18 at 13:08
13

stat's "%s" format gives you the actual number of bytes in a file.

 find . -type f |
 xargs stat --format=%s |
 awk '{s+=$1} END {print s}'

Feel free to substitute your favourite method for summing numbers.
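One such alternative, assuming GNU find, lets find print the sizes itself, so neither xargs nor stat is needed:

```shell
# GNU find prints each regular file's apparent size; awk sums them
find . -type f -printf '%s\n' | awk '{s += $1} END {print s}'
```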

Nelson
  • Preferably use "find . -type f -print0 | xargs -0 ..." to avoid problems with certain file names (containing spaces etc.). – hlovdal Aug 06 '09 at 22:23
  • yeah, good point. if it wasn't in bsd 4.2 I don't remember to use it :-( – Nelson Aug 06 '09 at 22:24
  • `find -print0` and `xargs -0` are needed for filenames with spaces. OS X wants `stat -f %z`. – Kornel Dec 13 '11 at 01:19
  • (Note that stat works with sparse files, reporting the large nominal size of the file and not the smaller blocks used on disk like `du` reports.) – Nelson Jun 13 '16 at 15:37
  • I like this one... with slight modifications it allows me to get the sizes of, say, all the .o files in megabytes. – mheyman Oct 04 '17 at 16:17
  • Unlike many other answers here which erroneously use the `du` utility, this answer is correct. It is very similar to this answer: https://unix.stackexchange.com/a/471061/152606. But I would use `! -type d` instead of `-type f` to count symlinks as well (the size of the symlink itself, usually a few bytes, not the size of the file it points to). – anton_rh Sep 24 '18 at 13:17
3

If you use BusyBox's `du` on an embedded system, you cannot get exact byte counts with `du`, only kilobytes.

BusyBox v1.4.1 (2007-11-30 20:37:49 EST) multi-call binary

Usage: du [-aHLdclsxhmk] [FILE]...

Summarize disk space used for each FILE and/or directory.
Disk space is printed in units of 1024 bytes.

Options:
        -a      Show sizes of files in addition to directories
        -H      Follow symbolic links that are FILE command line args
        -L      Follow all symbolic links encountered
        -d N    Limit output to directories (and files with -a) of depth < N
        -c      Output a grand total
        -l      Count sizes many times if hard linked
        -s      Display only a total for each argument
        -x      Skip directories on different filesystems
        -h      Print sizes in human readable format (e.g., 1K 243M 2G )
        -m      Print sizes in megabytes
        -k      Print sizes in kilobytes(default)
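If your BusyBox build lacks a byte-precise du, one workaround (a sketch, assuming the find, cat, and wc applets are enabled) is to give up the question's "bonus points" and actually read every file, counting the bytes:

```shell
# Read all regular files and count their bytes exactly (slow on large trees,
# but needs no GNU du)
find . -type f -exec cat {} + | wc -c
```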
Sam Liao
3

In the Windows command prompt, you can:

c:> dir /s c:\directory\you\want

and the penultimate line will tell you how many bytes the files take up.

I know this reads all files and directories, but it works faster in some situations.

Sun
3

When a folder is created, many Linux filesystems allocate 4096 bytes to store some metadata about the directory itself. This space is increased by a multiple of 4096 bytes as the directory grows.

The `du` command (with or without the `-b` option) takes this space into account, as you can see by typing:

mkdir test && du -b test

You will get a result of 4096 bytes for an empty directory. So, if you put 2 files of 10000 bytes each inside the directory, the total amount given by du -sb would be 24096 bytes.
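This can be reproduced directly (a sketch; the 4096-byte directory overhead assumes a filesystem such as ext4, so the exact total may differ elsewhere):

```shell
dir=$(mktemp -d)                    # on ext4, "du -b" reports ~4096 while empty
head -c 10000 /dev/zero > "$dir/a"  # two files of 10000 bytes each
head -c 10000 /dev/zero > "$dir/b"
du -sb "$dir"                       # e.g. 24096 on ext4: 2 x 10000 + 4096
rm -r "$dir"
```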

If you read the question carefully, this is not what was asked. The questioner asked:

the sum total of all the data in files and subdirectories I would get if I opened each file and counted the bytes

which in the example above should be 20000 bytes, not 24096.

So, the correct answer IMHO is a blend of Nelson's answer and hlovdal's suggestion to handle filenames containing spaces:

find . -type f -print0 | xargs -0 stat --format=%s | awk '{s+=$1} END {print s}'
bytepan
2

There are at least three ways to get the "sum total of all the data in files and subdirectories" in bytes that work in both Linux/Unix and Git Bash for Windows, listed below in order from fastest to slowest on average. For your reference, they were executed at the root of a fairly deep file system (docroot in a Magento 2 Enterprise installation comprising 71,158 files in 30,027 directories).

1.

$ time find -type f -printf '%s\n' | awk '{ total += $1 }; END { print total" bytes" }'
748660546 bytes

real    0m0.221s
user    0m0.068s
sys     0m0.160s

2.

$ time echo `find -type f -print0 | xargs -0 stat --format=%s | awk '{total+=$1} END {print total}'` bytes
748660546 bytes

real    0m0.256s
user    0m0.164s
sys     0m0.196s

3.

$ time echo `find -type f -exec du -bc {} + | grep -P "\ttotal$" | cut -f1 | awk '{ total += $1 }; END { print total }'` bytes
748660546 bytes

real    0m0.553s
user    0m0.308s
sys     0m0.416s


These two also work, but they rely on commands that don't exist on Git Bash for Windows:

1.

$ time echo `find -type f -printf "%s + " | dc -e0 -f- -ep` bytes
748660546 bytes

real    0m0.233s
user    0m0.116s
sys     0m0.176s

2.

$ time echo `find -type f -printf '%s\n' | paste -sd+ | bc` bytes
748660546 bytes

real    0m0.242s
user    0m0.104s
sys     0m0.152s


If you only want the total for the current directory, then add -maxdepth 1 to find.


Note that some of the suggested solutions don't return accurate results, so I would stick with the solutions above instead.

$ du -sbh
832M    .

$ ls -lR | grep -v '^d' | awk '{total += $5} END {print "Total:", total}'
Total: 583772525

$ find . -type f | xargs stat --format=%s | awk '{s+=$1} END {print s}'
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
4390471

$ ls -l| grep -v '^d'| awk '{total = total + $5} END {print "Total" , total}'
Total 968133
thdoan
  • Regarding Git Bash for Windows: in the case of Cygwin, `dc` is part of the `bc` package, so [to get `dc`](/questions/50051184/install-dc-in-cygwin) you need to install `bc`. – ruvim Apr 26 '18 at 20:28
1

du is handy, but find is useful if you want to calculate the size of only some files (for example, using a filter by extension). Also note that find itself can print the size of each file in bytes. To calculate a total size, we can connect the dc command in the following manner:

find . -type f -printf "%s + " | dc -e0 -f- -ep

Here find generates a sequence of commands for dc like 123 + 456 + 11 +. However, the complete program should look like 0 123 + 456 + 11 + p (remember postfix notation).

So, to get the complete program, we need to put 0 on the stack before executing the sequence from stdin, and print the top number after executing it (the p command at the end). We achieve this via dc options:

  1. -e0 is just a shortcut for -e '0', which puts 0 on the stack,
  2. -f- reads and executes commands from stdin (generated by find here),
  3. -ep prints the result (-e 'p').

To print the size in MiB, like 284.06 MiB, we can use -e '2 k 1024 / 1024 / n [ MiB] p' in point 3 instead (most spaces are optional).
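The mechanics are easy to check on a fixed input that simulates find's output (assuming GNU dc is installed):

```shell
# 0 is pushed by -e0, then "123 + 456 + 11 +" runs from stdin, then p prints
printf '123 + 456 + 11 + ' | dc -e0 -f- -ep   # prints 590
```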

ruvim
0

Use:

$ du -ckx <DIR> | grep total | awk '{print $1}'

Where <DIR> is the directory you want to inspect.

The '-c' option gives you a grand-total line, which is extracted using the 'grep total' portion of the command, and the count in kilobytes is extracted with the awk command.

The only caveat here is that if you have a subdirectory whose name contains the text "total", it will get spit out as well.
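A possible way around that caveat (a sketch): the grand total produced by -c is always the last line of du's output, so selecting the last line avoids matching subdirectory names:

```shell
# Take the final line (the grand total) instead of grepping for "total";
# DIR is a placeholder for the directory you want to inspect
DIR=.
du -ckx "$DIR" | tail -n 1 | awk '{print $1}'
```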

Peter Mortensen
Rob Jones
0

This may help:

ls -l| grep -v '^d'| awk '{total = total + $5} END {print "Total" , total}'

The above command will sum the sizes of all files, leaving out the directories' sizes.

Peter Mortensen
Ataul Haque
  • Note that this solution is very similar to the [answer](/a/1267205/1300170) by Barun. But this solution doesn't sum files in sub-directories. – ruvim Oct 02 '15 at 12:09
  • @ruvim, it doesn't sum hidden files either. To sum hidden files, the `-A` option must be added to `ls`. – anton_rh Sep 24 '18 at 13:44