169

I have a script that checks whether a file is 0 bytes, but I think there must be an easier way to check the file size instead. For example, file.txt is normally 100 kB; how can I make a script check whether it is smaller than 90 kB (including 0) and, if so, fetch a new copy with wget because the file is corrupt?

What I'm currently using...

if [ -n file.txt ]
then
  echo "everything is good"
else
  mail -s "file.txt size is zero, please fix. " myemail@gmail.com < /dev/null
  # Grab wget as a fallback
  wget -c https://www.server.org/file.txt -P /root/tmp --output-document=/root/tmp/file.txt
  mv -f /root/tmp/file.txt /var/www/file.txt
fi
Peter Mortensen
user349418

12 Answers

302

[ -n file.txt ] doesn't check its size. It checks that the string file.txt is non-zero length, so it will always succeed.

If you want to say "size is non-zero", you need [ -s file.txt ].

To get a file's size, you can use wc -c to get the size (file length) in bytes:

file=file.txt
minimumsize=90000
actualsize=$(wc -c <"$file")
# $actualsize is left unquoted so that any padding printed by wc is dropped
if [ $actualsize -ge $minimumsize ]; then
    echo "size is at least $minimumsize bytes"
else
    echo "size is under $minimumsize bytes"
fi

In this case, it sounds like that's what you want.
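
Tying this back to the question, the size check can gate the re-download. The following is a minimal sketch rather than a tested drop-in: the function name needs_refresh is made up for illustration, and the URL and paths in the commented usage are simply the ones from the question.

```shell
#!/bin/sh
# needs_refresh FILE MIN_BYTES
# Hypothetical helper: succeeds (exit 0) when FILE is missing or shorter
# than MIN_BYTES, i.e. when a fresh copy should be fetched.
needs_refresh() {
    [ -f "$1" ] || return 0
    # Arithmetic expansion strips the padding some wc versions print.
    actualsize=$(( $(wc -c <"$1") ))
    [ "$actualsize" -lt "$2" ]
}

# Example use, with the URL and paths taken from the question:
# if needs_refresh /var/www/file.txt 90000; then
#     wget -O /root/tmp/file.txt https://www.server.org/file.txt &&
#         mv -f /root/tmp/file.txt /var/www/file.txt
# fi
```

Treating a missing file the same as a short one is a design choice; it means the first run also fetches the file.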

But FYI, if you want to know how much disk space the file is using, you could use du -k to get the size (disk space used) in kilobytes:

file=file.txt
minimumsize=90
actualsize=$(du -k "$file" | cut -f 1)
if [ "$actualsize" -ge "$minimumsize" ]; then
    echo "size is at least $minimumsize kilobytes"
else
    echo "size is under $minimumsize kilobytes"
fi
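
The two numbers can differ noticeably. As a rough illustration (a sketch, assuming a filesystem that supports sparse files; the temporary file exists only for the demonstration), a file that consists of one large hole reports its full length through wc -c, while du -k reports only the blocks actually allocated:

```shell
#!/bin/sh
# Create a 1 MiB file consisting entirely of a hole: dd copies nothing
# from /dev/null but still extends the output file to the seek offset.
sparse=$(mktemp)
dd if=/dev/null of="$sparse" bs=1 seek=1048576 2>/dev/null

length=$(( $(wc -c <"$sparse") ))      # file length in bytes
usage=$(du -k "$sparse" | cut -f 1)    # allocated disk space in KB

echo "length: $length bytes"
echo "disk usage: $usage KB"
rm -f "$sparse"
```

For a corruption check like the one in the question, the length is the relevant figure.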

If you need more control over the output format, you can also look at stat. On Linux, you'd start with something like stat -c '%s' file.txt, and on BSD and Mac OS X, something like stat -f '%z' file.txt.

Peter Mortensen
Mikel
  • 6
    Why `du -b "$file" | cut -f 1` instead of `stat -c '%s' "$file"`? Or `stat --printf="%s" "$file"`? – mivk Dec 14 '13 at 11:00
  • 2
    Only because it's more portable. BSD and Linux `stat` have different flags. – Mikel Dec 16 '13 at 16:40
  • One advantage to the 'du' solution is that it can (with the -s option) also be used on directories. – Trebor Rude Feb 19 '14 at 16:12
  • Try `wc -c`. Or `BLOCKSIZE=1 du`. Or if you don't mind using kilobytes, `du -k`. – Mikel Apr 13 '14 at 06:20
  • 1
    I had to modify it to `... | cut -d' ' -f1` to get it to work on Ubuntu. – Mikepote May 06 '14 at 13:04
  • @Mikepote Yes, now I'm using `wc`, the `-d ' '` is needed, thanks. – Mikel May 06 '14 at 14:31
  • 10
    Use `wc -c < "$file"` (note the `<`), in which case you don't need the `| cut ...` part (which, as posted, doesn't work on OSX). The minimum `BLOCKSIZE` value for `du` on OSX is `512`. – mklement0 May 14 '14 at 22:00
  • 1
    On Mac OS X (Mavericks), I found that cut handles each individual space as a field separator. I favoured awk instead: `... | awk '{print $1}'` – paddy May 27 '15 at 05:35
  • @paddy With `wc`, there should only be one space. With `du`, there should be a tab. But I'll use mklement0's suggestion for `wc` to remove the `cut` entirely. – Mikel May 27 '15 at 14:27
  • @Mikel The problem I had was that there are numerous leading spaces. It looks like the number is output with 8-character width, as in `printf( "%8d", size );`. Those leading spaces are all picked up as field separators by `cut`. Your updated solution using standard input without `cut` appears to be more portable. – paddy May 28 '15 at 02:29
  • @Mikel how to do this process each 20 seconds so i can check incrementation of file size so on? – A Sahra Mar 27 '17 at 11:16
  • 3
    Is it not inefficient to read the file to determine its size? I think stat will not read the file to see its size. – Petri Sirkkala Apr 25 '17 at 11:08
  • 2
    It also seems that `wc -c` uses an optimized approach, [fstat and seek](https://unix.stackexchange.com/questions/16640/how-can-i-get-the-size-of-a-file-in-a-bash-script#comment634607_16668), to determine the file size, but this obviously only works when given a file to work with, not a stream. So to avoid the cut operation on wc output the whole file is read through to know its size. – Petri Sirkkala Apr 25 '17 at 11:19
  • 3
    @PetriSirkkala On my Linux system, `wc -c – Mikel Apr 26 '17 at 00:05
  • While the implementation for `wc -c` indeed does use `seek`, it still isn't the purpose of this tool - as it reads the rest of the bytes on stream. The `stat` solution by Daniel C. Sobral is the correct way to do. `du` also has a `--bytes` option that is equally useful. – Yuval Jul 26 '23 at 07:56
  • As I said in my comment above from 2013, I suggested this approach because the way to run `stat` is different on different operating systems. – Mikel Jul 27 '23 at 10:15
45

stat can also check the file size. Some methods are definitely better: using -s to find out whether the file is empty or not is easier than anything else if that's all you want. And if you want to find files of a size, then find is certainly the way to go.

I also like du a lot for getting the file size in KB, but for bytes I'd use stat:

size=$(stat -f%z "$filename") # BSD stat

size=$(stat -c%s "$filename") # GNU stat
Peter Mortensen
Daniel C. Sobral
  • 2
    `stat` is a great idea, but on CentOS this is what worked for me: `size=$(stat -c%s $filename)` – Oz Solomon Jun 13 '14 at 21:44
  • 2
    The difference between GNU and BSD is what, unfortunately, makes this alternative a bit less attractive. :( – lapo Oct 29 '17 at 19:35
  • 1
    stat can be misleading if the file is sparse. You could use the blocks reported by stat to calculate space used. – Ajith Antony Feb 13 '20 at 20:06
  • @AjithAntony That's an interesting point which did not occur to me. I can see `stat` being the _right_ thing in some situations, and sparse files are not relevant in most situations, though certainly not all. – Daniel C. Sobral Feb 17 '20 at 18:33
  • du -b will do the bytes. – januarvs Apr 21 '21 at 00:34
20

An alternative solution with AWK and double parentheses:

FILENAME=file.txt
SIZE=$(du -sb "$FILENAME" | awk '{ print $1 }')

if ((SIZE<90000)) ; then
    echo "less";
else
    echo "not less";
fi
Peter Mortensen
fstab
  • 1
    Nice, but won't work on OSX, where `du` doesn't support `-b`. (It may be a conscious style choice, but just to mention the alternative: you can omit the `$` prefix inside `(( ... ))` when referencing variables: `((SIZE<90000))`) – mklement0 May 14 '14 at 22:23
  • 1
    Actually it was an edit from a previous user who thought it was wrong to omit the `$` – fstab May 15 '14 at 08:37
  • 2
    @fstab, you may omit `awk` by using `read` (`bash` internal command): `read SIZE _ <<<$(du -sb "$FILENAME")` – Jdamian Nov 13 '14 at 09:01
16

If your find handles this syntax, you can use it:

find -maxdepth 1 -name "file.txt" -size -90k

This will output file.txt to stdout if and only if the size of file.txt is less than 90k. To execute a script named script if file.txt is smaller than 90k:

find -maxdepth 1 -name "file.txt" -size -90k -exec script \;
gniourf_gniourf
  • 4
    +1, but to also make it work on OSX, you need an explicit target directory argument, e.g.: `find . -maxdepth 1 -name "file.txt" -size -90k` – mklement0 May 14 '14 at 22:17
10

If you are looking for just the size of a file:

cat $file | wc -c

Sample output:

203233

Peter Mortensen
BananaNeil
  • 2
    This might be the shortest workable answer, but it is probably also the slowest. :) – SunSparc Aug 25 '14 at 22:41
  • 4
    Yes, but certainly economically superior: Cost of engineering time > Cost of computation time – BananaNeil Sep 07 '14 at 20:48
  • 11
    `wc -c "$file"` was given as an answer in 2011 (three years ago). Yes, `wc -c "$file"` has the problem that it outputs the file name as well as the character count, so the early answers added a command to separate out the count. But `wc -c < "$file"`, which fixes that problem, was added as a comment in May 2014. Your answer is equivalent to that, except it adds a [“useless use of `cat`”](http://superuser.com/q/323060/354511). Also, you should quote all shell variable references unless you have a good reason not to. – G-Man Says 'Reinstate Monica' Nov 12 '14 at 21:22
  • 2
    You can make this more efficient by using `head -c` instead of `cat`: `if [ $(head -c 90000 $file | wc -c) -lt 90000 ] ; then echo "File is smaller than 90k" ; fi`. Tested on CentOS, so it may or may not work on BSD or OSX. – Kevin Keane Feb 06 '17 at 10:59
6

This works in both Linux and macOS:

filesize() {
    local file=$1 size

    # GNU stat (Linux)
    if size=$(stat -c%s "$file" 2>/dev/null); then
        echo "$size"
        return 0
    fi

    # BSD stat (macOS)
    if size=$(stat -f%z "$file" 2>/dev/null); then
        echo "$size"
        return 0
    fi

    # Neither variant worked (for example, the file does not exist)
    return 1
}
Peter Mortensen
Goblinhack
  • 1
    [The default shell in macOS is now](https://en.wikipedia.org/wiki/MacOS_Catalina#Removed_or_changed_components) [Z shell](https://en.wikipedia.org/wiki/Z_shell) (from 2019). – Peter Mortensen Jan 20 '22 at 18:30
5

Use:

python -c 'import os; print (os.path.getsize("... filename ..."))'

It is portable, for all flavours of Python, and it avoids variation in stat dialects.

Peter Mortensen
user6336835
4

For getting the file size in both Linux and Mac OS X (and presumably other BSD systems), there are not many options, and most of the ones suggested here will only work on one system.

Given f=/path/to/your/file, the following work in Bash on both Linux and Mac:

size=$( perl -e 'print -s shift' "$f" )

or

size=$( wc -c "$f" | awk '{print $1}' )

The other answers work fine in Linux, but not in Mac:

  • du doesn't have a -b option in Mac, and the BLOCKSIZE=1 trick doesn't work ("minimum blocksize is 512", which leads to a wrong result)

  • cut -d' ' -f1 doesn't work because on Mac, the number may be right-aligned, padded with spaces in front.

So if you need something flexible, it's either Perl's -s operator, or wc -c piped to awk '{print $1}' (awk will ignore the leading white space).

And of course, regarding the rest of your original question, use the -lt (or -gt) operator:

if [ "$size" -lt "$your_wanted_size" ]; then, etc.

Peter Mortensen
mivk
  • 3
    +1; if you know you'll only be using the size in an _arithmetic_ context (where leading whitespace is ignored), you can simplify to `size=$(wc -c < "$f")` (note the `<`, which causes `wc` to only report a number). Re comparison: don't forget the more "bash-ful" `if (( size < your_wanted_size )); then ...` (and also `[[ $size -lt $your_wanted_size ]]`). – mklement0 May 14 '14 at 22:13
3

Based on gniourf_gniourf’s answer,

find "file.txt" -size -90k

will write file.txt to stdout if and only if the size of file.txt is less than 90K, and

find "file.txt" -size -90k -exec command \;

will execute the command command if file.txt has a size less than 90K.  I have tested this on Linux.  From find(1),

…  Command-line arguments following (the -H, -L and -P options) are taken to be names of files or directories to be examined, up to the first argument that begins with ‘-’, …

(emphasis added).
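
One detail worth knowing: with the k suffix, GNU find rounds sizes up to whole 1 KiB units before comparing, so -size -90k effectively matches files of at most 89 KiB. For an exact byte threshold, the c suffix can be used instead. A small sketch (the helper name smaller_than is an assumption for illustration, not part of find):

```shell
#!/bin/sh
# smaller_than FILE BYTES
# Hypothetical helper: succeeds when FILE is a regular file shorter than
# BYTES. The c suffix makes find compare exact byte counts, and -prune
# keeps find from descending if FILE happens to be a directory.
smaller_than() {
    [ -n "$(find "$1" -prune -type f -size "-${2}c" 2>/dev/null)" ]
}

# Example: 92160 bytes is 90 KiB.
# if smaller_than file.txt 92160; then echo "file.txt is too small"; fi
```

Unlike the wc-based checks, this reports a missing file as "not smaller", so a separate existence test may be needed.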

2
ls -l "$file" | awk '{print $5}'

assuming that the ls command reports the file size in column 5, as the standard long format does

yeugeniuss
2

I would use du's --threshold for this. Not sure if this option is available in all versions of du but it is implemented in GNU's version.

Quoting from du(1)'s manual:

-t, --threshold=SIZE
       exclude entries smaller than SIZE if positive, or entries greater
       than SIZE if negative

Here's my solution, using du --threshold= for OP's use case:

THRESHOLD=90k
if [[ -z "$(du --threshold=${THRESHOLD} file.txt)" ]]; then
    mail -s "file.txt size is below ${THRESHOLD}, please fix. " myemail@gmail.com < /dev/null
    mv -f /root/tmp/file.txt /var/www/file.txt
fi

The advantage is that du accepts the argument to that option in a well-defined, human-readable format such as 10K or 10MiB, so you don't need to convert between units manually; du handles that.

For reference, here's the explanation on this SIZE argument from the man page:

The SIZE argument is an integer and optional unit (example: 10K is 
10*1024). Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers
of 1000). Binary prefixes can be used, too: KiB=K, MiB=M, and so on.
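
Note that du reports allocated disk space by default rather than file length, so a very short file can still count as a full filesystem block. With GNU du, adding --apparent-size makes the threshold apply to the byte length instead. A sketch under that assumption (GNU coreutils 8.21 or later; the function name below_threshold is hypothetical):

```shell
#!/bin/sh
# below_threshold FILE SIZE
# Hypothetical helper: succeeds when FILE's apparent size (its length in
# bytes) is below SIZE. du prints nothing for entries under the threshold,
# so empty output means "too small". A missing file also yields empty
# output and is therefore reported as below the threshold.
below_threshold() {
    [ -z "$(du --apparent-size --threshold="$2" "$1" 2>/dev/null)" ]
}

# Example: if below_threshold file.txt 90K; then echo "too small"; fi
```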
Doron Behar
  • +1 Excellent option. Unfortunately some of us are stuck with older versions of `du` that don't support it. The `--threshold` option was added in coreutils 8.21, [released in 2013](https://savannah.gnu.org/forum/forum.php?forum_id=7505). – Amit Naidu Jul 26 '19 at 20:01
2

Okay, if you're on a Mac, do this: stat -f %z "/Users/Example/config.log" That's it!

GarfExiXD