I have a Python script and I want to get the size of a folder (as fast as possible) like this:

 os.system("du -k folder |sort -nr| head -n 1 > size")

Although it seems that works, I get this error

sort: write failed: standard output: Broken pipe
sort: write error

How can I fix it?

Ren91
  • You say *"it seems that works"*, does that mean that you are getting the size back but also getting the error? – SuperBiasedMan Aug 20 '15 at 11:40
  • Does the same thing happen when you run `du -k folder |sort -nr| head -n 1 > size` in a terminal? It seems like this shouldn't be a Python issue, so if you know for a fact that it is, you should add that to your question. – nkorth Aug 20 '15 at 11:41
  • Yes, I get the size of the folder and also get the error. If I run it in a terminal I don't get the error. – Ren91 Aug 20 '15 at 11:43
  • I think [http://superuser.com/questions/554855/how-can-i-fix-a-broken-pipe-error](http://superuser.com/questions/554855/how-can-i-fix-a-broken-pipe-error) can solve your question. `head` closes the pipe early, before `sort` has finished writing. – letiantian Aug 20 '15 at 11:43
  • why are you using such a complicated call to get the size of the folder? – Padraic Cunningham Aug 20 '15 at 11:45
  • Also, see [this answer](http://stackoverflow.com/a/1392549/1084416): `sum(os.path.getsize(f) for f in os.listdir('.') if os.path.isfile(f))` – Peter Wood Aug 20 '15 at 11:51
  • `du -s folder > size` will do the same thing; if you want human-readable output, use `-sh` – Padraic Cunningham Aug 20 '15 at 11:52
  • @PeterWood the folder size is around 10 GBs and it is increasing each day.. – Ren91 Aug 20 '15 at 11:55
  • Now using `os.system("du -sk folder > size")` – Ren91 Aug 20 '15 at 12:21
  • Note that `subprocess.call` is recommended over `os.system`; in your case `subprocess.check_output` with `shell=True`. – mdurant Aug 20 '15 at 12:49

3 Answers

Python sets the SIGPIPE signal to be ignored at startup, and commands started through os.system inherit that setting. Therefore, when sort writes to the pipe after head has already finished and closed its stdin, the write fails with EPIPE and sort prints the "broken pipe" message instead of being terminated silently by SIGPIPE. A workaround is to reset SIGPIPE to its default action (see also here).

# test case
python -c 'import os; os.system("LANG=C date | false")'  # date: stdout: Broken pipe
python -c 'import os, signal; signal.signal(signal.SIGPIPE, signal.SIG_DFL); os.system("LANG=C date | false")'
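
On Python 3, the `subprocess` module offers another route to the same effect. Its `restore_signals` parameter (documented as true by default, POSIX only) resets the signals that Python ignores at startup, SIGPIPE among them, in the child before the command runs. The following is a minimal sketch under that assumption; the helper name `run_pipeline` is made up for illustration and is not part of the answer above.

```python
import subprocess


def run_pipeline(command):
    """Run a shell pipeline with default SIGPIPE handling in the child.

    `run_pipeline` is a hypothetical helper used only for this sketch.
    """
    result = subprocess.run(
        command,
        shell=True,               # a shell is needed to set up the pipes
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str instead of bytes
        restore_signals=True,     # reset SIGPIPE to SIG_DFL in the child (POSIX)
    )
    return result.stdout, result.stderr
```

Calling `run_pipeline("du -k folder | sort -nr | head -n 1")` should then return the first line of the sorted output, with sort ending quietly once head has exited.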
carlo

I could reproduce this by running du on a large directory. It happens because, for a large directory, the sort process does many writes to its standard output, and head closes its standard input (and exits) as soon as it has read one line.

Do you need to fix it? IMHO this is the way the commands head and sort work, and it does not mean that your value will be incorrect. So I would just redirect the standard error of sort to /dev/null to get rid of the message:

 os.system(" du -k folder |sort -nr 2>/dev/null| head -n 1")

But as already said by Padraic Cunningham, I think that this command is really complicated just to find the total size of a directory.
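
For comparison, the total can also be computed without any pipe, which avoids the broken-pipe message entirely. This is only a sketch: `folder_size_bytes` is a hypothetical name, it sums apparent file sizes (which can differ from the disk usage that du reports), and it makes no claim to be faster than du on a large tree.

```python
import os


def folder_size_bytes(root):
    """Sum the sizes of all files below `root`, in bytes.

    Hypothetical helper for illustration. Symbolic links are not followed;
    a link is counted by its own size.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return total
```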

Serge Ballesta

As @mdurant said in a comment: the subprocess module is preferred over os.system(). You can get the output of the external program directly into your program without an intermediate file, and no extra shell process is started between your program and the external program.

Example in an IPython session:

In [5]: subprocess.check_output(['du', '-sk', 'tmp'])
Out[5]: '101160\ttmp\n'

In [6]: subprocess.check_output(['du', '-sk', 'tmp']).split('\t', 1)
Out[6]: ['101160', 'tmp\n']

In [7]: subprocess.check_output(['du', '-sk', 'tmp']).split('\t', 1)[0]
Out[7]: '101160'

In [8]: int(subprocess.check_output(['du', '-sk', 'tmp']).split('\t', 1)[0])
Out[8]: 101160
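
The session above looks like Python 2, where `check_output` returns `str`. On Python 3 it returns `bytes` unless text mode is requested, so splitting on `'\t'` would raise a `TypeError`. A sketch of the same steps for Python 3 follows; the function names `parse_du_output` and `du_kilobytes` are assumptions made for illustration.

```python
import subprocess


def parse_du_output(output):
    """Return the leading size field of a `du -sk` output line as an int."""
    size_field = output.split("\t", 1)[0]
    return int(size_field)


def du_kilobytes(path):
    """Run `du -sk path` without a shell and return the size du reports."""
    output = subprocess.check_output(
        ["du", "-sk", path],
        universal_newlines=True,  # return str rather than bytes
    )
    return parse_du_output(output)
```

With the output shown in the session, `parse_du_output('101160\ttmp\n')` gives `101160`.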
BlackJack