Currently, typical hard drives can write on the order of 100 MB per second.
Your program is writing 6 GB in 6 minutes, which means the overall throughput is ~ 17 MB/s.
So your program is not pushing data to disk anywhere near the maximum rate (assuming you have a typical hard drive). The bottleneck, then, might actually be the CPU.
If this "back-of-the-envelope" calculation is correct, and if you have a machine with multiple processors, using multiple processes could help you generate more data quicker, which could then be sent to a single process which writes the data to disk.
Note that if you are using CPython, the most common implementation of Python, then the GIL (global interpreter lock) prevents multiple threads from executing Python bytecode at the same time. So to do concurrent calculations, you need to use multiple processes rather than multiple threads. The multiprocessing or concurrent.futures modules can help you here.
Note that even if your hard drive can write 100 MB/s, it would still take ~ 167 minutes to write 1 TB to disk. And once your processes collectively generate data faster than 100 MB/s, the disk becomes the bottleneck, so adding further processes would not lead to any speed gain.
Of course, your hardware may be much faster or much slower than this, so it pays to know your hardware specs.
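The arithmetic above is easy to check in a few lines (the 100 MB/s figure is an assumed disk speed, not a measurement):

```python
disk_rate = 100 * 10**6           # assumed disk throughput: 100 MB/s
observed = 6 * 10**9 / (6 * 60)   # your program: 6 GB in 6 minutes
print(observed / 10**6)           # ~16.7 MB/s, well under the disk's limit

total = 10**12                    # 1 TB
print(total / disk_rate / 60)     # ~166.7 minutes to write 1 TB at disk speed
```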
You can estimate how fast you can write to disk using Python by doing a simple experiment:
    with open('/tmp/test', 'wb') as f:
        x = b'A' * 10**8    # 100 MB; must be bytes, since the file is opened in binary mode
        f.write(x)
    % time python script.py

    real    0m0.511s
    user    0m0.020s
    sys     0m0.020s
    % ls -l /tmp/test
    -rw-rw-r-- 1 unutbu unutbu 100000000 2014-09-12 17:13 /tmp/test
This shows 100 MB were written in 0.511s. So the effective throughput was ~195 MB/s.
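If you'd rather not reach for the shell's time command, the same experiment can be timed from within Python. (The os.fsync call is my addition: without it, the OS may only have buffered the data in its page cache, making the measured throughput look faster than the disk really is.)

```python
import os
import time

x = b'A' * 10**8                       # 100 MB of data
start = time.perf_counter()
with open('/tmp/test', 'wb') as f:
    f.write(x)
    f.flush()
    os.fsync(f.fileno())               # force the data out of the OS cache to disk
elapsed = time.perf_counter() - start
print('%.1f MB/s' % (len(x) / elapsed / 10**6))
```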
Note that if you instead call f.write in a loop:
    with open('/tmp/test', 'wb') as f:
        for i in range(10**7):
            f.write(b'A')    # again a bytes literal, since the file is in binary mode
then the effective throughput drops dramatically, to just ~ 3 MB/s. So how you structure your program -- even when using just a single process -- can make a big difference. This is an example of how collecting your data into fewer but bigger writes can improve performance.
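One common fix, if the data really is produced one small piece at a time, is to accumulate the pieces and hand them to the OS in bulk, e.g. with bytes.join (the 1 MB batch size is a hypothetical tuning knob):

```python
BATCH = 10**6  # flush to the file roughly every 1 MB

with open('/tmp/test', 'wb') as f:
    pieces = []
    for i in range(10**7):
        pieces.append(b'A')            # data still arrives one byte at a time...
        if len(pieces) == BATCH:
            f.write(b''.join(pieces))  # ...but reaches the OS in 1 MB writes
            pieces = []
    if pieces:
        f.write(b''.join(pieces))      # write any leftover tail
```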
As Max Noel and kipodi have already pointed out, you can also try writing to /dev/null:
    with open(os.devnull, 'wb') as f:
and timing a shortened version of your current script. This will show you how much time is being consumed (mainly) by CPU computation. It's this portion of the overall run time that may be improved by using concurrent processes. If it is large, then there is hope that multiprocessing may improve performance.
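A minimal sketch of that experiment (generate_data is a hypothetical stand-in for your real computation):

```python
import os
import time

def generate_data(i):
    # Stand-in for your actual, CPU-heavy data generation.
    return str(i * i).encode()

start = time.perf_counter()
with open(os.devnull, 'wb') as f:       # writes to /dev/null cost almost nothing
    for i in range(10**6):
        f.write(generate_data(i))
cpu_time = time.perf_counter() - start
print('compute portion: %.2f s' % cpu_time)
```

Comparing cpu_time to your script's total run time tells you what fraction is computation rather than I/O.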