0

I need to create some large files and show the write progress in python. Currently I am using this code to create the file. But I can not show the progress. Python Write function returns the number actually written at the end of write operation. But I need to know how much byte is written in every second.

oneGB = 1024*1024*1024 # 1GB
with open('large_file', 'wb') as fout:
    bytes_number = fout.write(os.urandom(oneGB))
    print(bytes_number)

I know that I can get the expected result using dd comand with progress in linux, unfortunetly the system I am working doesn't support progress as status flag for dd command. I get this when try to run dd command.

dd: invalid status flag: `progress'

Here is my dd command:

dd if=/dev/zero of=temp_file status=progress count=1M bs=5120
Masudul Hasan
  • 137
  • 2
  • 17
  • I think [this](https://stackoverflow.com/questions/3160699/python-progress-bar) is what you are looking for – Mntfr Mar 17 '19 at 04:19
  • Thanks. This is helpful. But I actually need the exact byte size written every second. Thats the main important information for me. – Masudul Hasan Mar 17 '19 at 04:23
  • Can't you use a constructor? – Mntfr Mar 17 '19 at 04:30
  • @Mntfr, Sorry, I didn't get it. can you please explain? or give me an example? – Masudul Hasan Mar 17 '19 at 04:33
  • Are you trying to replace the `dd` command with your Python script, just because you want to see progress? Or do you have to use Python? Because [this question](https://askubuntu.com/q/215505/643673) covers seeing `dd` progress in Linux itself. – Niayesh Isky Mar 17 '19 at 05:24
  • @NiayeshIsky, I main goal is to see write speed for every second. I know that I can use dd command with progress parameter but It need GNU version >8.24 which is not available in the system I am working. Thats why I looking for other ways. Also I can use either java/python – Masudul Hasan Mar 17 '19 at 05:42
  • Yes, but did you look at `pv` mentioned in [this answer](https://askubuntu.com/a/215590/643673)? In my opinion, your problem is best solved using a *nix solution, not a Python script. – Niayesh Isky Mar 17 '19 at 05:46
  • Actually I need a script beacuse I have repeat this process for a large number of time. and also I can not pv. The system doesn't have pv installed and I don't have the permission to install any module – Masudul Hasan Mar 17 '19 at 05:50

1 Answers1

0

The best tool for this job, of course, is pv. If you don't have the permissions to install it globally, you may still be able to build and install it from the source (without sudo) in your user directory only, because it is installed using autoconf/automake, so you just have to run ./configure with --prefix=$HOME/bin before making & installing.

However, if you really want to write a Python script, there are two parts to consider: the progress bar itself and the data for the progress bar.

For the progress bar itself, @Mntfr has already mentioned a question that covers this.

For the progress bar information, you will need some way of keeping track of how much data has been written so far. So, either you need to write smaller amounts at a time (for example, only write 5MB before updating the progress bar), or you will need to asynchronously keep track of the size of your output file, and update your progress bar accordingly. The first method may be slower overall (since it involves starting and stopping writing), but the second method is probably harder (since it requires asynchronicity in your script, which is not very fun to implement in Python unless you just write two scripts and run them side-by-side). These two ideas should give you a start on how to implement what you're looking for.

Niayesh Isky
  • 1,100
  • 11
  • 17
  • Lets say, I write 5MB at a time, and takes 2s to complete. how can get how much was written on first second, and how much written on second second? – Masudul Hasan Mar 17 '19 at 07:17
  • You can't (as far as I know). The only method that can actually tell in real time what is written at any smaller intervals is by checking the file itself. However, you can time how long each 5MB takes, and update your time estimate based on the average time per 5MB. After all, I'm guessing you're writing this for humans, and humans don't care about very small granularity of time, as long as the progress estimate is mostly accurate. – Niayesh Isky Mar 17 '19 at 07:41