
I have a Python program in which I want to upload a large .ndjson file. Is there a way to add a progress bar so that I know how much of the file has been uploaded?

My code:

import json
import pandas as pd

# Lazily parse each line of the NDJSON file into a dict
records = map(json.loads, open('dump.ndjson'))
df = pd.DataFrame.from_records(records)

The above code uploads the file. It works: when I split the file into 100 pieces, I can upload them one by one. I want to know whether there is a way to add a progress bar so that I can upload the whole file at once and see the upload progress.

PS: I'm not looking for a GUI. I have tried the tqdm progress bar, and I was thinking of something like that, so that I can see the progress in my console.

Is there any way to achieve what I want?

Dinux
taga
  • What kind of progress bar? CLI? GUI? – Orsiris de Jong Aug 21 '20 at 09:26
  • Not a GUI. I have heard of the tqdm progress bar; I was thinking of something like that. – taga Aug 21 '20 at 09:35
  • `import tqdm` followed by `df = map(json.loads, tqdm.tqdm(open('dump.ndjson')))`, should work. – Ashwin Geet D'Sa Aug 21 '20 at 13:43
  • Sorry, it may not work. But you can give a try. – Ashwin Geet D'Sa Aug 21 '20 at 13:45
  • @AshwinGeetD'Sa it works, but it shows me the time, like this: (300748it [00:20, 7613.19it/s] ... 301695it [00:20, 7671.48it/s] ... 302592it [00:20, 7848.06it/s]). Is it possible to show a percentage? – taga Aug 23 '20 at 15:09
  • It's possible, but it gets a little tricky, as tqdm would need to know the size of the file in advance to do that. Is it possible to know the file size or the number of files in advance in your case? – Ashwin Geet D'Sa Aug 23 '20 at 19:54
  • Yeah, the size is 80 GB, and I do not know the number of files. – taga Aug 23 '20 at 21:51
  • Aren't you using `read()`? It seems like you are passing the file object instead of the file contents to `json.loads()`. – Ashwin Geet D'Sa Aug 25 '20 at 09:44
  • `df = map(json.loads, tqdm.tqdm(open('dump.ndjson'),units='MB'))`, this shows the amount of content read in MB. But based on the code posted in the question, it should show 0 MB, because you are only opening the file, not reading it. For a percentage, you need to know the size of the file in advance. – Ashwin Geet D'Sa Aug 25 '20 at 09:48
  • Please let me know if it can go as a partial answer. – Ashwin Geet D'Sa Aug 25 '20 at 10:00
  • It gives me an error: tqdm.std.TqdmKeyError: "Unknown argument(s): {'units': 'MB'}", and the size of my file is 75 GB. – taga Aug 31 '20 at 09:55
  • I have posted my answer here: [https://stackoverflow.com/a/73463922/17915481](https://stackoverflow.com/a/73463922/17915481) – jak bin Aug 23 '22 at 19:06
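
The point raised in the comments, that tqdm needs a known total before it can show a percentage, can be sketched as follows. This is a minimal sketch, not the accepted solution: it assumes the file size in bytes (from `os.path.getsize`) is an acceptable total, and `iter_records` is a hypothetical helper name that does not appear in the original code. Note that the tqdm keyword is `unit` (singular), which is why `units='MB'` raised `TqdmKeyError` above.

```python
import json
import os

from tqdm import tqdm


def iter_records(path):
    """Yield one parsed JSON object per line while advancing a byte-based bar.

    Hypothetical helper: the name and structure are illustrative only.
    """
    total_bytes = os.path.getsize(path)
    # Binary mode makes len(line) equal to the number of bytes consumed,
    # so the bar reaches exactly 100% at the end of the file.
    with open(path, "rb") as f, tqdm(
        total=total_bytes, unit="B", unit_scale=True, unit_divisor=1024
    ) as bar:
        for line in f:
            bar.update(len(line))
            if line.strip():  # skip blank lines
                yield json.loads(line)
```

Because the helper is a generator, it should be usable in place of the `map(...)` call from the question, for example `pd.DataFrame.from_records(iter_records('dump.ndjson'))`, with the percentage printed to the console as lines are consumed.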

1 Answer


If the question is about a GUI, there is PySimpleGUI, which makes it easy to use progress bars built on the Tk, Wx or Qt frameworks. It works on Linux and Windows.

Example from their cookbook:

import PySimpleGUI as sg

# layout the window
layout = [[sg.Text('Uploading...')],
          [sg.ProgressBar(100, orientation='h', size=(20, 20), key='progressbar')],
          [sg.Cancel()]]

# create the window
window = sg.Window('My Program Upload', layout)
progress_bar = window['progressbar']
# loop that would normally do something useful
# (100 iterations to match the maximum value given to sg.ProgressBar above)
for i in range(100):
    # check to see if the cancel button was clicked and exit loop if clicked
    event, values = window.read(timeout=10)
    if event == 'Cancel' or event == sg.WIN_CLOSED:
        break
    # update bar with loop value +1 so that bar eventually reaches the maximum
    progress_bar.UpdateBar(i + 1)
    # TODO: Insert your upload code here
# done with loop... need to destroy the window as it's still open
window.close()

I use that kind of progress bar and keep my upload function in a separate thread so that the UI is non-blocking.
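
The separate-thread arrangement could look roughly like this. It is a GUI-agnostic sketch under stated assumptions, not the code actually used: `upload_worker` and `run_with_progress` are hypothetical names, the `time.sleep` call stands in for real upload work, and the polling loop plays the role of the `window.read(timeout=10)` loop above.

```python
import queue
import threading
import time


def upload_worker(n_chunks, progress_queue):
    """Hypothetical stand-in for the real upload; reports each finished chunk."""
    for i in range(n_chunks):
        time.sleep(0.001)  # placeholder for uploading one chunk
        progress_queue.put(i + 1)
    progress_queue.put(None)  # sentinel: the upload is finished


def run_with_progress(n_chunks, on_progress):
    """Run the worker in a thread and forward its progress to on_progress."""
    progress_queue = queue.Queue()
    worker = threading.Thread(
        target=upload_worker, args=(n_chunks, progress_queue), daemon=True
    )
    worker.start()
    while True:
        try:
            # A short timeout keeps this loop responsive between updates.
            done = progress_queue.get(timeout=0.01)
        except queue.Empty:
            continue  # a GUI loop would handle window events here
        if done is None:
            break
        on_progress(done)
    worker.join()
```

In the example above, `on_progress` could plausibly be `progress_bar.UpdateBar`, so that only the main thread touches the window while the worker thread does the upload.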

Orsiris de Jong
  • I'm not thinking of a GUI. I have heard of the tqdm progress bar; I was thinking of something like that, so that I can see the progress in my console. – taga Aug 21 '20 at 09:41