I have several very large CSV files that I am processing in parallel, and I'd like a progress bar for each file.

However, although five bars are displayed, only the last one is updated, seemingly by all processes at once. Since I can't read a whole CSV file into memory, I use the file size to track progress.

inputArg is the folder path ending with a number.

    import csv
    import glob
    import multiprocessing
    import os

    from tqdm import tqdm

    def worker(inputArg):
        path = inputArg + '/data.csv'
        size = os.path.getsize(path)
        text = "progresser #{}".format(inputArg[-1])
        # One bar per process; position selects the terminal line for this bar.
        pb = tqdm(total=size, unit="B", unit_scale=True, desc=text, position=int(inputArg[-1]))
        with open(path, newline='') as csvfile:
            reader = csv.reader(csvfile, delimiter=',')
            for row in reader:
                # Approximate bytes consumed: field lengths plus separators and the newline.
                pb.update(sum(len(field) for field in row) + len(row))
                session.execute(*INSERT QUERY*)  # placeholder; session is defined elsewhere (not shown)
        pb.close()

    def scheduler(inputData):
        p = multiprocessing.Pool(multiprocessing.cpu_count() + 1)
        p.map(worker, inputData)
        p.close()
        p.join()

    if __name__ == '__main__':
        folders = glob.glob('FILEPATH/*')
        print('--------------------Insert started---------------')
        scheduler(folders)
        print('---------------------All Done---------------------')

Any hint would be appreciated!

EDIT: I did check the other answer, but I explicitly said I want multiple progress bars, and that answer only gives you ONE. Hence, this is not a duplicate.

EDIT2: @bouteillebleu, here's what it looks like: I do get my bars, but only the last one is updated for some reason. (Screenshot: current progress bars)

Illuminae
  • Possible duplicate of [How to use tqdm through multi process in python?](https://stackoverflow.com/questions/43064054/how-to-use-tqdm-through-multi-process-in-python) – Sraw Sep 28 '17 at 09:10
  • Is https://stackoverflow.com/questions/45742888/tqdm-using-multiple-bars any help at all? It looks like you can choose the position each bar is displayed at, which would make it possible to see different results for each of the processed CSVs. – bouteillebleu Sep 28 '17 at 09:20
  • @bouteillebleu Thanks for the comment! I added a picture - as I'm using the position parameter already, I do get the different bars. Just the updating seems glitchy? – Illuminae Sep 28 '17 at 09:26

1 Answer

Try using the latest version of tqdm (v4.18.0 or later; see https://github.com/tqdm/tqdm/releases).
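
With a recent tqdm, the pattern its documentation shows for multiple bars under multiprocessing is to create one lock in the parent and pass it to every worker, so that the bars do not write over each other. The following is a minimal sketch of that pattern, not the asker's code: `worker`, `run_all`, and the `(position, total)` job tuples are hypothetical names, and the inner loop stands in for real work such as inserting CSV rows.

```python
from multiprocessing import Pool, RLock

from tqdm import tqdm


def worker(job):
    """Advance one bar; job is a hypothetical (position, total) pair."""
    position, total = job
    with tqdm(total=total, desc="bar #{}".format(position), position=position) as pb:
        for _ in range(total):
            pb.update(1)  # stand-in for processing one unit of work
        return pb.n


def run_all(jobs):
    # Create one lock in the parent and hand it to every worker process,
    # so that terminal writes from different bars do not interleave.
    tqdm.set_lock(RLock())
    with Pool(processes=len(jobs), initializer=tqdm.set_lock,
              initargs=(tqdm.get_lock(),)) as pool:
        return pool.map(worker, jobs)


if __name__ == "__main__":
    run_all([(0, 1000), (1, 2000), (2, 3000)])
```

Each job needs a distinct `position`, and whether the bars render on separate lines still depends on the terminal supporting cursor movement.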

casper.dcl
  • I am using tqdm v4.26, but I still face the issue described in the question (only the last progress bar updating, seemingly by all processes). I am surprised this answer has been marked as correct. – elexhobby Oct 29 '18 at 04:33
  • Odd. @elexhobby, it may be that your environment does not support nested bars; see https://github.com/tqdm/tqdm#faq-and-known-issues – casper.dcl Oct 30 '18 at 12:22
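
Separately from the display issue: because the bar's total in the question is the file size in bytes, each update should advance by the bytes consumed rather than by the number of fields in a row. A minimal stdlib-only sketch of one way to do that follows. The helper name `rows_with_byte_counts` and the UTF-8 default are assumptions, not part of the original code.

```python
import csv


def rows_with_byte_counts(path, encoding="utf-8"):
    """Yield (row, nbytes): a parsed CSV row and the bytes read from disk for it."""
    consumed = 0  # total bytes pulled from the file so far

    def decoded_lines(raw_file):
        nonlocal consumed
        for raw in raw_file:
            consumed += len(raw)  # count raw bytes before decoding
            yield raw.decode(encoding)

    with open(path, "rb") as raw_file:
        reported = 0
        # csv.reader accepts any iterable of strings, so it can pull
        # lines from the generator one at a time without reading ahead.
        for row in csv.reader(decoded_lines(raw_file)):
            yield row, consumed - reported
            reported = consumed
```

In the worker this would be used as `for row, nbytes in rows_with_byte_counts(path): pb.update(nbytes)`, so the bar reaches its total exactly when the file is exhausted, including rows whose quoted fields span several lines.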