I am trying to run a function concurrently over multiple files from a tkinter GUI, using concurrent.futures.

Outside of the GUI, the script works fine. However, once I move it into the GUI script, instead of running the function in parallel, it opens 5 new tkinter windows (the number of windows equals the number of processors I allow the program to use).

I've looked over the code thoroughly and just can't understand why it is opening new windows instead of running the function over the files.

Can anyone see something I am missing?

An abridged version of the code is below. I have cut out a significant part of it and left only the parts pertinent to parallelization, so it undoubtedly uses variables that are not defined in this example.

import pandas as pd 
import numpy as np
import glob
from pathlib import Path
from tkinter import *
from tkinter import filedialog
from concurrent.futures import ProcessPoolExecutor

window = Tk()
window.title('Problem with parallelizing')
window.geometry('1000x700')


def calculate():
    #establish where the files are coming from to operate on
    folder_input = folder_entry_var.get()
    #establish the number of processors to use
    nbproc = int(np_var.get())

    #loop over the folder to build the list of files for concurrent.futures to work on
    files = []
    for file in glob.glob(rf'{folder_input}'+'//*'):
        files.append(file)

    #this function gets passed to concurrent.futures. I have taken out a significant portion of
    #the function body, as I do not believe the problem resides in the function itself.
    def process_file(filepath):

        excel_input = excel_entry_var.get()
        minxv = float(min_x_var.get())
        maxxv = float(man_x_var.get())
        output_dir = odir_var.get()


        path = filepath

        event_name = Path(path).stem
        event['event_name'] = event_name




        min_x = 292400
        max_x = 477400

        list_of_objects = list(event.object.unique())

        missing_master_surface = []

        for line in list_of_objects:


            df = event.loc[event.object == line]

            current_y = df.y.max()
            y_cl = df.x.values.tolist()
            full_ys = np.arange(min_x,max_x+200,200).tolist()

            for i in full_ys:
                missing_values = []
                missing_v_y = []
                exist_yn = []
                event_name_list = []

                if i in y_cl:
                    continue
                elif i not in y_cl:
                    missing_values.append(i)
                    missing_v_y.append(current_y)
                    exist_yn.append(0)
                    event_name_list.append(event_name)


    # feed the function to ProcessPoolExecutor to run. At this point, I hear the processors
    # spin up, but all it does is open 5 new tkinter windows (the number of windows equals
    # the number of processors I give it to run)
    if __name__ == '__main__': 
        with ProcessPoolExecutor(max_workers=nbproc) as executor:
            executor.map(process_file, files)

window.mainloop()
Josiah Hulsey

1 Answer

I've looked over the code thoroughly and just can't understand why it is opening new windows instead of running the function over the files.

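Each worker process has to re-import your script, which re-runs all of its module-level code (this is what happens with the spawn start method, the default on Windows). At the very top of your script you call window = Tk(), so every worker creates its own window. That is why you get one window per process.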
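The re-import can be imitated without tkinter or a process pool. The sketch below assumes the spawn start method, under which CPython re-runs the main script in each worker under the module name `__mp_main__`. The script text, the `events` list and the `run_as` helper are hypothetical stand-ins, not part of the question's code.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical stand-in for the GUI script: the first line plays the role of
# the module-level `window = Tk()`, the second the role of guarded code.
SCRIPT = '''
events = ["module-level code ran"]
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(run_name):
    """Execute SCRIPT from a temporary file under the given module name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "gui_script.py"
        path.write_text(SCRIPT)
        # run_path returns the globals of the executed script
        return runpy.run_path(str(path), run_name=run_name)["events"]


as_main = run_as("__main__")       # how the script runs when you launch it
as_worker = run_as("__mp_main__")  # how a spawned worker re-runs it

print(as_main)    # ['module-level code ran', 'guarded code ran']
print(as_worker)  # ['module-level code ran']
```

Only the unguarded line runs in the imitated worker, which matches the symptom: anything at module level, including `Tk()`, happens once per process.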

Bryan Oakley
  • Thanks for pointing that out. Maybe I don't understand tkinter correctly, but how else would I use the tkinter functionality without calling the Tk class? I don't understand how to get around the problem. Are you saying that I need to run the parallelization part in a separate window? – Josiah Hulsey Dec 12 '19 at 19:17
  • @JosiahHulsey: You need to structure the script so that `Tk()` is called only once, in the main process (i.e., inside the `if __name__ == "__main__"` block). – Bryan Oakley Dec 12 '19 at 22:55
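A minimal sketch of one way to restructure the script along those lines, under some assumptions: the body of `process_file` is a placeholder for the per-file work that the question elides, `run_batch` and `main` are hypothetical names, and the worker is moved to module level and given plain values rather than tkinter variables (nested functions cannot be pickled for a process pool, and widgets do not exist in the workers).

```python
import glob
import os
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from pathlib import Path


def process_file(filepath, min_x, max_x):
    """Placeholder worker: module-level, takes only plain picklable values."""
    event_name = Path(filepath).stem
    expected_x = range(min_x, max_x + 200, 200)  # same grid as the question
    return event_name, len(expected_x)


def run_batch(files, min_x, max_x, nbproc):
    """Run process_file over files in nbproc worker processes."""
    worker = partial(process_file, min_x=min_x, max_x=max_x)
    with ProcessPoolExecutor(max_workers=nbproc) as executor:
        # list() collects the results and surfaces any worker exception
        return list(executor.map(worker, files))


def main():
    # tkinter is imported and Tk() is called only here, never at import
    # time, so a re-imported worker process never creates a window.
    import tkinter as tk

    window = tk.Tk()
    window.title("Parallel processing")
    folder_var = tk.StringVar(value=".")
    np_var = tk.StringVar(value="2")

    def calculate():
        # Read the widget values in the main process, pass plain values on.
        files = glob.glob(os.path.join(folder_var.get(), "*"))
        print(run_batch(files, 292400, 477400, int(np_var.get())))

    tk.Entry(window, textvariable=folder_var).pack()
    tk.Entry(window, textvariable=np_var).pack()
    tk.Button(window, text="Calculate", command=calculate).pack()
    window.mainloop()


# Entry point, left as a comment so importing this sketch opens no window:
# if __name__ == "__main__":
#     main()
```

With the last two lines uncommented, the window is created only when the file is launched as a script. Note that `run_batch` blocks the GUI until all files are done, which this sketch does not address.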