
I'm currently doing some work with multithreading and I'm trying to figure out why my program isn't working as intended.

def input_watcher():
  while True:
    input_file = os.path.abspath(raw_input('Input file name: '))
    compiler = raw_input('Choose compiler: ')

    if os.path.isfile(input_file):

        obj = FileObject(input_file, compiler)

        with file_lock:
            files.append(obj)

        print 'Adding %s with %s as compiler' % (obj.file_name, obj.compiler)
    else:
        print 'File does not exists'

This is running in one thread and it works fine until I start adding the second FileObject.

This is the output from the console:

Input file name: C:\Users\Victor\Dropbox\Private\multiFile\main.py
Choose compiler: aImport
Adding main.py with aImport as compiler
Input file name: main.py updated
C:\Users\Victor\Dropbox\Private\multiFile\main.py
Choose compiler: Input file name: Input file name: Input file name: Input file name:

The 'Input file name: ' prompt keeps popping up as soon as I add the second file name and it asks for a compiler. The program keeps printing 'Input file name: ' until it crashes.

I have other code running in a different thread. I don't think it has anything to do with the error, but tell me if you need to see it and I will post it.

The full code:

import multiprocessing
import threading
import os
import time


file_lock = threading.Lock()
update_interval = 0.1


class FileMethods(object):
    def a_import(self):
        self.mod_check()




class FileObject(FileMethods):
    def __init__(self, full_name, compiler):

        self.full_name = os.path.abspath(full_name)
        self.file_name = os.path.basename(self.full_name)
        self.path_name = os.path.dirname(self.full_name)

        name, exstention = os.path.splitext(full_name)
        self.concat_name = name + '-concat' + exstention

        self.compiler = compiler
        self.compiler_methods = {'aImport': self.a_import}

        self.last_updated = os.path.getatime(self.full_name)

        self.subfiles = []
        self.last_subfiles_mod = {}

    def exists(self):
        return os.path.isfile(self.full_name)

    def mod_check(self):
        if self.last_updated < os.path.getmtime(self.full_name):
            self.last_updated = os.path.getmtime(self.full_name)
            print '%s updated' % self.file_name
            return True
        else:
            return False

    def sub_mod_check(self):
        for s in self.subfiles:
            if self.last_subfiles_mod.get(s) < os.path.getmtime(s):
                self.last_subfiles_mod[s] = os.path.getmtime(s)
                return True

        return False


files = []


def input_watcher():
    while True:
        input_file = os.path.abspath(raw_input('Input file name: '))
        compiler = raw_input('Choose compiler: ')

        if os.path.isfile(input_file):

            obj = FileObject(input_file, compiler)

            with file_lock:
                files.append(obj)

            print 'Adding %s with %s as compiler' % (obj.file_name, obj.compiler)
        else:
            print 'File does not exists'


def file_manipulation():
    if __name__ == '__main__':
        for f in files:
            p = multiprocessing.Process(target=f.compiler_methods.get(f.compiler)())
            p.start()
            #f.compiler_methods.get(f.compiler)()

def file_watcher():
    while True:
        with file_lock:
            file_manipulation()
        time.sleep(update_interval)


iw = threading.Thread(target=input_watcher)
fw = threading.Thread(target=file_watcher)

iw.start()
fw.start()
VictorVH
  • 1
    The issue doesn't reproduce in a single-threaded environment for me. Does it happen for you with just one thread? I think we'll need to see how input_watcher is getting called, and how that relates to the other thread you have running. – dano Jul 26 '14 at 18:34

1 Answer

This is happening because the code that creates and starts your threads is not inside an if __name__ == "__main__": guard, while you are also using multiprocessing.Process on Windows. Windows has to re-import your module in every child process it spawns, so each child runs the top-level code again and starts its own threads to handle input and watch files. This, of course, is a recipe for disaster. Do this to fix the issue:

if __name__ == "__main__":
    iw = threading.Thread(target=input_watcher)
    fw = threading.Thread(target=file_watcher)

    iw.start()
    fw.start()

See the "Safe importing of the main module" section in the multiprocessing docs for more info.
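To make the re-import effect concrete, here is a minimal sketch that does not start any processes. It writes a tiny stand-in script to a temporary file and executes it twice with runpy.run_path: once under the name "__main__" (how the parent runs the file) and once under a different name (roughly what a re-import in a child amounts to). The script contents, the events list, the run_as helper, and the name "child_import" are all made up for illustration; the exact module name a child process uses is an implementation detail and is not assumed here.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for the script in the question: one unguarded
# top-level statement and one statement behind the guard.
SCRIPT = '''
events = []
events.append("top-level code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(run_name):
    """Execute SCRIPT under the given module name and return its events."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(SCRIPT)
        # run_path returns the globals of the executed file as a dict.
        return runpy.run_path(path, run_name=run_name)["events"]
    finally:
        os.remove(path)


as_script = run_as("__main__")      # how the parent process runs the file
as_import = run_as("child_import")  # illustrative stand-in for a re-import
```

Under any name other than "__main__" only the unguarded statement runs, which is why thread creation left at the top level gets repeated in every child.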

I also have a feeling file_watcher isn't really doing what you want it to (it will keep re-spawning processes for files you've already processed), but that's not really related to the original question.
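One way to avoid repeating work for files that have not changed is to remember the last modification time seen for each path and only act on the ones that differ. The following is a rough sketch under that assumption, not the asker's design: changed_files and the last_seen dictionary are hypothetical names, and the demo bumps the timestamp with os.utime instead of actually editing the file.

```python
import os
import tempfile


def changed_files(paths, last_seen):
    """Return the paths whose mtime differs from the one in last_seen.

    last_seen maps path -> mtime and is updated in place, so a file is
    reported once per modification rather than on every poll.
    """
    changed = []
    for path in paths:
        mtime = os.path.getmtime(path)
        if last_seen.get(path) != mtime:
            last_seen[path] = mtime
            changed.append(path)
    return changed


# Small demonstration with a temporary file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
seen = {}

first = changed_files([demo_path], seen)   # never seen before: reported
second = changed_files([demo_path], seen)  # unchanged: not reported

# Simulate a modification by moving the mtime forward ten seconds.
info = os.stat(demo_path)
os.utime(demo_path, (info.st_atime, info.st_mtime + 10))
third = changed_files([demo_path], seen)   # changed: reported again

os.remove(demo_path)
```

A polling loop could call a helper like this once per interval and hand only the returned paths to worker processes.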

dano
  • I just tried this and it worked. I just have a couple of questions. – VictorVH Jul 26 '14 at 18:55
  • Inside file_manipulation, where I spawn the processes, I already do the if __name__ == "__main__" check. Why do I have to do it with the threads? – VictorVH Jul 26 '14 at 18:56
  • Also, the point of file_watcher is to spawn new processes that each check whether the file has been updated and, if it has, do something. Is this the wrong way of doing it? – VictorVH Jul 26 '14 at 18:58
  • 1
    @VictorVH Because, when you spawn a new child process on Windows, it has to do this internally: `import your_file`. That will execute everything at the top-level of the module, including the code that creates/starts your two threads. Putting them in the `if __name__ == "__main__":` guard prevents that from happening, because the stuff inside that guard is no longer at the top-level of the module. – dano Jul 26 '14 at 19:00
  • Does this mean that I don't have to do the __name__ == "__main__" check inside the file_manipulation function? – VictorVH Jul 26 '14 at 19:02
  • So what are your thoughts on file_watcher? I keep running the same function at an interval; it spawns new processes that each check for updates, and then (I haven't programmed this yet) it will do something with the file. – VictorVH Jul 26 '14 at 19:04
  • 1
    @VictorVH re: `file_watcher`, I guess that will work ok, but there are [libraries out there](http://stackoverflow.com/q/182197/2073595) that will alert you when files change. Those would likely provide a cleaner solution. – dano Jul 26 '14 at 19:05
  • Thank you @dano. I know of such libraries, but this is as much practice with multithreading and multiprocessing as it is making the program work. Thank you so much for your help, best help ever received on Stack Overflow :) – VictorVH Jul 26 '14 at 19:07
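A side note on the per-file processes discussed in the comments: in the question's code the target= argument is written with trailing parentheses, f.compiler_methods.get(f.compiler)(), so the method is called immediately in the parent and its return value is what gets passed as the target. The sketch below illustrates the difference using threading.Thread rather than multiprocessing.Process, purely so it runs anywhere without spawn concerns; both classes take target and args in the same way. The check function and the thread names are hypothetical.

```python
import threading

calls = []


def check(name):
    """Hypothetical stand-in for a per-file check such as a_import."""
    calls.append((name, threading.current_thread().name))


# Eager: check() runs right here in the calling thread and returns None,
# so the Thread receives target=None and does nothing when started.
eager = threading.Thread(target=check("eager"), name="worker-eager")
eager.start()
eager.join()

# Deferred: the callable and its arguments are passed separately,
# so check() runs inside the worker thread.
deferred = threading.Thread(
    target=check, args=("deferred",), name="worker-deferred"
)
deferred.start()
deferred.join()
```

Only the deferred form records the worker's name; the eager call is attributed to whichever thread constructed the Thread object.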