I have a Python script that creates two child processes with multiprocessing.Process():

  1. One child process (Child 1) launches a server that serves a machine learning model, and it does not terminate on its own (so it's like an infinite loop).
  2. The other child process (Child 2) is a client, which loads some text and uses the served model (from Child 1) to make inferences on the text.

I would like the script to kill both child processes as well as the server launched inside Child 1 once Child 2 completes all the inferences.

I've tried several methods that I found here, including os.kill() with signal.SIGTERM from the signal module, and the terminate() and kill() methods of the child processes. So far none of these strategies works: the Python processes still show up in the task manager after inference is done, and I have to press CTRL + C multiple times to close them.

A sample script is below. Any pointers are much appreciated!

import subprocess, os, pickle, signal
from multiprocessing import Process, Manager
from bert_serving.client import BertClient

tuned_model_dir = "directory/of/tuned/model"
model_dir = "directory/of/original/model"
data_dir = "directory/of/data"  # placeholder; data_dir was not defined in the original snippet

# Command to be executed by subprocess.Popen()
bert_command = ["bert-serving-start", "-tuned_model_dir", f"{tuned_model_dir}", 
                "-model_dir", f"{model_dir}", "-pooling_strategy", "CLASSIFICATION", 
                "-prefetch_size", "1024", "-ckpt_name", "model.ckpt-4083", 
                "-max_batch_size", "512", "-max_seq_len", "96",  
                "-priority_batch_size", "128", "-fp16"]

def bert_server_worker(bert_command, p_dict):
    p = subprocess.Popen(bert_command, shell=False, 
                         stderr=subprocess.DEVNULL, 
                         stdout=subprocess.DEVNULL)
    p_dict["server"] = p.pid

def bert_client_worker(data_dir):
    # Start the client for inference
    bc = BertClient(check_length=False)

    # Load text data
    sentences_path = os.path.join(data_dir, "sentences.pickle")
    with open(sentences_path, "rb") as f:
        sentences_list = pickle.load(f)

    # Inference
    scores = bc.encode(sentences_list)

    # Pickle the scores
    scores_path = os.path.join(data_dir, "scores.pickle")
    with open(scores_path, "wb") as f:
        pickle.dump(scores, f)

def main(): 
    manager = Manager()
    p_dict = manager.dict()
    
    p1 = Process(target=bert_server_worker, args=(bert_command, p_dict))
    p1.daemon = True   #setting daemon doesn't seem to make a difference
    p1.start()
    
    p2 = Process(target=bert_client_worker, args=(data_dir,))
    p2.daemon=True
    p2.start()
        
    # Block until the client finishes instead of busy-waiting on is_alive()
    p2.join()
    
    os.kill(p_dict["server"], signal.SIGTERM)
    p1.terminate()        
    p2.terminate()
    
    p1.kill()
    p2.kill()
    
    print("DONE!")

if __name__ == "__main__":
    main()
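
One thing worth noting about the script above: bert_server_worker returns as soon as subprocess.Popen() has launched the command, so Child 1 exits almost immediately and p1.terminate() has nothing left to act on. The server itself keeps running as a separate process, possibly with helper processes of its own. A minimal sketch of one direction to try is to launch the command directly from the parent, in its own process group, and then signal the whole group. The names start_server and stop_server are hypothetical, the sleeping Python command is only a stand-in for the real server command, and whether bert-serving-start shuts down its GPU process cleanly in response is an assumption that this sketch does not verify.

```python
import os
import signal
import subprocess
import sys


def start_server(command):
    """Launch a long-running command in its own process group (hypothetical helper)."""
    kwargs = {"stdout": subprocess.DEVNULL, "stderr": subprocess.DEVNULL}
    if os.name == "nt":
        # Windows: give the child its own process group.
        kwargs["creationflags"] = subprocess.CREATE_NEW_PROCESS_GROUP
    else:
        # POSIX: the child becomes a session and process-group leader,
        # so its process-group id equals its pid.
        kwargs["start_new_session"] = True
    return subprocess.Popen(command, **kwargs)


def _signal_tree(proc, sig):
    """Send a signal to the process and its descendants."""
    if os.name == "nt":
        # taskkill /T targets the whole process tree, /F forces termination.
        subprocess.run(
            ["taskkill", "/F", "/T", "/PID", str(proc.pid)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    else:
        try:
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            pass  # the group is already gone


def stop_server(proc, timeout=10):
    """Terminate the process tree and wait for the top-level process to exit."""
    if proc.poll() is not None:
        return proc.returncode  # already exited
    _signal_tree(proc, signal.SIGTERM)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Escalate if the polite request was ignored.
        _signal_tree(proc, getattr(signal, "SIGKILL", signal.SIGTERM))
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for the real server command (assumption for illustration only).
    server = start_server([sys.executable, "-c", "import time; time.sleep(3600)"])
    # ... run the client work here, e.g. in a Process that is join()ed ...
    stop_server(server)
    print("server stopped")
```

With this layout only the client needs a multiprocessing.Process; the parent keeps the Popen handle, so no pid has to be passed around through a Manager dict, and proc.wait() confirms that the top-level server process has actually exited.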
Alex
  • You mean that the spawned processes p1 and p2, as well as bert-serving-start, are all not killed? – Jacky1205 Jul 31 '20 at 07:23
  • And your code seems incorrect: bert_server_worker() takes no argument, but you passed p_dict to it when creating the Process – Jacky1205 Jul 31 '20 at 07:26
  • You could try to use signal.CTRL_C_EVENT and signal.CTRL_BREAK_EVENT instead of SIGINT or SIGTERM – Jacky1205 Jul 31 '20 at 07:35
  • @Jacky1205: Thanks for pointing that out. I corrected the mistake. After I dug around a bit more, it looks like the command executed in `bert_server_worker` starts a process on the GPU, and when Child 2 completes, that GPU process never gets killed unless I press CTRL + C manually on the keyboard. I actually already tried CTRL_C_EVENT and CTRL_BREAK_EVENT. Neither worked. – Alex Jul 31 '20 at 13:33
