
I'm trying to make the following program run faster, so I used multiprocessing as suggested in the answers to my last question and to a related question. I now have the following program, which runs much more slowly than before:

import threading
from threading import Thread
from multiprocessing import Process
import numpy
from os.path import join

def sim_QxD_word(query, document, model, alpha, outOfVocab, lock): # word_level
    sim_w = {}
    for q in set(query.split()):
        sim_w[q] = {}
        qE = []
        if q in model.vocab:
            qE = model[q]
        elif q in outOfVocab:
            qE = outOfVocab[q]
        else:
            qE = numpy.random.rand(model.layer1_size) # random vector
            lock.acquire()
            outOfVocab[q] = qE
            lock.release()

        for d in set(document.split()):
            dE = []
            if d in model.vocab:
                dE = model[d]
            elif d in outOfVocab:
                dE = outOfVocab[d]
            else:
                dE = numpy.random.rand(model.layer1_size) # random vector
                lock.acquire()
                outOfVocab[d] = dE
                lock.release()
            sim_w[q][d] = sim(qE,dE,alpha)
    return (sim_w, outOfVocab)

def sim_QxD_sequences(query, document, model, outOfVocab, alpha, lock): # sequence_level
    # 1. extract document sequences (sliding windows of query length)
    doc_words = document.split()
    q_len = len(query.split())
    document_sequences = []
    for i in range(len(doc_words) - q_len + 1):
        document_sequences.append(" ".join(doc_words[i:i+q_len]))
    # 2. compute similarities with a query sentence
    lock.acquire()
    query_vec, outOfVocab = avg_sequenceToVec(query, model, outOfVocab, lock)
    lock.release()
    sim_QxD = {}
    for s in document_sequences:
        lock.acquire()
        s_vec, outOfVocab = avg_sequenceToVec(s, model, outOfVocab, lock)
        lock.release()
        sim_QxD[s] = sim(query_vec, s_vec, alpha)
    return (sim_QxD, outOfVocab)

def word_level(q_clean, d_text, model, alpha, outOfVocab, out_w, q, ext_id, lock):
    print("in word_level")
    sim_w, outOfVocab = sim_QxD_word(q_clean, d_text, model, alpha, outOfVocab, lock)
    numpy.save(join(out_w, str(q)+ext_id+"word_interactions.npy"), sim_w)

def sequence_level(q_clean, d_text, model, outOfVocab, alpha, out_s, q, ext_id, lock):
    print("in sequence_level")
    sim_s, outOfVocab = sim_QxD_sequences(q_clean, d_text, model, outOfVocab, alpha, lock)
    numpy.save(join(out_s, str(q)+ext_id+"sequence_interactions.npy"), sim_s)

def extract_AllFeatures_parall(q_clean, d_text, model, alpha, outOfVocab, out_w, q, ext_id, out_s, lock):
    print("in extract_AllFeatures")
    thW=Process(target = word_level, args=(q_clean, d_text, model, alpha, outOfVocab, out_w, q, ext_id, lock,))
    thS=Process(target = sequence_level, args=(q_clean, d_text, model, outOfVocab, alpha, out_s, q, ext_id, lock,))
    thW.start()
    thS.start()
    thW.join()
    thS.join()

def process_documents(documents, index, model, alpha, outOfVocab, out_w, out_s, queries, stemming, stoplist, q):
    print("in process_documents")
    q_clean = clean(queries[q],stemming, stoplist)
    lock = threading.Lock()
    for d in documents:
        ext_id, d_text = reaDoc(d, index)
        extract_AllFeatures_parall(q_clean, d_text, model, alpha, outOfVocab, out_w, q, ext_id, out_s, lock)

outOfVocab = {} # shared variable over all threads
queries = {"1":"first query", ...} # can contain 200 elements

....

threadsList = [] 
for q in queries.keys():
    thread = Process(target = process_documents, args=(documents, index, model, alpha, outOfVocab, out_w, out_s, queries, stemming, stoplist, q,))
    thread.start()
    threadsList.append(thread)
for th in threadsList:
    th.join()

But it seems like all the processes are running on the same core, or as if only one is running. I don't have much experience with multiprocessing in Python; maybe the problem is my use of the shared lock, because after a few seconds of running the program I get this output from the `htop` command:

(htop screenshot)

After that, just one core is busy, and here is what's printed on the console:

in process_documents 
in process_documents 
in extract_AllFeatures 
in extract_AllFeatures 
in word_level 
in word_level 
in sequence_level 
in sequence_level

which means the functions are running, but very slowly.

Ghilas BELHADJ
Belkacem Thiziri
  • It makes no sense to load everything onto a single core and expect performance. You need to know how much work your loops do and how loaded the processor is; to use multiple cores you have to launch multiple jobs. Your process may also be doing slow operations that you specified yourself. It's hard to assess performance without that knowledge. – dsgdfg Dec 14 '17 at 11:03
  • That's why I'm posting this question: I don't know how to resolve this, and I thought I could find help on this forum. I really don't know how I'll solve it. – Belkacem Thiziri Dec 14 '17 at 12:36
  • `Learn your computer >> learn Linux >> learn python >> test >> analyze >> write a program`. I don't know which of these you can't do; the computer is yours (I can't check). However, set the program to fill 80% of a single core, then split the whole job by that value and run each piece as a subprocess. I'd like to help you more, but I don't want to mislead you. – dsgdfg Dec 15 '17 at 07:32
  • Thanks anyway, I'll try something else to find out what the problem is. I hope I can find the solution; if so, I'll add a comment to answer my question. – Belkacem Thiziri Dec 15 '17 at 08:12
  • I commented out the `lock.acquire()` and `lock.release()` instructions, and now the program works, which means I misused the lock. However, without a lock I won't get all the `outOfVocab` values. Does anyone have another suggestion? – Belkacem Thiziri Dec 15 '17 at 09:22
  • @BelkacemThiziri Try to reduce your code to a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). It will be hard for the SO community to help you just by looking at the code (it's better when they can reproduce the "bug" themselves). Meanwhile, I suggest you take a look at [Gensim](https://radimrehurek.com/gensim/similarities/docsim.html) & [NLTK](http://www.nltk.org/) libraries. – Ghilas BELHADJ Dec 22 '17 at 15:07
  • thank you, I'll try to make it more clear – Belkacem Thiziri Jan 01 '18 at 17:41

0 Answers