In one part of my code I need to compare the values of a dictionary with each other pairwise and compute the intersection of each pair. For example, if I have 3 keys, I should compare (1,2), (1,3), (2,1), (2,3), (3,1), (3,2). The code below shows how I have implemented the intersection of the lists.

myDict = {}
myDict[1] = [1,2,3,4,5]
myDict[2] = [2,3,4]
myDict[3] = [1,2,5]
myDict[4] = [4,5,6,7]

finalDict = {}

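for i in myDict.keys():
    similar_words = []

    for j in myDict.keys():
        # Skip comparing a key with itself.
        if i == j:
            continue
        # Common elements of the two lists; a set does not guarantee element order.
        temp = list(set(myDict[i]) & set(myDict[j]))
        similar_words.append([j, temp])

    finalDict[i] = similar_words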

print(finalDict)

And the result is:

{1: [[2, [2, 3, 4]], [3, [1, 2, 5]], [4, [4, 5]]], 2: [[1, [2, 3, 4]], [3, [2]], [4, [4]]], 3: [[1, [1, 2, 5]], [2, [2]], [4, [5]]], 4: [[1, [4, 5]], [2, [4]], [3, [5]]]}
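
Before reaching for parallelism, the loop above can be made to do less work. The following is a minimal sketch, not the asker's method, assuming the list elements are hashable and sortable: each list is converted to a set only once, and each intersection is computed once per unordered pair and stored under both keys. The name `pairwise_intersections` is hypothetical, and the intersections are sorted only to make the output order deterministic.

```python
from itertools import combinations


def pairwise_intersections(data):
    """Return {key: [[other_key, common_values], ...]} for every pair of keys."""
    # Build each set once instead of rebuilding it for every comparison.
    sets = {key: set(values) for key, values in data.items()}
    result = {key: [] for key in data}
    # Intersection is symmetric, so compute it once per unordered pair.
    for a, b in combinations(data, 2):
        common = sorted(sets[a] & sets[b])
        result[a].append([b, common])
        # Store a copy so the two entries do not share one list object.
        result[b].append([a, list(common)])
    return result


myDict = {1: [1, 2, 3, 4, 5], 2: [2, 3, 4], 3: [1, 2, 5], 4: [4, 5, 6, 7]}
finalDict = pairwise_intersections(myDict)
print(finalDict)
```

For the four-key example this produces the same dictionary as shown above, while building 4 sets instead of 24 and computing 6 intersections instead of 12.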

Comparing them and computing their intersections takes a lot of time. I want to implement this with Python multi-threading and perform all the comparisons in parallel. I am new to multi-threading and I don't know how to do it. I would really appreciate answers that solve my problem.

Orca
  • This is not an answer to your question, but would it not be faster (time-complexity wise at least) to sort the data and use standard two-pointer comparison algorithms? https://helloacm.com/how-to-compute-the-intersection-of-two-arrays-using-sorting-two-pointer-algorithm/ – QWERTYL Nov 25 '22 at 19:58
  • Parallelization with Python doesn't help you. See here for details: https://stackoverflow.com/questions/1294382/what-is-the-global-interpreter-lock-gil-in-cpython – Homer512 Nov 25 '22 at 20:14
  • @Homer512 Thanks for your reply. Is there an efficient way to make this code run faster? I have heard about multiprocessing and multi-threading, but I don't know how they can help with my problem. – Orca Nov 25 '22 at 20:18
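
As the GIL comment above suggests, threads in CPython are unlikely to speed up CPU-bound set operations, so processes are the usual alternative. The following is a rough sketch of how the pairwise comparisons could be spread over processes with the standard library's `concurrent.futures.ProcessPoolExecutor`. The function names (`intersect_pair`, `collect`, `parallel_pairwise_intersections`) and the `chunksize` default are hypothetical choices for illustration, and the elements are assumed to be hashable, sortable, and picklable.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import combinations


def intersect_pair(task):
    """Worker: intersect the values belonging to one pair of keys."""
    key_a, key_b, values_a, values_b = task
    return key_a, key_b, sorted(set(values_a) & set(values_b))


def collect(keys, pair_results):
    """Arrange per-pair results as {key: [[other_key, common_values], ...]}."""
    result = {key: [] for key in keys}
    for key_a, key_b, common in pair_results:
        result[key_a].append([key_b, common])
        result[key_b].append([key_a, list(common)])
    return result


def parallel_pairwise_intersections(data, max_workers=None, chunksize=64):
    # One task per unordered pair; each task carries the two lists it needs.
    tasks = [(a, b, data[a], data[b]) for a, b in combinations(data, 2)]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in task order, so the output order is stable.
        return collect(data, pool.map(intersect_pair, tasks, chunksize=chunksize))


# The guard is required on platforms that start workers by re-importing this module.
if __name__ == "__main__":
    myDict = {1: [1, 2, 3, 4, 5], 2: [2, 3, 4], 3: [1, 2, 5], 4: [4, 5, 6, 7]}
    print(parallel_pairwise_intersections(myDict))
```

Every task has to be pickled and sent to a worker process, so for small lists like the ones in the example this is likely to be slower than the plain loop. It may only pay off when the individual lists are large or the number of keys is high, and it is worth timing both versions on real data before committing to either.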
