I am new to parallelizing and optimizing data mining modules in Python, and I have a question about parallelizing the population of a dictionary. I am building an inverted index from values stored in a two-dimensional matrix m. The code works fine, but I would like to use Python's reduce to make it run faster.
Here is my code:
def createInvertedIndex(matrix):
    dic={}
    for i in range(len(matrix[0])):
        if matrix[1][i] in dic.keys():
            dic[matrix[1][i]].append(matrix[0][i])
        else:
            dic[matrix[1][i]]=list([matrix[0][i]])
    return dic
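
For reference, here is a minimal sketch of what a reduce-based version of the loop above might look like, next to a plain defaultdict version for comparison. It assumes that matrix[1] holds the keys and matrix[0] the values, and that both rows have the same length. The function names and the sample data are made up for illustration. Note that functools.reduce is a sequential fold, so on its own it does not run anything in parallel, and I would not expect it to be faster than the loop.

```python
from collections import defaultdict
from functools import reduce


def create_inverted_index_reduce(matrix):
    """Fold (key, value) pairs into a dict of lists with functools.reduce."""
    def add_pair(index, pair):
        key, value = pair
        # Create the list on first sight of the key, then append to it.
        index.setdefault(key, []).append(value)
        return index  # reduce needs the accumulator returned at every step

    # matrix[1] holds the keys and matrix[0] the values, as in the loop above.
    return reduce(add_pair, zip(matrix[1], matrix[0]), {})


def create_inverted_index_defaultdict(matrix):
    """Same result with an explicit loop and a defaultdict."""
    index = defaultdict(list)
    for key, value in zip(matrix[1], matrix[0]):
        index[key].append(value)
    return dict(index)


# Hypothetical sample data: row 0 is the values, row 1 is the keys.
sample = [["doc1", "doc2", "doc3"], ["apple", "pear", "apple"]]
print(create_inverted_index_reduce(sample))
# {'apple': ['doc1', 'doc3'], 'pear': ['doc2']}
```

Both functions visit each column exactly once, just like the original loop, so any speedup would have to come from splitting the columns across processes and merging the partial dictionaries afterwards.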