I'm building a lookup dictionary from several other dictionaries, running on an instance with 32 cores and 64 GB of RAM. The input, names, is a huge NumPy array with around 28184057 elements. When I call pool.map on names, I get the following error:
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
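For context, the wording of this message comes from Python's struct module: the 'i' format code packs a signed 32-bit integer, so any value above 2147483647 is rejected. The sketch below only reproduces that limit in isolation. It is not code from my project, fits_in_int32 is a hypothetical helper name, and where exactly multiprocessing performs such a pack internally is an assumption that may depend on the Python version.

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647, the upper bound quoted in the error


def fits_in_int32(n):
    """Return True if n can be packed with the signed 32-bit 'i' format."""
    try:
        struct.pack("!i", n)
        return True
    except struct.error:
        # Raised when n falls outside -2147483648..2147483647.
        return False


print(fits_in_int32(INT32_MAX))      # largest value that still packs
print(fits_in_int32(INT32_MAX + 1))  # one past the limit
```

If a length or size larger than 2 GiB is being packed this way somewhere along the pool.map call, it would produce exactly this kind of exception.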
This is my code:
pool = mp.Pool(32)
uni_lookup = pool.map(self.create_lookup_dict, names)
pool.close()
And create_lookup_dict contains the following code:
def create_lookup_dict(self, word):
    # Collect every group key whose members include this word.
    lookup_word_level = {word: list(set(i[0] for i in self.groups.items()
                                        if word in i[1]))}
    final = []
    # Compare with ==, not `is`: identity checks against int literals are unreliable.
    if self.tag_dict.get(word, 0) == 0:
        print("No tag")
        return (word, lookup_word_level[word])
    if self.secondary_category_tag_dict.get(word, 0) == 0:
        print("No secondary category tag")
        return (word, lookup_word_level[word])
    # Keep only candidates that share both a tag and a secondary category tag with the word.
    for i in lookup_word_level[word]:
        if self.tag_dict[i].intersection(self.tag_dict[word]) and \
                self.secondary_category_tag_dict[i].intersection(self.secondary_category_tag_dict[word]):
            final.append(i)
    return (word, final)
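Since multiprocessing pickles whatever it sends between processes, and a bound method such as self.create_lookup_dict is pickled together with its instance, one thing that may be worth measuring is how large those payloads are. The following is a rough diagnostic sketch only: pickled_size is a hypothetical helper, the two small dictionaries are made-up stand-ins for self.groups and self.tag_dict, and the link between payload size and the error above is an assumption rather than something confirmed here.

```python
import pickle

INT32_MAX = 2**31 - 1  # limit quoted in the struct.error message


def pickled_size(obj):
    """Return the number of bytes pickle produces for obj."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


# Tiny hypothetical stand-ins for self.groups and self.tag_dict.
groups = {"g1": {"a", "b"}, "g2": {"b", "c"}}
tag_dict = {"a": {"t1"}, "b": {"t1", "t2"}}

for label, obj in [("groups", groups), ("tag_dict", tag_dict)]:
    size = pickled_size(obj)
    verdict = "exceeds int32" if size > INT32_MAX else "fits in int32"
    print(label, size, verdict)
```

Running the same measurement on the real dictionaries, and on the list that pool.map would return, should show whether any single payload comes close to 2147483647 bytes.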
What am I missing?