I'm building a lookup dictionary from several other dictionaries, running on an instance with 32 cores and 64 GB of RAM. 'names' is a large NumPy array with about 28,184,057 elements. When I call pool.map on 'names', I get the following error:

struct.error: 'i' format requires -2147483648 <= number <= 2147483647
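
For context, the message itself comes from the struct module: the 'i' format is a signed 32-bit integer, so the error suggests that some value (plausibly a size or length) larger than 2**31 - 1 is being packed somewhere inside the pool machinery. A minimal sketch of that limit, with a hypothetical helper name that is not part of my code:

```python
import struct


def fits_in_signed_32bit(n):
    """Return True if n can be packed with struct's 'i' (signed 32-bit) format."""
    try:
        struct.pack("i", n)
        return True
    except struct.error:
        # Raised when n falls outside -2147483648 <= n <= 2147483647
        return False


print(fits_in_signed_32bit(2**31 - 1))  # largest value the 'i' format accepts
print(fits_in_signed_32bit(2**31))      # one past the limit
```

This only illustrates where the numeric bound in the message comes from; it does not by itself show which value overflows in my case.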

This is my code:

import multiprocessing as mp

pool = mp.Pool(32)
uni_lookup = pool.map(self.create_lookup_dict, names)
pool.close()
pool.join()  # wait for the workers to exit

And create_lookup_dict contains the following code:

def create_lookup_dict(self, word):
    # Names of all groups whose members include this word (deduplicated)
    lookup_word_level = {word: list(set(i[0] for i in self.groups.items()
                                        if word in i[1]))}
    final = []

    # Without tags for the word there is nothing to filter on
    if word not in self.tag_dict:
        print("No tag")
        return (word, lookup_word_level[word])
    if word not in self.secondary_category_tag_dict:
        print("No secondary category tag")
        return (word, lookup_word_level[word])
    # Keep only groups that share a tag and a secondary category tag with the word
    for i in lookup_word_level[word]:
        if (self.tag_dict[i].intersection(self.tag_dict[word]) and
                self.secondary_category_tag_dict[i].intersection(
                    self.secondary_category_tag_dict[word])):
            final.append(i)
    return (word, final)
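
One thing that may be relevant: pool.map has to pickle the callable it sends to the workers, and pickling a bound method such as self.create_lookup_dict also pickles the instance it is bound to, including its dictionaries. A rough way to check how large that payload gets is sketched below. The Lookup class and the pickled_size helper are hypothetical stand-ins, not my real code, and the data is deliberately tiny.

```python
import pickle


class Lookup:
    """Hypothetical stand-in for the class that owns create_lookup_dict."""

    def __init__(self, groups):
        self.groups = groups

    def create_lookup_dict(self, word):
        return (word, [name for name, members in self.groups.items()
                       if word in members])


def pickled_size(obj):
    """Number of bytes pickle produces for obj."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


small = Lookup({"g": {"a"}})
large = Lookup({f"g{i}": {f"w{i}"} for i in range(10_000)})

# The bound method drags the whole instance along, so its pickled size
# grows with the size of the instance's dictionaries.
print(pickled_size(small.create_lookup_dict))
print(pickled_size(large.create_lookup_dict))
```

If the same measurement on the real object approaches 2 GiB, that would be consistent with the 32-bit limit in the error message, though I have not confirmed that this is the cause.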

What am I missing?

Mayur Bhangale

0 Answers