I have a text file with 20 million names, each name on its own line. Now I want to, for example, find the name "peter" in that list.
My first approach was PyMongo with an index, which worked well. However, I wanted the fastest possible search.
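For reference, the PyMongo version looked roughly like this (the database, collection, and field names here are placeholders, not my exact code):

    from pymongo import MongoClient, ASCENDING

    client = MongoClient()
    col = client.mydb.names

    # build the index once; later lookups use it automatically
    col.create_index([("name", ASCENDING)])

    result = col.find_one({"name": "peter"})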
Since the list is static and never changes, I thought it would be possible to load it into a variable and search it directly. However, compared to MongoDB it is very slow: one search takes around 1 second.
    def initList(self):
        global names
        names = []
        with open('list.txt', 'r') as f:
            for x in f:
                # strip the trailing newline from each name
                names.append(x.strip())
        print("All names loaded.")
Then the code for the search:
    def search(name, surname):
        if name in names or surname in names:
            print("Found:", name, surname)
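I measured it roughly like this (the argument values are just examples):

    import time

    start = time.perf_counter()
    search("peter", "smith")
    print(f"Lookup took {time.perf_counter() - start:.2f}s")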
Now my question is: am I missing something, or why is the list approach so slow? What would be the fastest possible solution?
My next step will be multiprocessing.
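A rough sketch of what I have in mind, splitting the list into chunks that worker processes scan in parallel (the worker count and helper names are my own placeholders, not tested code):

    from multiprocessing import Pool

    def chunk_contains(args):
        chunk, target = args
        # plain linear scan over this worker's slice of the list
        return target in chunk

    def parallel_search(names, target, workers=4):
        # split the 20 million names into one slice per worker
        size = len(names) // workers + 1
        chunks = [(names[i:i + size], target) for i in range(0, len(names), size)]
        with Pool(workers) as pool:
            return any(pool.map(chunk_contains, chunks))

    # usage (a __main__ guard is needed on platforms that spawn processes):
    # if __name__ == "__main__":
    #     print(parallel_search(names, "peter"))

I am not sure how much this will help, though, since each worker still does a linear scan and the chunks have to be copied to the child processes.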