I'm learning Faiss and trying to build an IndexFlatIP quantizer for an IndexIVFFlat index with 4000000 arrays with d = 256.
My code is as follows:
import numpy as np
import faiss
d = 256 # Dimension of each feature vector
n = 4000000 # Number of vectors
cells = 100 # Number of Voronoi cells
embeddings = np.random.rand(n, d)
quantizer = faiss.IndexFlatIP(d)
index = faiss.IndexIVFFlat(quantizer, d, cells)
index.train(embeddings) # Train the index
The code above works great, but when it comes to adding the embeddings to the index:
index.add(embeddings)
I get the following exception:
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 4.01 GiB for an array with shape (4000000, 256) and data type float32
Seeing as this is a numpy memory error, does it mean my index does not fit in memory? The machine I am using has 20.0 GB of RAM. If so, how can I work around this issue and configure my index so that it fits into memory?