
I have a Faiss index and want to use some of its embeddings in my Python script, selecting them by id. Since Faiss is written in C++, its Python API is generated with SWIG.

I guess the function I need is `reconstruct`:

    /** Reconstruct a stored vector (or an approximation if lossy coding)
     *
     * this function may not be defined for some indexes
     * @param key         id of the vector to reconstruct
     * @param recons      reconstructed vector (size d)
     */
    virtual void reconstruct(idx_t key, float* recons) const;

Therefore, I call this method in python, for example:

vector = index.reconstruct(0)

But this results in the following error:

    vector = index.reconstruct(0)
  File "lib/python3.8/site-packages/faiss/__init__.py", line 406, in replacement_reconstruct
    self.reconstruct_c(key, swig_ptr(x))
  File "lib/python3.8/site-packages/faiss/swigfaiss.py", line 1897, in reconstruct
    return _swigfaiss.IndexFlat_reconstruct(self, key, recons)

TypeError: in method 'IndexFlat_reconstruct', argument 2 of type 'faiss::Index::idx_t' python-BaseException

Does anyone have an idea what is wrong with my approach?

chefhose
  • I guess `reconstruct` replaces a vector in an index. It seems to require a vector as a second parameter – bottledmind Jan 10 '22 at 11:41
  • `reconstruct()` works for me. Maybe you didn't install Faiss properly. You'd better install it with conda instead of pip – Hua Nov 10 '22 at 08:56

2 Answers

3

This is the only way I found: reading the stored values manually from the index's internal storage.

import faiss
import numpy as np

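# Build a small test set: 3 vectors of dimension 10, as float32
a = np.random.uniform(size=30)
a = a.reshape(-1, 10).astype(np.float32)
d = 10
index = faiss.index_factory(d, 'Flat', faiss.METRIC_L2)
index.add(a)

# In the Faiss version used here, xb is the SWIG-wrapped std::vector<float>
# that stores all vectors back to back
xb = index.xb
print(xb.at(0) == a[0][0])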

Output:

True

You can get any vector with a loop. The vectors are stored contiguously, so the vector with id i starts at offset i * index.d:

required_vector_id = 1
vector = np.array([xb.at(required_vector_id * index.d + i) for i in range(index.d)])

print(np.all(vector == a[1]))

Output:

True
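
The offset arithmetic behind that loop can be sketched with NumPy alone. This is only an illustration: `flat` is a stand-in for the index's flat storage, and `vector_at` is a hypothetical helper, not part of Faiss.

```python
import numpy as np

d = 4  # vector dimension (illustrative value)
n = 3  # number of stored vectors (illustrative value)

# Stand-in for the index's flat storage: n * d floats, one vector after another.
flat = np.arange(n * d, dtype=np.float32)


def vector_at(flat_buffer, vector_id, dim):
    """Return the `dim` values that belong to `vector_id` in a flat buffer.

    Hypothetical helper for illustration only.
    """
    start = vector_id * dim
    return flat_buffer[start:start + dim]


v1 = vector_at(flat, 1, d)
print(v1)  # [4. 5. 6. 7.]
```

A slice replaces the element-by-element loop once the data is available as a NumPy array; reshaping the same buffer to `(n, d)` and taking row 1 gives the same values.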
bottledmind
1

You can get all the embeddings that you added to an index with the following:

# Number of docs added to your index
num_docs = index.ntotal
# Get the dimension of your embeddings
embedding_dimension = index.d

embeddings = faiss.rev_swig_ptr(index.get_xb(), num_docs*embedding_dimension).reshape(num_docs, embedding_dimension)
