According to the Faiss wiki page (link), you should be able to use SearchParameters to selectively include or exclude ids in a search. The information there is confusing, though, because the field "sel" does not seem to exist at all. The names were also changed, so that "SearchParametersIVFPQ" became "IVFPQSearchParameters", and the old names can no longer be found. Moreover, the search method does not even accept SearchParameters, although according to the wiki it should.

I tried to find a solution with Visual Studio's IntelliSense, but without success.

So the documentation seems to be outdated... Does anyone know how this works today?

Christopher K
  • Have you checked the documentation on the GitHub wiki page: https://github.com/facebookresearch/faiss/wiki/Setting-search-parameters-for-one-query? – Fucio Feb 10 '23 at 10:40

1 Answer

This was driving me mad too! I've put together a small working example below. TL;DR: the selector needs to be passed as the sel argument to faiss.SearchParametersIVF.

Let's start by creating a simple index and searching the whole thing:

import numpy as np
import faiss

# Set random seed for reproducibility
np.random.seed(0)

# Create a set of 5 small binary vectors (float32, the dtype faiss expects)
vectors = np.array([[1, 0, 1],
                    [0, 1, 0],
                    [1, 1, 0],
                    [0, 0, 1],
                    [1, 0, 0]], dtype=np.float32)

# Initialize a flat index with squared L2 distance
# (for 0/1 vectors this equals the Hamming distance)
index = faiss.IndexFlatL2(vectors.shape[1])

# Add vectors to the index
index.add(vectors)

# Perform a similarity search (the query is float32, like the indexed vectors)
query_vector = np.array([[1, 1, 0]], dtype=np.float32)
k = 3  # Number of nearest neighbors to retrieve

distances, indices = index.search(query_vector, k)
print(indices)

The output when you run this is [[2 1 4]], so the closest vectors are the ones with those ids. Now let's filter out id 4 and see what happens. This is done by creating a selector that lists the ids allowed in the search and passing it to faiss.SearchParametersIVF.

filter_ids = [0, 1, 2, 3]
id_selector = faiss.IDSelectorArray(filter_ids)
filtered_distances, filtered_indices = index.search(query_vector, k, params=faiss.SearchParametersIVF(sel=id_selector))
print(filtered_indices)

This outputs [[2 1 0]]. So the vector with id 4 was excluded from the search, and id 0 took its place!

nbertagnolli
  • Wow, this works perfectly and is very easy to use! I had stopped a hobby project because of this problem as it was bumming me out so much, but now it would be worth continuing.... Hopefully this answer will now rank higher on Google so others can also see the solution and not give up too! Thanks, you made my day (: – Christopher K Jun 18 '23 at 09:19
  • Glad it worked! I was trying to do this too, and the docs were really hard to figure out! Thanks for making the post, hopefully it helps. Maybe we should make a PR to FAISS. – nbertagnolli Jun 18 '23 at 20:03
  • This answer is helpful, though I'm not sure if the `SearchParameters` classes exist in recent-but-not-newest versions of faiss. I can confirm they exist for v1.7.4, and the GitHub issues suggest they exist for v1.7.3+. Also, the naming is different from the one in this answer: e.g., use `faiss.IVFPQSearchParameters` instead of the acronym at the end of the name. – Tyler Aug 30 '23 at 04:46