Questions tagged [approximate-nn-searching]

Approximate nearest neighbor search (ANNS), also known as proximity search, similarity search, or closest-point search, is an optimization problem: finding the points in a metric space that are closest to a given query point.

ANN search is of interest because exact NNS can become very expensive in execution time when the dataset is large, both in the number of points and especially in the number of dimensions (the curse of dimensionality).

As a result, we accept a trade-off between speed and accuracy: the faster the search, the lower the accuracy. In other words, by performing a relatively fast search we may not find the exact NN, only a close approximation.

For example, in the picture below:

[figure: a query point, its exact nearest neighbor marked with an asterisk, and an approximate result labeled 2]

the point marked with an asterisk is the exact NN, but the ANN search may instead return the point labeled 2, which is not the actual closest point to the query, but an approximation.

For more, see the Wikipedia article on nearest neighbor search.
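The trade-off described above can be illustrated with a minimal sketch (not from the tag wiki): exact NN by brute force versus a toy approximate search using random-hyperplane hashing, a simple form of LSH. All function names here are illustrative assumptions, and a real index would build its buckets once rather than per query.

```python
import math
import random

random.seed(0)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_nn(points, query):
    """Brute force: always correct, O(n * d) per query."""
    return min(points, key=lambda p: euclidean(p, query))

def make_hash(dim, n_planes=8):
    """Map a point to a bit signature via random hyperplanes."""
    planes = [[random.gauss(0, 1) for _ in range(dim)]
              for _ in range(n_planes)]
    def h(p):
        return tuple(sum(x * w for x, w in zip(p, plane)) >= 0
                     for plane in planes)
    return h

def approx_nn(points, query, h):
    """Scan only the query's bucket; may miss the true NN."""
    buckets = {}
    for p in points:
        buckets.setdefault(h(p), []).append(p)
    # Fall back to a full scan if the query's bucket is empty.
    candidates = buckets.get(h(query)) or points
    return min(candidates, key=lambda p: euclidean(p, query))

points = [tuple(random.uniform(-1, 1) for _ in range(10))
          for _ in range(500)]
query = tuple(random.uniform(-1, 1) for _ in range(10))
h = make_hash(10)

exact = exact_nn(points, query)
approx = approx_nn(points, query, h)
# The approximate result is never closer than the exact NN.
assert euclidean(approx, query) >= euclidean(exact, query)
```

The approximate search inspects only one bucket instead of all 500 points, which is where the speed comes from; the price is that the true nearest neighbor may land in a different bucket.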

44 questions
17
votes
5 answers

How to find the closest 2 points in a 100 dimensional space with 500,000 points?

I have a database of 500,000 points in a 100-dimensional space, and I want to find the 2 closest points. How do I do it? Update: the space is Euclidean, sorry. And thanks for all the answers. BTW this is not homework.
4
votes
0 answers

Nearest neighbor search over Levenshtein distance in Python using metric indexing

I would like to perform nearest-neighbor search over the Levenshtein distance with a dataset of texts in Python. The search can be approximate, should be implemented using core Python libraries, and queries should be near-constant-time operations.…
4
votes
2 answers

Best data structure for high dimensional nearest neighbor search

I'm currently working on high-dimensional data (~50,000-100,000 features) on which nearest-neighbor search must be performed. I know that KD-trees have poor performance as the dimension grows, and I've also read that in general, all space-partitioning…
3
votes
1 answer

Using an "approximate" STL map

I would like to create an STL map to find whether an item is close enough to another item in 3 dimensional space. So far, my "less-than-functor" has worked quite well, pasted to the following link. Now this problem isn't quite the "nearest neighbor"…
macetw
3
votes
1 answer

Reinforcement Learning in arbitrarily large action/state spaces

I'm interested in using Deep Reinforcement Learning to find a - unique - optimal path back home among (too many) possibilities and a few (required) intermediate stops (for instance, buy a coffee or refuel). Furthermore, I want to apply this…
3
votes
1 answer

Approximate string matching in Ruby

I am implementing user search functionality in my Rails application. I want the application to suggest the correct spelling if the user makes a typing mistake. Is there any plugin for this in Ruby? Can this be done in…
Pankaj
3
votes
1 answer

Performance of Annoy method Vs. KD-Tree

I am doing research on approximate nearest neighbor algorithms. I recently found the Annoy library, which does an amazing job of finding KNN at reasonable speed. For deeper analysis, you can skim the meetup slides. After reading the slides and…
2
votes
1 answer

Embedding version control of the weaviate

In Weaviate, the vector search engine, I wonder how it handles versioning of the embedding model. For instance, considering a (trained) word2vec model, vectors embedded by different models must be kept separate. One option might be to make…
2
votes
1 answer

How does filtered document list lookup work in nearest-neighbour search pre-filtering?

In pre-filter based ANN, once we have the list of documents after applying the pre-filter, Vespa starts the HNSW algorithm to find nearest neighbours. In the HNSW algorithm, Vespa starts with a node and looks for the neighbours that are present in the pre-filter list,…
tourism
2
votes
1 answer

How does FLANN select what algorithm and parameters to use?

FLANN (Fast Library for Approximate Nearest Neighbors) is a library for performing fast approximate nearest neighbor searches in high dimensional spaces. It contains a collection of algorithms we found to work best for nearest neighbor search and a…
harkib
2
votes
2 answers

Populating an array from a .txt file

I'm trying to populate an array from a .txt file that I am reading. This is the code I am using as a function to read the file: double* read_text(const char *fileName, int sizeR, int sizeC) { double* data = new double[sizeR*sizeC]; int i…
Hparsons28
2
votes
1 answer

On hash functions application in cryptography vs feature

I have only started learning feature hashing, so I need help understanding whether I can apply the hash function expressed mathematically as the tent map, https://en.wikipedia.org/wiki/Tent_map. One such application of the tent map is in cryptography -- Paper 1:…
2
votes
1 answer

Utilizing FLANN library for Multi-Label Classification

I want to utilize the FLANN library for Multi-Label Classification. I know that the FLANN library is for computing nearest neighbors, but I'm not sure how to use it for classification purposes. Is there some way to plug it into the Scikit-Learn…
2
votes
1 answer

Search in locality sensitive hashing

I'm trying to understand Section 5 of this paper about LSH, in particular how to bucket the generated hashes. Quoting the linked paper: Given bit vectors consisting of d bits each, we choose N = O(n^(1/(1+ε))) random permutations of the…
2
votes
1 answer

What is the ε (epsilon) parameter in Locality Sensitive Hashing (LSH)?

I've read the original paper about Locality Sensitive Hashing. The complexity is a function of the parameter ε, but I don't understand what it is. Can you explain its meaning, please?