Questions tagged [fuzzy-search]

A search mechanism where the objective is to find all approximate, relevant or possibly relevant results for the search-key rather than finding an exact match.

Fuzzy search is a search mechanism where the objective is to find all approximate, relevant, or possibly relevant results for the keywords rather than an exact match. This allows for matches even when the keywords are misspelled or only hint at a concept.
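
As a rough illustration of matching a misspelled keyword, here is a minimal sketch using Python's standard-library difflib. The function name `fuzzy_search`, the sample word list, and the 0.6 cutoff are assumptions made for this example, not part of the tag description; real fuzzy search engines typically use other techniques (edit distance, n-grams, phonetic codes).

```python
import difflib


def fuzzy_search(query, candidates, cutoff=0.6, limit=3):
    """Return up to `limit` candidates that approximately match `query`, best first.

    Hypothetical helper for illustration; `cutoff` is a similarity threshold
    between 0 and 1 as used by difflib.get_close_matches.
    """
    # Compare case-insensitively, but return candidates as originally written.
    by_lowercase = {}
    for candidate in candidates:
        by_lowercase.setdefault(candidate.lower(), candidate)

    matches = difflib.get_close_matches(
        query.lower(), list(by_lowercase), n=limit, cutoff=cutoff
    )
    return [by_lowercase[m] for m in matches]


# Sample data (assumed for this sketch).
words = ["Levenshtein", "Soundex", "n-gram", "trigram"]

# A misspelled query still finds the intended entry.
print(fuzzy_search("levenstein", words))
```

Raising `cutoff` makes the match stricter, and lowering it admits looser matches at the cost of more false positives.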



954 questions
178
votes
12 answers

Javascript fuzzy search that makes sense

I'm looking for a fuzzy search JavaScript library to filter an array. I've tried using fuzzyset.js and fuse.js, but the results are terrible (there are demos you can try on the linked pages). After doing some reading on Levenshtein distance, it…
willlma
  • 7,353
  • 2
  • 30
  • 45
161
votes
25 answers

A better similarity ranking algorithm for variable length strings

I'm looking for a string similarity algorithm that yields better results on variable length strings than the ones that are usually suggested (Levenshtein distance, Soundex, etc.). For example, given string A: "Robert", then string B: "Amy…
marzagao
  • 3,756
  • 4
  • 19
  • 14
85
votes
10 answers

Fuzzy matching using T-SQL

I have a table Persons with personaldata and so on. There are lots of columns but the ones of interest here are: addressindex, lastname and firstname, where addressindex is a unique address drilled down to the door of the apartment. So if I have…
Frederik
  • 2,178
  • 4
  • 20
  • 20
84
votes
8 answers

Fuzzy string search library in Java

I'm looking for a high performance Java library for fuzzy string search. There are numerous algorithms to find similar strings: Levenshtein distance, Daitch-Mokotoff Soundex, n-grams, etc. What Java implementations exist? Pros and cons for them? I'm…
dario
  • 47
  • 1
  • 4
  • 5
76
votes
6 answers

Fuzzy search algorithm (approximate string matching algorithm)

I wish to create a fuzzy search algorithm. However, after hours of research I am really struggling. I want to create an algorithm that performs a fuzzy search on a list of names of schools. This is what I have looked at so far: Most of my research…
Yahya Uddin
  • 26,997
  • 35
  • 140
  • 231
73
votes
6 answers

Checking fuzzy/approximate substring existing in a longer string, in Python?

Using algorithms like Levenshtein (Levenshtein or difflib), it is easy to find approximate matches, e.g. >>> import difflib >>> difflib.SequenceMatcher(None, "amazing", "amaging").ratio() 0.8571428571428571 The fuzzy matches can be detected by…
DhruvPathak
  • 42,059
  • 16
  • 116
  • 175
64
votes
4 answers

Opening files in Vim using Fuzzy Search

I'm looking for a way to make Vim have the ability to open a file by fuzzy-searching its name. Basically, I want to be able to define a project once, and then have a shortcut which will give me a place to type a file name, and will match if any…
Edan Maor
  • 9,772
  • 17
  • 62
  • 92
60
votes
9 answers

How do I do a fuzzy match of company names in MYSQL with PHP for auto-complete?

My users will import through cut and paste a large string that will contain company names. I have an existing and growing MySQL database of company names, each with a unique company_id. I want to be able to parse through the string and assign to…
AFG
  • 1,675
  • 3
  • 22
  • 23
52
votes
6 answers

Fuzzy Regular Expressions

In my work I have used approximate string matching algorithms such as Damerau–Levenshtein distance, with great results, to make my code less vulnerable to spelling mistakes. Now I need to match strings against simple regular expressions such as TV…
Thomas Ahle
  • 30,774
  • 21
  • 92
  • 114
47
votes
2 answers

How to create simple fuzzy search with PostgreSQL only?

I have a little problem with search functionality on my RoR-based site. I have many Products with some CODEs. This code can be any string like "AB-123-lHdfj". Currently I use the ILIKE operator to find products: Product.where("code ILIKE ?", "%" +…
Alve
  • 1,315
  • 2
  • 17
  • 16
42
votes
7 answers

How can I match fuzzy match strings from two datasets?

I've been working on a way to join two datasets based on an imperfect string, such as the name of a company. In the past I had to match two very dirty lists: one list had names and financial information, the other had names and addresses. Neither had…
A L
  • 613
  • 1
  • 7
  • 7
41
votes
5 answers

Real world typo statistics?

Where can I find some real world typo statistics? I'm trying to match people's input text to internal objects, and people tend to make spelling mistakes. There are 2 kinds of mistakes: typos - "Helllo" instead of "Hello" / "Satudray" instead of…
Tal Weiss
  • 8,889
  • 8
  • 54
  • 62
39
votes
1 answer

How can I create an index with pymongo

I want to enable text search on a specific field in my MongoDB database. I want to implement this search in Python (-> pymongo). When I follow the instructions given on the internet: db.foo.ensure_index(('field_i_want_to_index', 'text'),…
Maximilian
  • 1,325
  • 2
  • 14
  • 35
38
votes
2 answers

Fuzzy search box widget with `Shiny` in R?

Has anyone created or seen a Shiny app featuring a search box widget that gives contextual suggestions as you type, based on fuzzy matching? Bloomberg terminal uses it, Google uses it. One of the possible underlying technologies is called…
Daniel Krizian
  • 4,586
  • 4
  • 38
  • 75
33
votes
2 answers

Best Fuzzy Matching Algorithm?

What is the best fuzzy matching algorithm (fuzzy logic, n-gram, Levenshtein, Soundex, ...) to process more than 100,000 records in less time?
Dhanapal
  • 14,239
  • 35
  • 115
  • 142