0

I am sure this subject has been touched on a lot, but after hours of googling I'm afraid I can't find the answer I need. All I am really looking for is a direction, a path to follow to learn what I need. I have a MA Access database and I am trying to write a keyword search query, but I want to get a really wide range of results from my query. It's easy enough to say description LIKE "red" but I am looking for more powerful searches. Something that would return even parts of the searched words, so if I search "clipboard" I want to find "surfboards" too. What I would really like is to be able to do a keyword search that would return a ton of results but have a way of ranking them based one relevancy to the original search. Any ideas?

DasPete
  • 831
  • 2
  • 18
  • 37
  • Use wildcards and a lot of code? :) – Fionnuala Jan 11 '13 at 14:35
  • Haha thanks. That's where I'm at now and I've been getting decent results for close word matches, but I want to get a lot more results and not quite sure how to go about it. – DasPete Jan 11 '13 at 14:42
  • You could start stripping letters off either side and run with wild cards, each letter removed reduces the match %, similarly, replace central characters with ? wildcard. See also VBA for soundex & Levenshtein : http://wiki.lessthandot.com/index.php/Comparing_Words:_Levenshtein_Distance – Fionnuala Jan 11 '13 at 14:50

1 Answers1

1

You can calculate the Levenshtein distance between words. Levenshtein distance measures the difference between two sequences. This would allow you to consider resembling words as well.

The Levenshtein distance ranges from 0 (words are equal) to length of longest word (words are completely different). You might want to transform the Levenshtein value to a more appropriate value.

Dim maxlength As Long
Dim similarity As Single

maxlength = Max(word1.Length, word2.Length)
similarity = (maxlength - levenshteinDist) / maxlength

This yields values for similarity between 0 (not similar) to 1 (equal).

You can find VBA implementations here:
Levenshtein Distance in Excel - Stack Overflow
ms access - Finding similar sounding text in VBA - Stack Overflow

Note: Access does not have a Max function. Write your own or code it explicitly.

Community
  • 1
  • 1
Olivier Jacot-Descombes
  • 104,806
  • 13
  • 138
  • 188
  • That definitely looks like a great place to start! Thanks! I'll have to dive into that a bit and see where I get. – DasPete Jan 11 '13 at 15:20
  • Levenshtein works great as a ranking system! Thanks a ton! Now I just have to work on getting more results so I have something to rank. – DasPete Jan 11 '13 at 15:48