2

Is there a way in MySQL to find close matches in a text field? Say find me@email.com when searching for me1@email.com?

So basically if a user is searching for thier own email address, and it was originally entered into the database incorrectly, it throw up a potential match if it was slightly off

BenOfTheNorth
  • 2,904
  • 1
  • 20
  • 46

2 Answers2

8

It would be a pain in the neck to solve this kind of problems with pattern matching.

The Levenshtein distance between two strings is the minimum number of operations needed to transform one string into the other, where an operation may be insertion, deletion or substitution of one character.

Check this out.

Mosty Mostacho
  • 42,742
  • 16
  • 96
  • 123
  • 1
    news to me too. i feel like this is perfect, but will be very slow for long email addresses. hopefully there arent too many times this will need to be run. – Scott M. Apr 10 '12 at 18:35
  • 1
    Just tested it out and it's working great. This will only be for resending emails manually when a user hasn't received one (They have to do this themselves), so it'll hardly ever be run – BenOfTheNorth Apr 10 '12 at 18:42
0

Try starting with some pattern matching.

SQL pattern matching enables you to use “_” to match any single character and “%” to match an arbitrary number of characters (including zero characters). In MySQL, SQL patterns are case-insensitive by default. Some examples are shown here. You do not use = or <> when you use SQL patterns; use the LIKE or NOT LIKE comparison operators instead.

Scott M.
  • 7,313
  • 30
  • 39
  • The problem with this is if I use like, it'll be too loose, and if I use regex, it'll be too tight. I need something generic that'll find anything, without the need for a rule set – BenOfTheNorth Apr 10 '12 at 17:51
  • regex can be as specific or general as you want it to be. the `LIKE` keyword can find a missing character or set of characters in the middle. If you want anything more, go with regex. You may just be using the wrong regex string. – Scott M. Apr 10 '12 at 17:53
  • I'm not using anything yet, I was assuming thats all. I think I might be able to work something with like though, if I split the email address at the @, and do a like for the first half and second half separately – BenOfTheNorth Apr 10 '12 at 17:56
  • Cheers - regex is my arch enemy, but I'll have a play! – BenOfTheNorth Apr 10 '12 at 17:57
  • 1
    @BenGriffiths Hey guys. It is very difficult to handle this kind of problems with regex, because the combination of possible words is extremely big. EG: `me1@email.com`, `m1e@email.com`, `1me@email.com`. And I'm just moving the `1`. I could move the `@` or delete or even insert more characters. It would go out of control – Mosty Mostacho Apr 10 '12 at 18:17
  • Sorry @ScottM., had to award Mosty instead, his hit the nail on the head :) – BenOfTheNorth Apr 10 '12 at 18:30