2

I'm working on a search feature to search for model numbers and I'm trying to get MySQL to show me results similar to what I'm asking for but LIKE %$var% doesn't do it.

Example (we'll call table, "tbl_models"):

    id           model
+-------+--------------------+
|   1   |    DV6233SE        |
|   2   |    Studio 1440     |
|   3   |    C762NR          |
+-------+--------------------+

When searching using a search box I'm currently using:

SELECT id, model FROM tbl_models WHERE model LIKE %$var% ORDER BY id DESC

If I search for "C7" it'll return "C762NR" which is fine, but say I were to search for "C760" or "C700" or a typo of "C726NR"? Is there a way in MySQL (or PHP, JS, jQuery) that I can expand the limit of what results are returned to include different variations or close matches?

I'm not looking for someone to write it for me either, just a push in the right direction would be very helpful!

  • 1
    See http://stackoverflow.com/questions/13114398/how-can-i-match-two-strings-even-if-they-are-1-character-different/13114599 – Aea Jan 20 '13 at 05:10
  • Check [this](http://stackoverflow.com/a/3339034/913097) and make search on the subject here. There's plenty of it. – inhan Jan 20 '13 at 05:12
  • Depending on how many characters you have in the column for the code, you can use lik '%C7____%' – Taicho Jan 20 '13 at 05:17
  • It appears the Levenshtein method works when matching another preset string. I'm looking for a way to find variations on the fly in case a user incorrectly types in a model number. – Shawn Cheever Jan 20 '13 at 06:14

4 Answers4

2

If I were to apply logic for your question, I will go this way.

To find close matches -

I will take input parameter originally typed by user. e.g. 'ABCDEF' Then I will create multiple parameters from it by replacing each character in input parameter by '_'.

I.e. 'ABCDEF' will produce following input parameters for me.

'_BCDEF' , 'A_CDEF', 'AB_DEF', 'ABC_EF', 'ABCD_F', 'ABCDE_'

Then I will pass input parameters to SQL query and use OR operation to find data.

Above approach will give me words differing by 1 characters.

I can extend this a bit by replacing 2 characters with UNDERSCORE, then 3 characters then 4 and so on.

Upto how many characters I have to replace, should depend on the length of string.

Learn More
  • 1,535
  • 4
  • 29
  • 51
  • Thanks. This appears to be my best option. When a user performs a search I'll have to create an array of possible matches. I'm considering implementing an auto-complete feature with jQuery/PHP and I'll give this a try. Thank you. – Shawn Cheever Jan 20 '13 at 06:22
  • Welcome! Please post the code when you have implemented it. It will be useful to all. – Learn More Jan 20 '13 at 06:48
  • Posted the code as a new answer rather than editing the OP. Is that recommended or should I move it to the OP? – Shawn Cheever Jan 20 '13 at 20:51
  • You can optimize a little. Instead of screting an ARRAY, you can create a CSV string of all combinations. Then you can pass the CSV string to IN operator. This will make sure that you make DB-HIT once for all combinations. – Learn More Jan 21 '13 at 08:22
1

Have a look at this function... this is what you are looking for, i suppose: http://php.net/manual/en/function.levenshtein.php. This function could be an alternative, but I suggest you not to use it as results could be a little bit unpredictable: http://www.php.net/manual/en/function.similar-text.php.

The best way to implement what you are looking for is using natural languages full-text searches.

Tommaso Belluzzo
  • 23,232
  • 8
  • 74
  • 98
  • Thank you, but most of these built-in functions seem to require real words whereas the text I'm searching against are model numbers (e.g. AB-1234). – Shawn Cheever Jan 22 '13 at 14:04
1

Based on the answer provided by SaurabhV (thanks again!), I was able to create a function which takes a string and replaces each letter with an underscore in sequence. I hope this can help someone else down the road also!

function get_combination($string) {
    // Pa = Pass, Pos = Character Position, Len = String Length

    $str_arr = array($string);
    $Len = strlen($string);
    for ($Pa=0;$Pa<$Len;$Pa++) {
        for($Pos=1;($Pos+$Pa)<=$Len;$Pos++) {
            if($Pos+$Pa == $Len && $Pos<$Pa) {
                array_push($str_arr, substr_replace($string, str_repeat('_', $Pos), $Pa, 1));
            } else if($Pos+$Pa == $Len && $Pos>$Pa) {
                // End of String
            } else if($Pos == $Len || ($Len > 2 && $Pos == ($Len-1))) {
                // Do nothing - $Pos is too high
            } else if($Pos > $Len/2 && $Len > 6) {
                array_push($str_arr, substr_replace($string, str_repeat('_', $Pos-4), $Pa, $Pos-4));
            } else {
                array_push($str_arr, substr_replace($string, str_repeat('_', $Pos), $Pa, $Pos));
            }
        }
    }
    return $str_arr;
}

Example:

$string = get_combination('dv6000');

Returns:

Array ( [0] => dv6000 [1] => _v6000 [2] => __6000 [3] => ___000 [4] => ____00 [5] => d_6000 [6] => d__000 [7] => d___00 [8] => d____0 [9] => dv_000 [10] => dv__00 [11] => dv___0 [12] => dv6_00 [13] => dv6__0 [14] => dv6___ [15] => dv60_0 [16] => dv60__0 [17] => dv600_ )

Now, using MySQLi and a foreach loop I'm able to search the database against the array for similar results to what was asked. As you can see in the example, "dv6000" was the string asked but in tbl_models (see OP) there is no dv6000. Once the database hits index 14 (dv6___) it will find the correct entry:

SELECT model_number FROM tbl_models WHERE model_number LIKE %string[14]%

It's messy, probably not very efficient, but it works. If anyone can elaborate or maybe help me neaten up this code to make it more simplified, if possible, I'd appreciate it!

0

You can use PHP function as described above,or maybe SOUNDEX can help you. Look at this.

Hamlet Hakobyan
  • 32,965
  • 6
  • 52
  • 68
  • From what I've read SOUNDEX works using sounds of words, but in the case of model numbers it usually isn't words but a mix of alphanumerical characters. – Shawn Cheever Jan 20 '13 at 05:58