I'm making a simple search engine, and I have already indexed a lot of websites in a MySQL database. Now I would like to get a relevant list of results by keywords.
Websites are indexed in my database with the following columns : hostname (without protocol an port), title, description. (We don't care about path)
When I type some keywords on my search engine homepage, it first starts by fetching 50 websites using FULLTEXT indexes.
Now, and because using Levenshtein algorithm in MySQL is really slow, I would like to sort those results with the Levenshtein PHP function for each columns I listed previously.
I would like to sort them in this order (most important first) : hostname, title, and then description.
So I have five arrays :
- $results, returned by MySQL
- $sorted_by_mysql, containing keys of $results in the original order : 0, 1, 2, ...
- $sorted_by_hostname, containing keys of $results sorted by hostname's relevance using Levenshtein, ex: 3, 0, 1, 2, ...
- $sorted_by_title, containing keys of $results sorted by title's relevance using Levenshtein, ex: 0, 2, 1, 3, ...
- $sorted_by_description, containing keys of $results sorted by description's relevance using Levenshtein, ex: 1, 3, 0, 2, ...
Here's the code :
$results = $req->fetchAll();
$search = strtolower($q);
$temp_arr = [];
$sorted_by_mysql = $sorted_by_hostname = $sorted_by_title = $sorted_by_description = [];
// We keep the original order in an array
for($i = 0; $i < count($results); $i++) $sorted_by_mysql[] = $i;
// Sort by hostname
for($i = 0; $i < count($results); $i++) $temp_arr[$i] = levenshtein($search, strtolower($results[$i]->hostname));
asort($temp_arr);
foreach($temp_arr as $k => $v) $sorted_by_hostname[] = $k;
// Sort by title
for($i = 0; $i < count($results); $i++) $temp_arr[$i] = levenshtein($search, strtolower($results[$i]->title));
asort($temp_arr);
foreach($temp_arr as $k => $v) $sorted_by_title[] = $k;
// Sort by description
for($i = 0; $i < count($results); $i++) $temp_arr[$i] = levenshtein($search, strtolower($results[$i]->description));
asort($temp_arr);
foreach($temp_arr as $k => $v) $sorted_by_description[] = $k;
Finally I would like to sort $results by combining (by priority) all thoses different arrays. But I have no idea on how, so here's where I need some help !
EDIT : Solution !
$data = $req->fetchAll();
$search = strtolower($q);
$temp = [];
foreach($data as $i => $row) {
$temp[] = [
'id' => $i,
'lev1' => levenshtein($search, strtolower($row->hostname)),
'lev2' => levenshtein($search, strtolower($row->title)),
'lev3' => levenshtein($search, strtolower($row->description))
];
}
$sorted = array_orderby($temp, 'lev1', SORT_ASC, 'lev2', SORT_ASC, 'lev3', SORT_ASC, 'id', SORT_ASC);
$results = [];
foreach($sorted as $row) {
$results[] = $data[$row['id']];
}
// Perfectly sorted !
Here's array_orderby function :
// Credits : jimpoz at jimpoz dot com (PHP.net)
function array_orderby()
{
$args = func_get_args();
$data = array_shift($args);
foreach ($args as $n => $field) {
if (is_string($field)) {
$tmp = array();
foreach ($data as $key => $row)
$tmp[$key] = $row[$field];
$args[$n] = $tmp;
}
}
$args[] = &$data;
call_user_func_array('array_multisort', $args);
return array_pop($args);
}