I have a decent, lightweight search engine working for one of my sites, using MySQL fulltext indexes and PHP to parse the results. It works fine, but I'd like to offer more 'Google-like' results, with text snippets from the results and the found words highlighted. I'm looking for a PHP-based solution. Any recommendations?
4 Answers
Searching the actual database is fine until you want to add snazzy features like the one above. In my experience it is best to create a dedicated search table, with keywords and page IDs/URLs/etc. Then populate this table every n hours with content. During this population you can add snippets for each document for each keyword.
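As a rough illustration of the dedicated-table idea, here is a minimal sketch that builds one row per keyword per page, each with a pre-computed snippet. The function name `build_search_rows`, the row layout, and the `$radius` parameter are hypothetical and not from the answer. It assumes plain ASCII text and leaves out both the database insert and the scheduling (for example a cron job every n hours).

```php
<?php
// Hypothetical helper: turn pages into rows for a dedicated search table.
// $pages maps a page ID to that page's plain-text content.
function build_search_rows(array $pages, int $radius = 30): array
{
    $rows = array();
    foreach ($pages as $pageId => $text) {
        // Split into lowercase words and keep each distinct word once per page.
        $keywords = array_unique(str_word_count(strtolower($text), 1));
        foreach ($keywords as $keyword) {
            // First case-insensitive occurrence; always found, since the
            // keyword was taken from this text (it may sit inside a longer word).
            $pos = stripos($text, $keyword);
            $start = max(0, $pos - $radius);
            $length = ($pos - $start) + strlen($keyword) + $radius;
            $rows[] = array(
                'keyword' => $keyword,
                'page_id' => $pageId,
                'snippet' => substr($text, $start, $length),
            );
        }
    }
    return $rows;
}
```

Each returned row could then be written to the search table, so that a search request only needs a lookup by keyword and no snippet work at query time.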
Alternatively a quick hack might be:
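<?php
$text = 'This is an example text page with content. It could be red, green or blue.';
$keyword = 'red';
$size = 5; // size of snippet either side of keyword
$pos = strpos($text, $keyword); // assumes the keyword occurs in $text
$start = max(0, $pos - $size); // never start before the beginning of the text
// substr() takes a length, not an end position, and strlen() (not sizeof()) measures a string
$snippet = '...'.substr($text, $start, ($pos - $start) + strlen($keyword) + $size).'...';
$snippet = str_replace($keyword, '<strong>'.$keyword.'</strong>', $snippet);
echo $snippet;
?>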
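The same idea can be wrapped in a reusable function. The following is only a sketch: the name `make_snippet` and the `$radius` parameter are made up for illustration. It assumes `$text` is plain, single-byte text that is safe to output; it matches case-insensitively, keeps the original casing of the match, and adds ellipses only where text was actually cut off.

```php
<?php
// Hypothetical helper: extract a snippet around the first match and highlight it.
function make_snippet(string $text, string $keyword, int $radius = 30): string
{
    // Guard against an empty keyword before calling stripos().
    $pos = ($keyword === '') ? false : stripos($text, $keyword);
    if ($pos === false) {
        // No match: fall back to the beginning of the text.
        return substr($text, 0, 2 * $radius);
    }

    // Clamp the window to the bounds of the text.
    $start = max(0, $pos - $radius);
    $end = min(strlen($text), $pos + strlen($keyword) + $radius);
    $snippet = substr($text, $start, $end - $start);

    // $0 re-inserts the matched text, so its original casing is kept.
    $pattern = '/' . preg_quote($keyword, '/') . '/i';
    $snippet = preg_replace($pattern, '<strong>$0</strong>', $snippet);

    // Add an ellipsis only on a side that was truncated.
    return ($start > 0 ? '...' : '') . $snippet . ($end < strlen($text) ? '...' : '');
}
```

Calling `make_snippet($text, $keyword, 5)` on the example text above mirrors the quick hack, but it also copes with a keyword at the very start of the text or one that does not occur at all.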

-
I like this solution. I have fairly light traffic so the extra processing on each search request is fine. Thanks!! – phirschybar Jul 31 '09 at 18:54
-
Might want to change strpos to stripos, and change the str_replace to a preg_replace in order to maintain case. I.e., if searching 'Burgers' you don't want the bolded text to be changed to 'burgers'. preg_replace('/('.preg_quote($keyword, '/').')/i', '<strong>$1</strong>', $snippet); – Sherri Mar 19 '17 at 22:15
For MySQL, your best bet would be to first split up your query words, clean up your values, and then concatenate everything back into a nice regular expression.
In order to highlight your results, you can use the <strong> tag. Its usage would be semantic, as you are putting strong emphasis on an item.
// Done ONCE per page load:
$search = "Hello World";
// Remove the quotes
$search = str_replace('"', '', $search);
// Get the words array
$words = explode(' ', $search);
// Clean the array: drop empty values and stop words, remove duplicates, etc.
// Stop words are matched as whole words; stripping 'or' as a substring would turn "World" into "Wld".
function keep_word($value) { return trim($value) != '' && !in_array(strtolower($value), array('and', 'or')); }
function regex_escape(&$value) { $value = preg_quote($value, '/'); }
$words = array_filter($words, 'keep_word');
$words = array_unique($words);
array_walk($words, 'regex_escape');
// PHP has no 'g' modifier; preg_replace() already replaces every match
$regex = '/(' . implode('|', $words) . ')/i';
// Done FOR EACH result
$result = "Something something hello there yes world fun nice";
$highlighted = preg_replace($regex, '<strong>$0</strong>', $result);
If you are using PostgreSQL, you can simply use the built-in ts_headline function, as described in the documentation.

Use preg_replace() (or a similar function) to wrap each occurrence of your search string in highlighting markup, e.g.:
$highlighted_text = preg_replace('/' . preg_quote($search, '/') . '/', '<span class="highlighted">$0</span>', $full_text);

-
Highlighting isn't my problem. I am wondering about the best way to go about getting the snippet _around_ the search term. – phirschybar Jul 28 '09 at 20:13
On a larger site, I would think that using JavaScript (something like jQuery) would be the way to go.
