4

I have a decent, lightweight search engine working for one of my sites, using MySQL fulltext indexes and PHP to parse the results. It works fine, but I'd like to offer more 'Google-like' results, with text snippets from the results and the found words highlighted. I'm looking for a PHP-based solution. Any recommendations?

Martijn Laarman
phirschybar

4 Answers

5

Searching the actual database is fine until you want to add snazzy features like the one above. In my experience it is best to create a dedicated search table, with keywords and page IDs/URLs/etc. Then populate this table every n hours with content. During this population you can add snippets for each document for each keyword.
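To make the dedicated-table idea concrete, here is a minimal sketch of the population step. It is only an illustration under assumptions: `make_snippet()` and `build_search_rows()` are hypothetical names, the pages and keywords are hard-coded, and a plain array stands in for the search table that a real job would fill with INSERT statements every n hours. It also assumes single-byte text; UTF-8 content would need the `mb_*` string functions.

```php
<?php
// Hypothetical helper: returns an excerpt of $text around the first
// case-insensitive occurrence of $keyword, or null if the keyword is absent.
function make_snippet(string $text, string $keyword, int $size = 20): ?string
{
    $pos = stripos($text, $keyword);
    if ($pos === false) {
        return null;
    }
    $start = max(0, $pos - $size);
    // substr() takes a length, so count from $start to $size characters past the keyword
    $length = ($pos - $start) + strlen($keyword) + $size;
    return substr($text, $start, $length);
}

// Hypothetical batch job: builds one (keyword, page_id, snippet) row for every
// keyword that occurs in a page. A real job would write these rows to the search table.
function build_search_rows(array $pages, array $keywords): array
{
    $rows = [];
    foreach ($pages as $pageId => $text) {
        foreach ($keywords as $keyword) {
            $snippet = make_snippet($text, $keyword);
            if ($snippet !== null) {
                $rows[] = ['keyword' => $keyword, 'page_id' => $pageId, 'snippet' => $snippet];
            }
        }
    }
    return $rows;
}

// Sample data, for illustration only
$pages = [
    1 => 'This is an example text page with content. It could be red, green or blue.',
    2 => 'Another page that only mentions green.',
];
$rows = build_search_rows($pages, ['red', 'green']);
```

At search time the snippet is then a simple lookup by keyword and page ID, so no text processing happens per request.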

Alternatively a quick hack might be:

<?php
$text = 'This is an example text page with content. It could be red, green or blue.';
$keyword = 'red';
$size = 5; // size of snippet either side of keyword

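$pos = strpos($text, $keyword); // false if the keyword is not found
if ($pos !== false) {
    $start = max(0, $pos - $size); // never start before the beginning of the text
    // substr() takes a length (not an end offset); strlen(), not sizeof(), measures a string
    $snippet = '...'.substr($text, $start, ($pos - $start) + strlen($keyword) + $size).'...';
    $snippet = str_replace($keyword, '<strong>'.$keyword.'</strong>', $snippet);
    echo $snippet;
}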
?>
Al.
  • I like this solution. I have fairly light traffic so the extra processing on each search request is fine. Thanks!! – phirschybar Jul 31 '09 at 18:54
  • Might want to change strpos to stripos, and change the str_replace to a preg_replace in order to maintain case, i.e., if searching 'Burgers' you don't want the bolded text to be changed to 'burgers': preg_replace('/('.$keyword.')/i', '<strong>$1</strong>', $snippet); – Sherri Mar 19 '17 at 22:15
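For reference, the quick hack and the case-preserving suggestion can be combined into one function. This is a rough sketch, not a definitive implementation: `snippet_with_highlight()` is a hypothetical name, the HTML escaping and the `preg_quote()` call are additions that the answer does not include, and the byte-based `substr()` assumes single-byte text (UTF-8 content would need the `mb_*` equivalents).

```php
<?php
// Hypothetical helper: returns an HTML-safe excerpt around the first
// case-insensitive match of $keyword, with the match wrapped in <strong>.
function snippet_with_highlight(string $text, string $keyword, int $size = 20): string
{
    $pos = stripos($text, $keyword);
    if ($pos === false) {
        // Keyword not found: fall back to the start of the text, unhighlighted
        return htmlspecialchars(substr($text, 0, $size * 2));
    }

    // Clamp the window to the bounds of the text
    $start = max(0, $pos - $size);
    $end = min(strlen($text), $pos + strlen($keyword) + $size);
    $excerpt = htmlspecialchars(substr($text, $start, $end - $start));

    // Escape the keyword the same way as the text, then quote it for the regex.
    // $0 re-inserts the text as it was matched, so the original casing is kept.
    $pattern = '/' . preg_quote(htmlspecialchars($keyword), '/') . '/i';
    $highlighted = preg_replace($pattern, '<strong>$0</strong>', $excerpt);

    // Only add an ellipsis on a side where text was actually cut off
    return ($start > 0 ? '...' : '') . $highlighted . ($end < strlen($text) ? '...' : '');
}

echo snippet_with_highlight('We sell Burgers and fries.', 'burgers', 5);
```

Searching for 'burgers' here keeps the capital B of 'Burgers' in the output, which is the behaviour the comment asks for.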
3

For MySQL, your best bet would be to first split up your query words, clean up your values, and then concatenate everything back into a nice regular expression.

In order to highlight your results, you can use the <strong> tag. Its usage would be semantic as you are putting strong emphasis on an item.

// Done ONCE per page load:
  $search = "Hello World";

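  // Remove the quotes
  $search = str_replace('"', '', $search);

  // Get the words array
  $words = explode(' ', $search);

  // Clean the array: drop empty values and stop words, remove duplicates, etc.
  // Stop words are filtered as whole words; stripping them from the raw string
  // would also remove the "or" inside "World".
  function remove_empty_values($value) { return trim($value) != ''; }
  function remove_stop_words($value) { return !in_array(strtolower($value), array('and', 'or')); }
  function regex_escape(&$value) { $value = preg_quote($value, '/'); }
  $words = array_filter($words, 'remove_empty_values');
  $words = array_filter($words, 'remove_stop_words');
  $words = array_unique($words);
  array_walk($words, 'regex_escape');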

  // PCRE in PHP has no "g" modifier; preg_replace() already replaces every match
  $regex = '/(' . implode('|', $words) . ')/i';

// Done FOR EACH result
  $result = "Something something hello there yes world fun nice";
  $highlighted = preg_replace($regex, '<strong>$0</strong>', $result);

If you are using PostgreSQL, you can simply use the built-in ts_headline as described in the documentation.

adamdehaven
Andrew Moore
1

Use preg_replace() (or a similar function) to replace your search string with highlighted text, e.g.:

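// preg_quote() keeps special characters in $search from breaking the pattern; $0 is the matched text
$highlighted_text = preg_replace('/' . preg_quote($search, '/') . '/', '<span class="highlighted">$0</span>', $full_text);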
Brian Ramsay
  • Highlighting isn't my problem. I am wondering about the best way to go about getting the snippet _around_ the search term. – phirschybar Jul 28 '09 at 20:13
-1

On a larger site, I would think that using JavaScript (something like jQuery) would be the way to go.

JasonDavis