
I have a search result that strictly counts the number of characters before and after the SEARCH TERM when cutting off the full string. Unfortunately, this causes the output to cut off words in the middle (with an ellipsis before and after the counted characters).

I am trying to have the search result cut off the full string ONLY at white space vs. in the middle of a word.

Here is the function:

private function _highlight_results(){

    $GLOBALS['_SEARCH_SUMMARY_LENGTH'] = 24;

    foreach($this->results as $url => &$this_result){
        if(!$this_result['url_display'] && $this_result['url']){
            $this_result['url_display'] = $this_result['url'];
        }
        foreach($this_result['search_term'] as $search_term){
            $search_term = preg_quote($search_term,'/');

            foreach(array('title','summary','url_display') as $highlight_item){
                if($this_result[$highlight_item] && preg_match('/'.$search_term.'/i',$this_result[$highlight_item])){
                    if($highlight_item != 'url_display' && strlen($this_result[$highlight_item]) > $GLOBALS['_SEARCH_SUMMARY_LENGTH']){
                        // number of context characters allowed on each side of the search term
                        $half_window = ceil(($GLOBALS['_SEARCH_SUMMARY_LENGTH']-strlen($this->_search_term))/2);
                        preg_match('/(.{0,'.$half_window.'})('.$search_term.')(.{0,'.$half_window.'})/i',$this_result[$highlight_item],$matches);
                        // want to even out the strings a bit so if highlighted term is at end of string, put more characters in front.
                        $before_limit = $after_limit = ($half_window - 2);
                        if(strlen($matches[1])>=$before_limit && strlen($matches[3])>=$after_limit){
                            // leave limit alone.
                        }else if(strlen($matches[1])<$before_limit){
                            $after_limit += $before_limit - strlen($matches[1]);
                            $before_limit = strlen($matches[1]);
                            preg_match('/(.{0,'.($before_limit+2).'})('.$search_term.')(.{0,'.($after_limit+2).'})/i',$this_result[$highlight_item],$matches);
                        }else if(strlen($matches[3])<$after_limit){
                            $before_limit += $after_limit - strlen($matches[3]);
                            $after_limit = strlen($matches[3]);
                            preg_match('/(.{0,'.($before_limit+2).'})('.$search_term.')(.{0,'.($after_limit+2).'})/i',$this_result[$highlight_item],$matches);
                        }
                        $this_result[$highlight_item] = (strlen($matches[1])>$before_limit) ? '...'.substr($matches[1],-$before_limit) : $matches[1];
                        $this_result[$highlight_item] .= $matches[2];
                        $this_result[$highlight_item] .= (strlen($matches[3])>$after_limit) ? substr($matches[3],0,$after_limit).'...' : $matches[3];

                    }

                }else if(strlen($this_result[$highlight_item]) > $GLOBALS['_SEARCH_SUMMARY_LENGTH']){
                    $this_result[$highlight_item] = substr($this_result[$highlight_item],0,$GLOBALS['_SEARCH_SUMMARY_LENGTH']).'...';
                }
            }
        }

        foreach($this_result['search_term'] as $search_term){
            $search_term = preg_quote($search_term,'/');

            foreach(array('title','summary','url_display') as $highlight_item){
                $this_result[$highlight_item] = preg_replace('/'.$search_term.'/i','<span id="phpsearch_resultHighlight">$0</span>',$this_result[$highlight_item]);
            }
        }
    }
}

Here's what I was thinking: just before displaying the output string, the script should loop through it with a function that looks for an ellipsis immediately followed by a character, then removes the characters AFTER the ellipsis until white space is found. A second pass would look for a character immediately followed by an ellipsis, then remove the characters BEFORE the ellipsis until white space is found.

Here's some very sad pseudo code of my description above:

WHILE (not the end of the string) {
 // NOT SURE IF I NEED A FOREACH LOOP HERE TO CHECK EACH CHAR
    IF ( ^ ('...' and an immediate char are found) ) {
           delete chars until a white space is found;

            // if '...' is deleted along with the chars, then put the '...' back in:
            //string .= '...' . string;
    }
    IF ( $ (a char and an immediate '...' are found) ) {
           delete chars until a white space is found;

            // if '...' is deleted along with the chars, then put the '...' back in:
            //string .= string . '...';
    }
}
PRINT string;

I think you can get the idea of what I'm looking for from the stuff above. I have researched and tested wordwrap() but still have not found THE answer.
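One way to get this effect without post-processing is to snap the excerpt edges to white space while the excerpt is being cut. The following is only a minimal sketch under assumptions: `excerpt_around()` and its `$radius` parameter are hypothetical names that do not appear in the class above, it handles a single search term, and it counts bytes rather than characters (so multibyte text would need the `mb_*` functions).

```php
<?php
// Hypothetical helper (not part of the original class): cut an excerpt of
// roughly $radius bytes on each side of $term, moving both edges inward so
// that they land on spaces instead of inside a word.
function excerpt_around(string $text, string $term, int $radius = 12, string $ellipsis = '...'): string
{
    $hit = stripos($text, $term);
    if ($hit === false) {
        return $text; // term not present, nothing to centre on
    }

    $termEnd = $hit + strlen($term);
    $start   = max(0, $hit - $radius);
    $end     = min(strlen($text), $termEnd + $radius);

    // Left edge not at a word start: move forward to just after the next
    // space, but never past the start of the term itself.
    if ($start > 0 && $text[$start - 1] !== ' ') {
        $space = strpos($text, ' ', $start);
        $start = ($space !== false && $space < $hit) ? $space + 1 : $hit;
    }

    // Right edge not at a word end: move back to the last space before it,
    // but never in front of the end of the term.
    if ($end < strlen($text) && $text[$end] !== ' ') {
        $space = strrpos(substr($text, 0, $end), ' ');
        $end   = ($space !== false && $space >= $termEnd) ? $space : $termEnd;
    }

    $body = trim(substr($text, $start, $end - $start));

    return ($start > 0 ? $ellipsis . ' ' : '')
        . $body
        . ($end < strlen($text) ? ' ' . $ellipsis : '');
}

echo excerpt_around('the quick brown fox jumps over the lazy dog', 'fox', 8), PHP_EOL;
// prints "... brown fox jumps ..."
```

Because the edges only ever move inward, the excerpt can come out shorter than the requested radius, which seems an acceptable trade for never splitting a word.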

mar2195
  • I suggest you search this site for wordwrap with regular expressions, which should give you some pointers as it's similar. – hakre Apr 25 '12 at 08:52
  • Try this link, it may help you: http://stackoverflow.com/a/26098951/3944217 – Edwin Thomas Sep 29 '14 at 12:08

1 Answer


Here's an approach that should work fine and also be quite performant. The only drawback is that it breaks words only on spaces as it stands, and this cannot be trivially fixed because there is no strrspn function to complement strspn (but one could be easily written and used to extend this solution).

function display_short($str, $limit, $ellipsis = '...') {
    // if all of it fits there's nothing to do
    if (strlen($str) <= $limit) {
        return $str;
    }

    // $ellipsis will count towards $limit
    $limit -= strlen($ellipsis);

    // find the last space ("word boundary")
    $pos = strrpos($str, ' ', $limit - strlen($str));

    // if none found, prefer breaking into the middle of
    // "the" word instead of just giving up
    if ($pos === false) {
        $pos = $limit;
    }

    return substr($str, 0, $pos).$ellipsis;
}

Test with:

$string = "the quick brown fox jumps over the lazy dog";
for($limit = 10; $limit <= strlen($string); $limit += 10) {
    // one shortened string per line, so the results are easy to compare
    echo display_short($string, $limit), PHP_EOL;
}

See it in action.
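The drawback mentioned above (breaking only on spaces) could be lifted with a small reverse-scanning helper. The sketch below is an assumption-laden illustration, not part of the answer: `strrcspn_sketch()` and `display_short_any()` are hypothetical names, and the default boundary set (space, tab, newline, hyphen) is just an example.

```php
<?php
// Hypothetical helper: length of the trailing run of $str that contains
// none of the characters in $mask (a reverse counterpart to strcspn()).
function strrcspn_sketch(string $str, string $mask): int
{
    $run = 0;
    for ($i = strlen($str) - 1; $i >= 0; $i--) {
        if (strpos($mask, $str[$i]) !== false) {
            break; // reached a boundary character
        }
        $run++;
    }
    return $run;
}

// Variant of display_short() that breaks on any character in $boundaries.
function display_short_any(string $str, int $limit, string $boundaries = " \t\n-", string $ellipsis = '...'): string
{
    if (strlen($str) <= $limit) {
        return $str; // everything fits
    }

    // the ellipsis counts towards the limit
    $limit = max(0, $limit - strlen($ellipsis));

    // look at the first $limit + 1 bytes and find the last boundary in them
    $head = substr($str, 0, $limit + 1);
    $pos  = strlen($head) - strrcspn_sketch($head, $boundaries) - 1;

    // no boundary at all: break inside the word rather than give up
    if ($pos < 0) {
        $pos = $limit;
    }

    return substr($str, 0, $pos) . $ellipsis;
}

echo display_short_any('well-known phrase', 9), PHP_EOL; // prints "well..."
```

Like the original function, this works on bytes, so multibyte strings would need a different approach.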

Jon
  • I have tried this function but receive this error: Call to undefined function display_short() ... any ideas? – mar2195 Apr 25 '12 at 19:17
  • At this point, I think the simplest way to handle this is to ask someone for a regex something like this: FOREACH CHAR { preg_replace( SEARCH FROM BEGINNING OF STRING -then- FIND '...' -then- IF CHAR IS NOT WHITE SPACE, DELETE CHAR (or move on to FIND the white space?) -then- IF CHAR IS WHITE SPACE, BREAK ) } and then another preg_replace for the ENDING '...' – mar2195 Apr 25 '12 at 19:42
  • @user1355539: I think the simplest way is to compare the live example on Ideone with your code and see where you went wrong. It can't be that hard. :) – Jon Apr 25 '12 at 22:18
  • I figured out the solution for my posting. Just before any output is displayed, add this code: [code] for($i=0;$i0;$i--){ if($summary_edit[$i] != ' '){ $summary_edit[$i] = ''; } else { break; } } echo '... '.$summary_edit.' ...'; [/code] – mar2195 Apr 25 '12 at 22:49
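
For readers who want the regex version requested in the comments, here is a hedged sketch. It is not a reconstruction of the loop in the last comment, whose code is truncated in this copy of the thread. `trim_partial_words()` is a hypothetical name, the function assumes the ellipsis is the literal `...`, and it should run before the highlight `<span>` is inserted, since the markup itself contains a space.

```php
<?php
// Hypothetical post-processing step: drop the word fragment that sits
// directly against a leading or trailing "..." in an already-cut snippet.
// If $term is given, an edit that would remove the term is skipped.
function trim_partial_words(string $snippet, string $term = ''): string
{
    $rules = [
        '/^\.\.\.\S+\s+/' => '... ',  // "...ick brown" -> "... brown"
        '/\s+\S+\.\.\.$/' => ' ...',  // "jumps ov..."  -> "jumps ..."
    ];

    foreach ($rules as $pattern => $replacement) {
        $trimmed = preg_replace($pattern, $replacement, $snippet) ?? $snippet;

        // keep the edit only if the search term survived it
        if ($term === '' || stripos($trimmed, $term) !== false) {
            $snippet = $trimmed;
        }
    }

    return $snippet;
}

echo trim_partial_words('...ick brown fox jumps ov...', 'fox'), PHP_EOL;
// prints "... brown fox jumps ..."
```

One limitation to keep in mind: the function cannot tell a whole word from a fragment, so a complete word that happens to touch the ellipsis is dropped as well.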