13

My variable $content contains my text. I want to create an excerpt from $content that displays the first sentence and, if that sentence is shorter than 15 characters, the second sentence as well.

I've already tried taking the first 50 characters of the text, and it works:

<?php echo substr($content, 0, 50); ?>

But I'm not happy with the results (I don't want any words to be cut).

Is there a PHP function that returns whole words or sentences, rather than a raw substr?

Thanks a lot!

NotJay
anonymous
  • *(related)* [Truncate a multibyte String to n chars](http://stackoverflow.com/questions/2154220/truncate-a-multibyte-string-to-n-chars). The solution there cuts with respect to word boundaries. It's a duplicate if you don't care about sentences but only about words. – Gordon Jan 14 '11 at 14:34
  • possible duplicate: http://stackoverflow.com/questions/79960/how-to-truncate-a-string-in-php-to-the-word-closest-to-a-certain-number-of-charac – jasonbar Jan 14 '11 at 14:35

9 Answers

14

I figured it out, and it was pretty simple:

<?php
    $content = "My name is Luka. I live on the second floor. I live upstairs from you. Yes I think you've seen me before. ";
    $dot = ".";

    $position = strpos($content, $dot); // find the first dot position

    if ($position !== false) { // there is at least one dot in our source text
        $offset = $position + 1; // start searching just after the first dot
        $position2 = strpos($content, $dot, $offset); // find the second dot using the offset
        // fall back to the first sentence when there is no second dot
        $end = ($position2 !== false) ? $position2 : $position;
        $first_two = substr($content, 0, $end + 1); // first two sentences, including the closing dot

        echo $first_two;
    }
    // if there are no dots, output nothing
?>
Nifle
anonymous
  • 8
    Breaks for "My name is Luka. I was born 1.1.1953 in New York." => "My name is Luka. I was born 1." – Tomáš Fejfar Nov 10 '12 at 12:35
  • 1
    @TomášFejfar In that case, change `$dot = "."` to `$dot = ". "` (Add a space after the period) – NotJay Dec 23 '15 at 14:24
  • As a side note, if you have exclamation points that aren't being accounted for, you can do a `str_replace` to replace them with periods. `$content = str_replace('! ', '. ', $content);` – NotJay Dec 23 '15 at 14:58
  • These solutions don't work in all scenarios. What about the question mark? What about a user who mistypes and starts a new sentence immediately after the dot, or after the question mark :P, or who ends a sentence with a smiley? These methods can only be relied on for text you wrote yourself, and in that case you already know the first 2 sentences, so it's pointless. – err Dec 18 '20 at 15:14
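To illustrate the point raised in the comments about splitting on a dot followed by a space, here is a minimal sketch. It assumes a sentence ends with `.`, `!` or `?` followed by whitespace, so dates such as 1.1.1953 are left intact. The function name `first_sentences` is hypothetical, and abbreviations such as "Co. " will still be treated as sentence endings.

```php
<?php
// Hypothetical helper: returns the first $count sentences of $text.
// Assumption: a sentence ends with ., ! or ? followed by whitespace.
function first_sentences(string $text, int $count = 2): string
{
    // Split on whitespace that follows a terminator; the lookbehind
    // keeps the punctuation attached to its sentence.
    $sentences = preg_split('/(?<=[.!?])\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // Rejoin the requested number of sentences with single spaces.
    return implode(' ', array_slice($sentences, 0, $count));
}
```

Called with "My name is Luka. I was born 1.1.1953 in New York. Bye." and the default count, this keeps the date whole instead of cutting at "1.".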
9

Here's a quick helper method that I wrote to get the first N sentences of a given body of text. It takes periods, question marks, and exclamation points into account and defaults to 2 sentences.

function tease($body, $sentencesToDisplay = 2) {
    $nakedBody = preg_replace('/\s+/',' ',strip_tags($body));
    $sentences = preg_split('/(\.|\?|\!)(\s)/',$nakedBody);

    if (count($sentences) <= $sentencesToDisplay)
        return $nakedBody;

    $stopAt = 0;
    foreach ($sentences as $i => $sentence) {
        $stopAt += strlen($sentence);

        if ($i >= $sentencesToDisplay - 1)
            break;
    }

    $stopAt += ($sentencesToDisplay * 2);
    return trim(substr($nakedBody, 0, $stopAt));
}
broox
6

I know this is an old post but I was looking for the same thing.

preg_match('/^([^.!?]*[\.!?]+){0,2}/', strip_tags($text), $abstract);
echo $abstract[0];
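
A similar pattern could be adapted to the exact rule in the question (show the second sentence only when the first is shorter than 15 characters). The following is a sketch under assumptions, not a definitive implementation: the function name `excerpt` and the `$minLength` parameter are hypothetical, and length is measured in bytes with `strlen` (use `mb_strlen` for multibyte text).

```php
<?php
// Hypothetical helper implementing the rule from the question:
// return the first sentence, plus the second one when the first
// is shorter than $minLength characters.
function excerpt(string $content, int $minLength = 15): string
{
    $text = trim(strip_tags($content));

    // Each match is a run of non-terminators followed by one or more terminators.
    if (!preg_match_all('/[^.!?]*[.!?]+/', $text, $matches)) {
        return $text; // no sentence terminator found: return the text unchanged
    }

    $sentences = array_map('trim', $matches[0]);
    $first = $sentences[0];

    if (strlen($first) < $minLength && isset($sentences[1])) {
        return $first . ' ' . $sentences[1];
    }

    return $first;
}
```

Like the pattern above, this treats every dot as a sentence end, so dates and abbreviations will still split early.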
mathius1
6

There is one for words - wordwrap

Example Code:

<?php

for ($i = 10; $i < 26; $i++) {
    $wrappedtext = wordwrap("Lorem ipsum dolor sit amet", $i, "\n");
    echo substr($wrappedtext, 0, strpos($wrappedtext, "\n")) . "\n";
}

Output:

Lorem
Lorem ipsum
Lorem ipsum
Lorem ipsum
Lorem ipsum
Lorem ipsum
Lorem ipsum
Lorem ipsum dolor
Lorem ipsum dolor
Lorem ipsum dolor
Lorem ipsum dolor
Lorem ipsum dolor sit
Lorem ipsum dolor sit
Lorem ipsum dolor sit
Lorem ipsum dolor sit
Lorem ipsum dolor sit
Paul
  • 2
    `wordwrap` does not truncate strings but just inserts line breaks at a certain position. `mb_strimwidth` would truncate, but it does not obey word boundaries. – Gordon Jan 14 '11 at 14:40
  • 1
    yes, you are right... sorry for that one... BUT you could do something like substr($wrappedtext, 0, strpos($wrappedtext, $delimiter)) :) – Paul Jan 14 '11 at 14:53
  • @Paul which would still not obey word boundaries – Gordon Jan 14 '11 at 14:57
  • just tried it... it really does take care of word boundaries! you must not pass true as 4th argument... – Paul Jan 14 '11 at 15:07
  • @Paul please [update your answer](http://stackoverflow.com/posts/4692064/edit) with example code (that's good habit on SO anyway) to prove your point. I'll remove the downvote if it does, but I'd also be really surprised then. – Gordon Jan 14 '11 at 15:09
  • 1
    @Paul that will fail if there are newline characters already in the source string at an earlier position. Try "Lorem \n ipsum dolor sit amet". `wordwrap` does obey word boundaries, but `strpos` doesn't. – Gordon Jan 14 '11 at 15:24
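One way to address the newline problem raised in the last comment is to collapse whitespace before wrapping, so the first "\n" found is always one that `wordwrap` inserted. This is only a sketch; the function name `truncate_words` is hypothetical.

```php
<?php
// Hypothetical helper: keep as many whole words as fit in $width characters.
// Note: a single first word longer than $width is returned whole,
// because wordwrap() is called without its "cut long words" flag.
function truncate_words(string $text, int $width): string
{
    // Collapse existing newlines and repeated whitespace into single spaces,
    // so the only "\n" in the wrapped text is one added by wordwrap().
    $clean = trim(preg_replace('/\s+/', ' ', $text));

    $wrapped = wordwrap($clean, $width, "\n");
    $pos = strpos($wrapped, "\n");

    // No newline means the whole text already fits within $width.
    return $pos === false ? $wrapped : substr($wrapped, 0, $pos);
}
```

With the input from the comment, `truncate_words("Lorem \n ipsum dolor sit amet", 11)` yields "Lorem ipsum" rather than stopping at the embedded newline.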
4

For me the following worked:

$sentences = 2;
echo implode('.', array_slice(explode('.', $string), 0, $sentences)) . '.'; // join with '.' only: each piece already starts with its own space
michalzuber
4

I wrote a function to do something similar to this on one of our websites. I'm sure it could be tweaked to get your exact result out of it.

Basically, you give it a string of text and the number of words you want it trimmed to. It will then trim to that number of words. If the last word it finds doesn't end a sentence, it will continue past the number of words you specified until it reaches the end of the sentence. Hope it helps!

//This function intelligently trims a body of text to a certain
//number of words, but will not break a sentence.
function smart_trim($string, $truncation) {
    $matches = preg_split("/\s+/", $string);
    $count = count($matches);

    if($count > $truncation) {
        //Grab the last word; we need to determine if
        //it is the end of the sentence or not
        $last_word = strip_tags($matches[$truncation-1]);
        $lw_count = strlen($last_word);

        //The last word in our truncation has a sentence ender
        if($last_word[$lw_count-1] == "." || $last_word[$lw_count-1] == "?" || $last_word[$lw_count-1] == "!") {
            for($i=$truncation;$i<$count;$i++) {
                unset($matches[$i]);
            }

        //The last word in our truncation doesn't have a sentence ender, find the next one
        } else {
            //Check each word following the last word until
            //we determine a sentence's ending
            $ending_found = false;
            for($i=$truncation;$i<$count;$i++) {
                if(!$ending_found) {
                    //Inspect the word without HTML tags so a trailing tag doesn't hide the punctuation
                    $word = strip_tags($matches[$i]);
                    $last_char = substr($word, -1);
                    if($last_char == "." || $last_char == "?" || $last_char == "!") {
                        //The sentence ends here if no word follows or
                        //the next word starts with a capital
                        if(!isset($matches[$i+1]) || $matches[$i+1] === '' || $matches[$i+1][0] == strtoupper($matches[$i+1][0])) {
                            $ending_found = true;
                        }
                    }
                } else {
                    unset($matches[$i]);
                }
            }
        }

        //Check to make sure we still have a closing <p> tag at the end
        $body = implode(' ', $matches);
        if(substr($body, -4) != "</p>") {
            $body = $body."</p>";
        }

        return $body; 
    } else {
        return $string;
    }
}
Michael Irigoyen
2

This makes sure it never returns a half-word:

$short = substr($content, 0, 100);
$short = explode(' ', $short);
array_pop($short);
$short = implode(' ', $short);
print $short;
Matt Lowden
  • `$summary = implode(' ',array_pop(explode(' ', substr($content, 0,500))));` `$afterSummary = implode(' ',array_shift(explode(' ', substr($summary, 500))));` Thanks – CrandellWS Jul 27 '15 at 04:23
  • though the code in my comment does not work out of the box, you should be able to sort it out... – CrandellWS Jul 27 '15 at 04:31
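Note that the snippet in this answer always drops the last word, even when $content is shorter than the limit or the cut happens to fall exactly on a word boundary. A sketch of a variant that avoids this is below. The function name `cut_at_word` is hypothetical, and it works on bytes, so multibyte text would need the `mb_*` equivalents.

```php
<?php
// Hypothetical helper: cut $content to at most $limit bytes without
// splitting a word, leaving short input untouched.
function cut_at_word(string $content, int $limit = 100): string
{
    if (strlen($content) <= $limit) {
        return $content; // nothing to cut
    }

    // Take one extra character so a space sitting right at the limit is detected.
    $short = substr($content, 0, $limit + 1);
    $lastSpace = strrpos($short, ' ');

    // No space at all: fall back to a hard cut at the limit.
    if ($lastSpace === false) {
        return substr($content, 0, $limit);
    }

    return rtrim(substr($short, 0, $lastSpace));
}
```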
1

Here's a function modified from another I found online; it strips out any HTML, and cleans up some funky MS characters first; it then adds in an optional ellipsis character to the content to show that it's been shortened. It correctly splits at a word, so you won't have seemingly random characters;

/**
 * Function to ellipse-ify text to a specific length
 *
 * @param string $text   The text to be ellipsified
 * @param int    $max    The maximum number of characters (to the word) that should be allowed
 * @param string $append The text to append to $text
 * @return string The shortened text
 * @author Brenley Dueck
 * @link   http://www.brenelz.com/blog/2008/12/14/creating-an-ellipsis-in-php/
 */
function ellipsis($text, $max=100, $append='&hellip;') {
    if (strlen($text) <= $max) return $text;

    $replacements = array(
        '|<br /><br />|' => ' ',
        '|&nbsp;|' => ' ',
        '|&rsquo;|' => '\'',
        '|&lsquo;|' => '\'',
        '|&ldquo;|' => '"',
        '|&rdquo;|' => '"',
    );

    $patterns = array_keys($replacements);
    $replacements = array_values($replacements);


    $text = preg_replace($patterns, $replacements, $text); // apply the replacements above (double <br /> and HTML entities to plain characters)
    $text = strip_tags($text); // remove any html.  we *only* want text
    $out = substr($text, 0, $max);
    if (strpos($text, ' ') === false) return $out.$append;
    return preg_replace('/(\W)&(\W)/', '$1&amp;$2', (preg_replace('/\W+$/', ' ', preg_replace('/\w+$/', '', $out)))) . $append;
}

Input:

<p class="body">The latest grocery news is that the Kroger Co. is testing a new self-checkout technology. My question is: What&rsquo;s in it for me?</p> <p>Kroger said the system, from Fujitsu,

Output:

The latest grocery news is that the Kroger Co. is testing a new self-checkout technology. My question is: What's in it for me? Kroger said the …

Glen Solsberry
-2

If I were you, I'd choose to pick only the first sentence.

$t='Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vestibulum justo eu leo.'; //input text
$fp=explode('. ',$t); //first phrase
echo $fp[0].'.'; //note I added the final punctuation

This would simplify things a lot.

Roger