I want to split a text into paragraphs, and each paragraph should have fewer than a given number of words. For example:

Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. 

Paragraph 1: 
Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.

Paragraph 2: 
Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. 

In the above example, fewer than 20 words are in paragraph 1 and the rest are in paragraph 2.

Is there any way to achieve this using PHP?

I have tried $abc = explode(' ', $str, 20); which I expected to store 20 words in an array and put the rest of them in the last element, $abc['21']. How can I extract the first 20 elements as the first paragraph and the rest as the second paragraph?

Athafoud
MahiloDai
  • Your last paragraph 'I have tried ...' is completely wrong, please rephrase it. – Athafoud Jun 30 '14 at 11:47
  • You can try converting your string into an array and then storing the first 20 characters in one string and the rest in another. – Aradhna Jun 30 '14 at 11:47
  • Simply use `implode` after `explode`ing your sentence. http://stackoverflow.com/questions/5956610/how-to-select-first-10-words-of-a-sentence – TribalChief Jun 30 '14 at 11:49
  • How is your first paragraph less than 20 words? It is clearly more than that in your example. – hwnd Jun 30 '14 at 11:51

1 Answer

First split the string into sentences. Then loop over the sentences array: append each sentence to the current element of a paragraphs array, count the words in that element, and if the count is greater than 19, increment the paragraph counter so the next sentence starts a new paragraph.

$string = 'Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.';

$sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $string);

$ii = 0;
$paragraphs = array();
foreach ( $sentences as $value ) {
    // Append the sentence to the current paragraph, separated by a space.
    if ( isset($paragraphs[$ii]) ) { $paragraphs[$ii] .= ' ' . $value; }
    else { $paragraphs[$ii] = $value; }
    // Start a new paragraph once the current one has 20 or more words.
    if ( 19 < str_word_count($paragraphs[$ii]) ) {
        $ii++;
    }
}
print_r($paragraphs);

Output:

Array
(
    [0] => Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.
    [1] => Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.
)
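If the word threshold needs to vary, the same idea could be wrapped in a function. This is only a sketch: the name splitIntoParagraphs and the $minWords parameter are made up for illustration and are not part of the code above. It joins sentences with a single space and keeps any trailing sentences that never reach the threshold.

```php
<?php
// Hypothetical helper (not from the original answer): groups sentences into
// paragraphs, closing a paragraph once it holds at least $minWords words.
function splitIntoParagraphs(string $text, int $minWords = 20): array
{
    // Same sentence splitter as in the answer: break after ., ?, ! or ;
    // when the next sentence starts with an uppercase letter.
    $sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', trim($text));

    $paragraphs = [];
    $current = '';
    foreach ($sentences as $sentence) {
        // Join sentences inside a paragraph with a single space.
        $current = ($current === '') ? $sentence : $current . ' ' . $sentence;

        // str_word_count() counts alphabetic words only, so tokens such as
        // "45" or "2000" do not add to the total.
        if (str_word_count($current) >= $minWords) {
            $paragraphs[] = $current;
            $current = '';
        }
    }

    // Keep trailing sentences that never reached the threshold.
    if ($current !== '') {
        $paragraphs[] = $current;
    }

    return $paragraphs;
}
```

Calling splitIntoParagraphs($string) on the sample text should give the same two paragraphs; passing a different second argument changes where paragraphs are closed.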

Sentence splitter found here: Splitting paragraphs into sentences with regexp and PHP
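On the explode() attempt in the question: with a positive limit, explode() returns at most that many elements and the last one holds the whole remainder, so there is no element 21. If a hard cut after a fixed number of words is acceptable (it will usually fall mid-sentence, unlike the approach above), a rough sketch is to split fully and slice. The sample string and $limit below are placeholders chosen for illustration.

```php
<?php
$str   = 'one two three four five six seven';
$limit = 3; // hypothetical word limit, kept small for illustration

// With a positive limit, explode() returns at most $limit elements and the
// last element contains the unsplit remainder of the string.
$parts = explode(' ', $str, $limit);
// $parts is ['one', 'two', 'three four five six seven']

// To cut after a fixed number of words, split fully and slice instead.
$words = explode(' ', $str);
$first = implode(' ', array_slice($words, 0, $limit)); // 'one two three'
$rest  = implode(' ', array_slice($words, $limit));    // 'four five six seven'
```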

Community
bloodyKnuckles