
I have a long string of text. I want to store it in an array, two sentences per element. I think it should be done by exploding the text on a period followed by a space; however, there are abbreviations like 'Mr.' which I don't know how to exclude from the explode function.

I also don't know how to adjust it to split the text every 2 sentences instead of every 1.

Liftoff
user3078775

2 Answers


Maybe something like:

// Minimum length (in characters) a chunk must reach before it can be closed
$min_sentence_length = 100;

$ignore_words = array('mr.','ms.');

$text = "some texing alsie urj skdkd. and siks ekka lls. lorem ipsum some.";

$parts = explode(" ", $text);

$sentences = array();

$cur_sentence = "";

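foreach($parts as $part) {

  // Append the word first so the word that ends a sentence stays with it
  $cur_sentence .= $part . " ";

  // Check sentence min length and is there period
  // (strtolower() so "Mr." matches the lowercase entries in $ignore_words)
  if (strlen($cur_sentence) > $min_sentence_length && 
    substr($part,-1) == "." && !in_array(strtolower($part), $ignore_words)) {

    $sentences[] = $cur_sentence;
    $cur_sentence = "";
  }
}

// Keep whatever is left over after the last split
if (strlen($cur_sentence) > 0)
  $sentences[] = $cur_sentence;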
Hardy

The comments on your question link to answers that use preg_split() instead of explode(), which lets you describe more precisely how and when to split the input. That might work for you. Another approach would be to split your input on every occurrence of ". " into a temporary array, then loop through that array, piecing it back together however you like, e.g.

$tempArray = explode('. ', $input);

$outputArray = array();
$outputElement = '';
$sentenceCount = 0;

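foreach($tempArray as $part){
  //explode() dropped the '. ' separators, so add one back; rtrim() keeps
  //the last piece, which may still end in a period, from getting two
  $outputElement .= rtrim($part, '.') . '. ';

  //put other exceptions here, not just "Mr."
  //$part is a whole fragment such as "Hello Mr", so test how it ends
  if (!preg_match('/\bMr$/', $part)){
    $sentenceCount++;
  }

  if ($sentenceCount == 2){
    $outputArray[] = $outputElement;
    $outputElement = '';
    $sentenceCount = 0;
  }
}

//keep a leftover sentence when the total count is odd
if ($outputElement != ''){
  $outputArray[] = $outputElement;
}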
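
For the preg_split() route mentioned above, here is a minimal sketch, not a definitive solution. The function name splitIntoSentencePairs() and the abbreviation list (Mr., Ms., Mrs., Dr.) are my own assumptions for illustration; extend the list for your text. It assumes a sentence ends with ".", "!" or "?" followed by whitespace, and it uses array_chunk() to group the sentences two at a time.

```php
<?php
// Hypothetical helper: the name and the abbreviation list are illustrative only.
function splitIntoSentencePairs(string $text, int $perElement = 2): array
{
    // Split on whitespace that follows ".", "!" or "?", unless the period
    // belongs to one of the listed abbreviations. Each negative lookbehind
    // is fixed-length, which PCRE requires.
    $pattern = '/(?<!\bMr\.)(?<!\bMs\.)(?<!\bMrs\.)(?<!\bDr\.)(?<=[.!?])\s+/';

    $sentences = preg_split($pattern, trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // Group the sentences, then glue each group back into one string.
    return array_map(
        function (array $group): string {
            return implode(' ', $group);
        },
        array_chunk($sentences, $perElement)
    );
}

$text = 'Mr. Smith went home. He was tired. Then he slept. The end.';
print_r(splitIntoSentencePairs($text));
```

With that sample text you should get two elements: "Mr. Smith went home. He was tired." and "Then he slept. The end.". If the number of sentences is odd, the last element simply holds the one remaining sentence.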
Brad