0

$chapter is a string that stores a chapter of a book, 10,000 to 15,000 characters long. I want to break the string into segments of at least 1000 characters, but only break at the next whitespace after that point, so that I don't split a word. The code below runs successfully for about 9 iterations and then hits a runtime error:

"Fatal error: Maximum execution time of 30 seconds exceeded in D:\htdocs\test.php on line 16"

<?php
$chapter = "10000 characters";
$len = strlen($chapter);
$i=0; 
do{$key="a";
  for($k=1000;($key != " ") && ($i <= $len); $k = $k+1) {
    $j=$i+$k; echo $j;
    $key = substr($chapter,$j,1);
  }
  $segment =  substr ($chapter,$i,$k);
  $i=$j;
echo ($segment);
} while($i <= $len);
?>
Jon Le
  • Why are you doing that? How are you getting the 'chapters'? – putvande Aug 28 '13 at 17:09
  • This has been asked before and is easy to find on Google: http://stackoverflow.com/questions/16171132/how-to-increase-maximum-execution-time-in-php – 3seconds Aug 28 '13 at 17:13

5 Answers

1

I think your approach has too much overhead. Increasing max_execution_time will help, but not everyone is able to modify their server settings. This simple loop split 15000 bytes of lorem ipsum text (2k words) into segments of up to 1000 characters. I assume it would handle more just as well, since the execution time was fairly quick.

//Define variables
$chapter = "15000 bytes of lorem ipsum here";
$sections = array();

//Start splitting
while( true ) {

    //Get current length of $chapter
    $len = strlen($chapter);

    //If $chapter is longer than 1000 characters
    if( $len > 1000 ) {

        //Get position of the last space within the first 1000 characters
        $x = strrpos( substr( $chapter, 0, 1000), " ");

        //If $x is not FALSE - found a space. Strict check, because position 0 is falsy.
        if( $x !== false ) {

            //Add to $sections array, assign remainder (without the space) to $chapter again
            $sections[] = substr( $chapter, 0, $x );
            $chapter = substr( $chapter, $x + 1 );

        //If $x is FALSE - no space in the first 1000 characters
        } else {

            //Add the unsplit remainder to $sections for debugging. Break loop.
            $sections[] = $chapter;
            break;
        }

    //If remaining $chapter is not longer than 1000, simply add to array and break.
    } else {
        $sections[] = $chapter;
        break;
    }
}
print_r($sections);

Edit:

  • Tested with 5k words (33K bytes); it ran in a fraction of a second and divided the text into 33 segments. (Whoops, I had it set to divide into 10K character segments before.)

  • Added verbose comments to the code to explain what everything does.
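
For comparison, the same "break at the last space before the limit" idea can be sketched with PHP's built-in wordwrap(), which is a different technique from the loop above. The helper name splitByWordwrap is hypothetical, and the sketch assumes the text contains no newlines of its own, since "\n" is used as the temporary separator:

```php
<?php
// Hypothetical helper (not part of the answer above): split $text into
// segments of at most $max bytes, breaking only at spaces.
// Assumes $text contains no "\n", because "\n" is used as the separator.
function splitByWordwrap($text, $max = 1000)
{
    // wordwrap() replaces the chosen spaces with the break string.
    // The last argument (false) means words longer than $max are not cut.
    $wrapped = wordwrap($text, $max, "\n", false);

    // Split on the separator to get the individual segments.
    return explode("\n", $wrapped);
}
```

Calling print_r(splitByWordwrap($chapter, 1000)) would list the segments. Like the loop above, this caps segments at 1000 characters rather than guaranteeing a minimum of 1000.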

Jason
0

You are always reading $chapter from the start. You should remove the characters you have already read from $chapter, so that you never scan much more than 10,000 characters. If you do this, you must also adjust the loops accordingly.

Lajos Arpad
0

Try

set_time_limit(240);

at the beginning of the code. (This is the ThrowSomeHardwareAtIt approach.)

0

It can be done in a single line, which speeds up your code a lot.

echo $segment = substr($chapter, 0, strpos($chapter, " ", 1000));

It takes the substring of the chapter from the start up to the first space at or after character 1000. Note that this yields only the first segment.
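
To get every segment rather than just the first, the same strpos() call can be repeated with an advancing offset. The following is a minimal sketch under that assumption; the function name splitAtNextSpace and its parameters are hypothetical, not something from the answer:

```php
<?php
// Hypothetical helper: split $text into segments of at least $min bytes,
// each ending at the next space, without re-copying the string each round.
function splitAtNextSpace($text, $min = 1000)
{
    $segments = array();
    $start = 0;
    $len = strlen($text);

    while ($start < $len) {
        // $min bytes or fewer remain: the remainder is the last segment.
        if ($len - $start <= $min) {
            $segments[] = substr($text, $start);
            break;
        }

        // First space at or after $min bytes into the current segment.
        $pos = strpos($text, ' ', $start + $min);

        // No space left: keep the remainder whole. The strict comparison
        // matters because strpos() signals failure with false.
        if ($pos === false) {
            $segments[] = substr($text, $start);
            break;
        }

        $segments[] = substr($text, $start, $pos - $start);
        $start = $pos + 1; // skip the space itself
    }

    return $segments;
}
```

Usage would look like foreach (splitAtNextSpace($chapter, 1000) as $segment) { echo $segment, "\n"; }. Checking the remaining length before calling strpos() keeps the offset inside the string, which avoids the warning (or error, depending on the PHP version) for an out-of-range offset.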

Christiaan
0

Here is a simple function to do that

$chapter = "Your full chapter";
breakChapter($chapter,1000);

function breakChapter($chapter,$size){
    do{
       if(strlen($chapter)<$size){
           $segment=$chapter;
           $chapter='';
       }else{
           $pos=strpos($chapter,' ', $size);
           if ($pos===false){
               $segment=$chapter;
               $chapter='';
           }else{
               $segment=substr($chapter,0,$pos);
               $chapter=substr($chapter,$pos+1);
           }
       }
       echo $segment. "\n";
    }while ($chapter!='');
}

Checking each character individually is not a good option; it is resource- and time-intensive.

PS: I have not tested this (I just typed it in here), and it may not be the best way to do this, but the logic works!

bansi
  • where are you going to display the result? in browser? console? what is your character set? You can try [htmlspecialchars](http://php.net/manual/en/function.htmlspecialchars.php) or [htmlentities](http://php.net/manual/en/function.htmlentities.php) if you want the output to browser. – bansi Aug 29 '13 at 03:20
  • to know the difference between htmlspecialchars and htmlentities check this link http://stackoverflow.com/questions/46483/htmlentities-vs-htmlspecialchars – bansi Aug 29 '13 at 03:22
  • Is there any way to preserve the original formatting? Indentations, new lines, breaks. – Jon Le Aug 30 '13 at 22:44
  • I still don't know where the result is displayed, so I'm assuming it is a browser. You can replace each newline `"\n"` with `<br>`, and if you want multiple spaces to be preserved you can replace them with `&nbsp;`. Also check [nl2br()](http://php.net/manual/en/function.nl2br.php) – bansi Aug 31 '13 at 04:14
  • http://www.barbaraelsborg.com/new/read.php?ID=30 So, I guess I'm wondering how to store and display the original format. I could manually add in `<br>`, but `\n` isn't already included. – Jon Le Aug 31 '13 at 16:05
  • Try `echo '<pre>'; breakChapter($chapter,1000); echo '</pre>';`. This will preserve your text format, but it will be more of a headache to get things right on the page. The pages look fine to me; it would be better if you added some padding around the text. – bansi Aug 31 '13 at 16:18
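
The suggestions in these comments (escaping for the browser and turning newlines into line breaks) can be combined in one small step. This is only a sketch: the helper name renderSegment is hypothetical, and UTF-8 is assumed as the character set, which the thread never confirms:

```php
<?php
// Hypothetical helper: prepare one text segment for output in a browser.
// Assumes the text is UTF-8 encoded.
function renderSegment($segment)
{
    // Escape &, <, > and quotes so the text cannot be read as markup.
    $escaped = htmlspecialchars($segment, ENT_QUOTES, 'UTF-8');

    // Insert a <br /> tag before each newline so line breaks survive in HTML.
    return nl2br($escaped);
}
```

Each segment could then be printed with echo renderSegment($segment);. Indentation and runs of spaces would still collapse in the browser; wrapping the output in a `<pre>` element, as suggested above, is the simpler route if those must be kept.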