So basically I have an extremely large string, and I would just like to keep the first 4 words of it.

I ALMOST had this working, although there are some cases that break it.

Here is my current code:

$title = "blah blah blah, long paragraph goes here";
//Make title only have first 4 words
$pieces = explode(" ", $title);
$first_part = implode(" ", array_splice($pieces, 0, 4));
$title = $first_part;
//title now has first 4 words

The main cases that break it are line-breaks. If I have a paragraph like this:

Testing one two three
Testing2 a little more three two one

The $title would be equal to Testing one two three Testing2

Another example:

Testing
test1
test2
test3
test4
test5
test6
sdfgasfgasfg fdgadfgafg fg

The $title would be equal to Testing test1 test2 test3 test4 test5 test6 sdfgasfgasfg fdgadfgafg fg

For some reason it is grabbing the first word on the next line as well.

Does anyone have any suggestions on how to fix this?

Fizzix

3 Answers

Try this:

function first4words($s) {
    // {3} words plus one more = 4 words; the "s" flag lets .* consume line breaks too
    return preg_replace('/((\w+\W*){3}(\w+))(.*)/s', '${1}', $s);
}

https://stackoverflow.com/a/965343/2701758

Community

It might be a bit hacky but I would try just using a str_replace() to get rid of any line-breaks.

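// Double quotes are required so PHP interprets \r and \n as line breaks
$titleStripped = str_replace(array("\r\n", "\r", "\n"), ' ', $title);
$pieces = explode(' ', $titleStripped);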

Depends on your application and expected data though. If you're expecting more than line breaks, go with a preg_replace. Either way, prep the data before exploding.
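A minimal sketch of that "prep the data first" idea, assuming any run of whitespace (spaces, tabs, line breaks) should count as a single separator. The function name firstWords and its $count parameter are made up for illustration; they are not from the question.

```php
<?php
// Hypothetical helper: keep only the first $count words of $text.
function firstWords($text, $count = 4) {
    // Collapse every run of whitespace (spaces, tabs, \r, \n) into one space
    // and drop leading/trailing whitespace, so explode() sees clean input.
    $normalized = trim(preg_replace('/\s+/', ' ', $text));
    if ($normalized === '') {
        return '';
    }
    // array_slice() copies the first $count pieces without modifying $pieces.
    $pieces = explode(' ', $normalized);
    return implode(' ', array_slice($pieces, 0, $count));
}

$title = "Testing one two three\nTesting2 a little more three two one";
echo firstWords($title); // Testing one two three
```

Because the whitespace is collapsed before exploding, several line breaks in a row do not produce empty "words".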

Matt Berg
  • Good idea, although the data can be basically anything. Could even have 5 line-breaks before the next word. – Fizzix Sep 20 '13 at 00:49
  • Actually, this could work quite well if I use my method, but then replace the line-breaks with spaces, then use my method again – Fizzix Sep 20 '13 at 00:54
  • Ended up using your logic for my answer: I used a `preg_replace` that replaces any line breaks. For those who want to know how I solved it, I placed this at the beginning of my code: `$title = preg_replace( "/\r|\n/", " ", $title);` – Fizzix Sep 20 '13 at 00:58

Try this (untested code):

//--- remove linefeeds (double quotes so PHP interprets \r and \n)
$titleStripped = str_replace(array("\r\n", "\r", "\n"), ' ', $title);
//--- strip out multiple spaces caused by above line
$titleStripped = preg_replace('/ {2,}/', ' ', $titleStripped);
//--- make it an array
$pieces = explode( ' ', $titleStripped );
//--- get the first 4 words
$first_part = implode(" ", array_splice($pieces, 0, 4));
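
If the intermediate stripped string is not needed, the replace-then-explode steps can probably be folded into a single split. This is only a sketch and has not been run against the asker's real data; firstFourWords is a hypothetical name. It relies on preg_split() with the PREG_SPLIT_NO_EMPTY flag, which drops the empty pieces that leading, trailing, or repeated whitespace would otherwise produce.

```php
<?php
// Hypothetical helper: split on whitespace runs and keep the first four words.
function firstFourWords($text) {
    // \s+ matches spaces, tabs, \r and \n, so line breaks separate words too.
    // PREG_SPLIT_NO_EMPTY discards empty pieces (e.g. from leading whitespace).
    $pieces = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return implode(' ', array_slice($pieces, 0, 4));
}

$title = "Testing\ntest1\ntest2\ntest3\ntest4\ntest5\ntest6\nsdfgasfgasfg fdgadfgafg fg";
echo firstFourWords($title); // Testing test1 test2 test3
```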