3

I found this code which will match at most 300 chars, then break at the next nearest word-break:

 $var = 'This is a test text 1234567890 test check12.'; # 44 chars
 preg_match('/^.{0,300}(?:.*?)\b/iu', $var, $matches);
 echo $matches[0];

44 is lower than 300, so I expect the output to be the same as $var.

But the output is:

 This is a test text 1234567890 test check12   # 43 chars

$matches[0] does not include the dot at the end, although $var does. Can anyone tell me how to get the full string (with the dot)?

creativz

4 Answers

2

I could get the expected result by either:

  • Removing the \b, or
  • Replacing \b with $

EDIT:

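In your pattern the \b has to match at a word boundary, and there is none after the dot at the end of the string (a non-word character followed by the end of the string). The regex therefore backtracks to the boundary between the 2 and the dot, so you match everything before the dot. If you put a .* after the \b, you'll see that it will match the dot.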

See this for more info on how word boundaries in regex work.
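
A minimal sketch comparing the original pattern with the two variants, assuming the sample string from the question. The two variants are not equivalent for inputs longer than 300 characters: without \b the match is cut at exactly 300 characters, possibly mid-word, and with $ the lazy .*? runs on to the end of the line.

```php
<?php
// Sample string from the question (44 characters, ends in a dot).
$var = 'This is a test text 1234567890 test check12.';

// Original pattern: the trailing \b forces the match to end at a word boundary.
preg_match('/^.{0,300}(?:.*?)\b/iu', $var, $withBoundary);

// Variant 1: \b removed.
preg_match('/^.{0,300}(?:.*?)/iu', $var, $noBoundary);

// Variant 2: \b replaced with $ (end of string).
preg_match('/^.{0,300}(?:.*?)$/iu', $var, $endAnchor);

echo strlen($withBoundary[0]), "\n"; // 43, the dot is missing
echo strlen($noBoundary[0]), "\n";   // 44
echo strlen($endAnchor[0]), "\n";    // 44
```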

codaddict
2

Using preg_match to break at 300 chars seems like a bad idea. Why don't you just use:

substr($var, 0, strpos($var, ' ', 300));

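That will give you the first 300 chars, extended to the next space, without using regular expressions. It assumes the string is longer than 300 chars and contains a space after that point; otherwise strpos() does not return a usable position.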
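
A guarded sketch of this approach, for illustration only. The helper name truncateAtNextSpace and its $limit parameter are hypothetical and not part of the answer. It counts bytes, as substr() and strpos() do; for multibyte text the mb_* equivalents would be the likely substitute.

```php
<?php
// Hypothetical helper (name is illustrative, not from the answer):
// cut $text at the first space at or after byte offset $limit.
function truncateAtNextSpace(string $text, int $limit = 300): string
{
    // Not longer than the limit: nothing to cut. This also avoids calling
    // strpos() with an offset beyond the end of the string.
    if (strlen($text) <= $limit) {
        return $text;
    }

    $pos = strpos($text, ' ', $limit);

    // No space after the limit: keep the whole string instead of
    // passing false to substr() as the length.
    if ($pos === false) {
        return $text;
    }

    return substr($text, 0, $pos);
}

echo truncateAtNextSpace('This is a test text 1234567890 test check12.'), "\n";
echo truncateAtNextSpace('one two three four', 5), "\n"; // one two
```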

thetaiko
1
'/^.{300}(?:.*?)\b|^.{0,300}/u'

I'm not sure why you want this though. Here is my answer to a similar question, but cutting at the previous nearest space.
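
A small sketch of how such an alternation behaves, assuming the second alternative is meant to be ^.{0,300} (a fallback for strings shorter than the limit). The helper name cutAtNextWordBreak and the $limit parameter are hypothetical; $limit stands in for the 300 so the effect is visible on a short string.

```php
<?php
// Hypothetical helper: $limit stands in for the 300 in the pattern.
function cutAtNextWordBreak(string $text, int $limit): string
{
    // First alternative: exactly $limit characters, then lazily on to the
    // next word boundary. Second alternative: fallback for shorter strings.
    $pattern = '/^.{' . $limit . '}(?:.*?)\b|^.{0,' . $limit . '}/u';
    preg_match($pattern, $text, $matches);

    // preg_match() leaves $matches empty on failure (e.g. invalid UTF-8).
    return $matches[0] ?? '';
}

$var = 'This is a test text 1234567890 test check12.';

echo cutAtNextWordBreak($var, 300), "\n"; // whole string, dot included
echo cutAtNextWordBreak($var, 12), "\n";  // This is a test
```

With a limit of 12 the cut would fall inside the second "test", so the lazy group extends the match to the end of that word.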

Mark Byers
0

In your

(?:.*?)

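You can get rid of it, I think. The * means zero or more times (not at least once), and the trailing ? makes it lazy, so here the group matches nothing. Note that your period is dropped by the trailing \b, not by this group.

To be honest, I would just use this pattern, which breaks at the last word boundary within the first 300 chars (it still ends in \b, so a trailing period is not included):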

 preg_match('/^(.){0,300}\b/iu', $var, $matches);
Layke