
I am stuck again on a regular expression, this time with PHP's preg_split() function.

Here is the code:

preg_split('~("[^"]*")|[!?.।]+\s*|\R+~u', $paragraph, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

I am trying to split a paragraph into sentences, and this code does the job for me.
Here is a link to my previous question.

But now I need to keep the punctuation intact (question marks, full stops, etc.).

Using PREG_SPLIT_DELIM_CAPTURE was supposed to do that, but somehow it is not working that way. I get only the sentences, without the full stops or question marks.

Alexandre Elshobokshy
Prashanth Benny

1 Answer

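Your requirement doesn't call for capturing the punctuation with PREG_SPLIT_DELIM_CAPTURE. That flag is helpful when you need the delimiters returned as individual matches. In this case you need \K, which resets the start of the reported match, so the punctuation matched before it is no longer part of the delimiter and stays attached to the sentence: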

<?php

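// Delimiters: a quoted string (captured, so it is returned), the whitespace
// that follows sentence punctuation (\K keeps the punctuation), or line breaks.
var_dump(preg_split('~("[^"]*")|[!?.।]+\K\s*|\R+~u', <<<STR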
hello! how are you? how is life
live life, live free. "isnt it?"
STR
, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY));

Output:

array(5) {
  [0]=>
  string(6) "hello!"
  [1]=>
  string(12) "how are you?"
  [2]=>
  string(11) "how is life"
  [3]=>
  string(21) "live life, live free."
  [4]=>
  string(10) ""isnt it?""
}
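
To see why the flag alone did not help, here is a minimal sketch comparing the two approaches on a shortened input. The variable names and the simplified patterns (no quote or line-break handling) are only for illustration. PREG_SPLIT_DELIM_CAPTURE returns only what a capturing group in the delimiter matched, and in the original pattern the punctuation class is not inside a group. Wrapping it in a group does return the punctuation, but as separate array elements, whereas \K leaves it attached to the sentence.

```php
<?php

$text  = 'hello! how are you? how is life';
$flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;

// Punctuation in a capturing group: it is returned, but as its own element.
$captured = preg_split('~([!?.।]+)\s*~u', $text, -1, $flags);
// ["hello", "!", "how are you", "?", "how is life"]

// \K resets the match start, so only the whitespace acts as the delimiter
// and the punctuation remains part of the preceding piece.
$kept = preg_split('~[!?.।]+\K\s*~u', $text, -1, $flags);
// ["hello!", "how are you?", "how is life"]

var_dump($captured, $kept);
```

If you want each sentence as a single string, the \K form avoids having to glue the pieces back together afterwards.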
revo