11

I use nicEdit to write rich text data in my CMS. The problem is that it generates strings like this:

hello first line<br><br />this is a second line<br />this is a 3rd line

Since this is for a news site, I would much prefer the final HTML to look like this:

<p>hello first line</p><p>this is a second line<br />this is a 3rd line</p>

So my current plan is this:

  1. Trim any <br /> from the start and end of $data.
  2. Replace every run of 2 or more <br /> with </p><p> (a single <br /> is allowed).
  3. Finally, add <p> at the start and </p> at the end.

I only have steps 1 and 3 so far. Can someone give me a hand with step 2?

function replace_br($data) {
 # step 1
 $data = trim($data,'<p>');
 $data = trim($data,'</p>');
 $data = trim($data,'<br />');
 # step 2 ???
 // preg_replace() ?
 # step 3
 $data = '<p>'.$data.'</p>';
 return $data;
}

Thanks!

PS: It would be even better to handle extreme cases too. Example: "hello<br /><br /><br /><br /><br />too much space" -- those 5 line breaks should also be converted to a single "</p><p>".

Final solution (special thanks to kemp!):

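function sanitize_content($data) {
    // keep only the tags the editor is allowed to produce
    $data = strip_tags($data,'<p><br><img><a><strong><u><em><blockquote><ol><ul><li><span>');
    // trim() takes a list of characters, not a substring, so use a regex
    // to strip <p>, </p>, <br /> and whitespace from both ends
    $data = preg_replace('#^(?:\s|</?p>|<br\s*/?>)+|(?:\s|</?p>|<br\s*/?>)+$#i','',$data);
    // collapse every run of 2 or more <br /> into a paragraph break
    $data = preg_replace('#(?:<br\s*/?>\s*?){2,}#','</p><p>',$data);
    $data = '<p>'.$data.'</p>';
    return $data;
}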
Andres SK

3 Answers

17

This will work even if the two <br>s are on different lines (i.e. there is a newline or any whitespace between them):

function replace_br($data) {
    $data = preg_replace('#(?:<br\s*/?>\s*?){2,}#', '</p><p>', $data);
    return "<p>$data</p>";
}
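
To see what the pattern does, here is a small check against the two sample strings from the question plus one made-up input with a newline between the breaks. The $pattern, $samples and $results names are only for illustration.

```php
<?php
// Same pattern as in the function above.
$pattern = '#(?:<br\s*/?>\s*?){2,}#';

$samples = [
    // sample from the question: one double break, one single break
    'hello first line<br><br />this is a second line<br />this is a 3rd line',
    // five breaks in a row should collapse into one paragraph break
    'hello<br /><br /><br /><br /><br />too much space',
    // made-up input: the two breaks are separated by a newline
    "a<br>\n<br>b",
];

$results = [];
foreach ($samples as $sample) {
    $results[] = '<p>' . preg_replace($pattern, '</p><p>', $sample) . '</p>';
}

print_r($results);
```

A single <br /> is left alone because the {2,} quantifier needs at least two breaks in a row before anything is replaced.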
Matteo Riva
3

This approach will solve your problem:

  1. Split the string on <br> or <br />: you'll get an array of strings.
  2. Create a new string, starting with <p>.
  3. Loop over the array from step 1, from the beginning, and remove every empty entry until you reach one that is not empty (then break).
  4. Same as 3, but from the end of the array towards the beginning.
  5. Loop over the array from step 1 again, keeping an integer counter A (initially 0) of the empty entries seen since the last non-empty entry.
    1. If the entry is empty, increment A and continue the loop.
    2. If the entry is not empty and is not the first entry:
      1. If A is 0 (a single break), append a <br>.
      2. If A is 1 or more (two or more consecutive breaks), append </p><p>.
    3. Append the content of the current entry (which is not empty).
    4. Set A to 0.
  6. Append </p>.
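
The steps above could be sketched roughly as follows. This is only an illustration: the function name br_to_paragraphs is hypothetical, and treating whitespace-only entries as empty is an assumption. The counter counts the empty entries between two non-empty ones, so one empty entry already corresponds to two consecutive breaks.

```php
<?php
// Sketch of the split-and-count approach; br_to_paragraphs() is a hypothetical name.
function br_to_paragraphs(string $data): string
{
    // Step 1: split on <br>, <br/> or <br /> (case-insensitive).
    $parts = preg_split('#<br\s*/?>#i', $data);

    // Steps 3 and 4: drop empty (or whitespace-only) entries at both ends.
    while ($parts && trim($parts[0]) === '') {
        array_shift($parts);
    }
    while ($parts && trim(end($parts)) === '') {
        array_pop($parts);
    }

    $out = '<p>';     // step 2
    $empties = 0;     // the counter "A": empty entries since the last non-empty one
    $first = true;
    foreach ($parts as $part) {
        if (trim($part) === '') {
            $empties++;
            continue;
        }
        if (!$first) {
            // no empty entry in between means a single break; otherwise start a new paragraph
            $out .= $empties >= 1 ? '</p><p>' : '<br />';
        }
        $out .= $part;
        $empties = 0;
        $first = false;
    }

    return $out . '</p>';   // step 6
}

echo br_to_paragraphs('hello first line<br><br />this is a second line<br />this is a 3rd line'), "\n";
```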

A different approach: using Regular Expressions

(<br ?/?>){2,}

This matches 2 or more consecutive <br> tags. (See preg_split on php.net for how to split on it.)

Now apply the same idea as in steps 3 and 4: loop over the array twice, once from the beginning up (0..length) and once from the end down (length-1..0). If the entry is empty, remove it from the array. If the entry is not empty, quit the loop.

To do this:

// use # as the delimiter, because the pattern itself contains a /
$array = preg_split('#(?:<br ?/?>\s*){2,}#i', $string);

// remove empty entries from the beginning
while ($array && $array[0] === '') {
    array_shift($array);
}

// remove empty entries from the end
while ($array && end($array) === '') {
    array_pop($array);
}

// the separator comes first in implode()
$newString = '<p>' . implode('</p><p>', $array) . '</p>';
Pindatjuh
  • Actually, it would be even better if there was a way to find a string with 2 or more <br />
    -- I'm thinking of preg_replace but still don't have an idea of how to continue.
    – Andres SK Jun 02 '10 at 17:00
  • The first approach also handles those. The second approach is easier to implement, but the question is whether you like to use RegEx on HTML (some people don't like that approach). – Pindatjuh Jun 02 '10 at 17:05
  • Thanks for the pattern, but I think something is wrong. I'm using: $data = preg_replace('(<br ?/?>){2,}','aaa',$data);
    and it returns null. Why? (I'm using "aaa" to make it more visible once applied.)
    – Andres SK Jun 02 '10 at 17:11
  • Because you use `preg_replace`; you may of course use that, but it will not work in the situation I sketched. I've also added some code. – Pindatjuh Jun 02 '10 at 17:13
0

I think this should work for step #2 unless I am not understanding your scenario completely:

$string = str_replace( '<br><br>', '</p><p>', $string );
$string = str_replace( '<br /><br />', '</p><p>', $string );
$string = str_replace( '<br><br />', '</p><p>', $string );
$string = str_replace( '<br /><br>', '</p><p>', $string );
OneNerd
  • Thanks for the idea, but it is too basic. I need a more advanced approach. Check out the final solution at the top. – Andres SK Jun 02 '10 at 17:57