14

How can I remove the //<![CDATA[ and //]]> marker lines inside a script element?

<script type="text/javascript">
    //<![CDATA[
    var l=new Array();
    ..........................
    ..........................
    //]]>
</script>

It looks like it can be done with preg_replace(), but I haven't found a solution that works for me.

What regex would I use?

alex
bomanden
  • Just curious why you want to remove those two lines? – Jonathan M Nov 27 '11 at 04:16
  • bomanden: @JonathanM is right, you may not need to remove these elements. See [When is a CDATA section necessary within a script tag?](http://stackoverflow.com/questions/66837/when-is-a-cdata-section-necessary-within-a-script-tag) and [Is CDATA really necessary?](http://stackoverflow.com/questions/4215261/is-cdata-really-necessary). Think it over. – Tadeck Nov 27 '11 at 04:28
  • OK, it's just that the JavaScript doesn't fire, so the code is not executed. It does when I use Alan's solution. But thanks for the info. – bomanden Nov 27 '11 at 09:11

8 Answers

21

You don't need regex for a static string.

Replace those parts of the text with nothing:

$string = str_replace("//<![CDATA[","",$string);
$string = str_replace("//]]>","",$string);
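
As a rough end-to-end sketch of the two calls above, the snippet below runs them on a sample script element. The helper name `stripCdataMarkers` and the sample markup are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper name, for illustration only.
function stripCdataMarkers(string $html): string
{
    // Plain substring replacement; no regex engine is involved.
    $html = str_replace("//<![CDATA[", "", $html);
    $html = str_replace("//]]>", "", $html);
    return $html;
}

// Sample input modeled on the question (nowdoc, so nothing is interpolated).
$html = <<<'HTML'
<script type="text/javascript">
    //<![CDATA[
    var l = new Array();
    //]]>
</script>
HTML;

$clean = stripCdataMarkers($html);
echo $clean;
```

Note that only the markers themselves are removed, so the two lines that held them remain as whitespace-only lines.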
dimme
13

The following regex will do it...

$removed = preg_replace('/^\s*\/\/<!\[CDATA\[([\s\S]*)\/\/\]\]>\s*\z/', 
                        '$1', 
                        $scriptText);
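
Because the pattern is anchored with ^ and \z, it appears to expect only the contents of the script element, not the surrounding markup. The sketch below illustrates that difference; the sample strings are assumptions, and whether this is why it failed for the asker is not stated in the thread.

```php
<?php
$pattern = '/^\s*\/\/<!\[CDATA\[([\s\S]*)\/\/\]\]>\s*\z/';

// Case 1: only the text between <script> and </script>.
$scriptText = "\n    //<![CDATA[\n    var l = new Array();\n    //]]>\n";
$removed = preg_replace($pattern, '$1', $scriptText);

// Case 2: the whole element. The subject starts with "<script", so the
// anchored pattern cannot match and the string comes back unchanged.
$page = "<script type=\"text/javascript\">" . $scriptText . "</script>";
$unchanged = preg_replace($pattern, '$1', $page);

var_dump(trim($removed));       // the JavaScript without the markers
var_dump($unchanged === $page); // true: nothing was replaced
```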

CodePad.

alex
  • Hi Alex, unfortunately not. I don't know why, but I got Alan's working. Perhaps you can see the difference between the two solutions. Thank you for your input. – bomanden Nov 27 '11 at 19:31
6

If you must...

$s = preg_replace('~//<!\[CDATA\[\s*|\s*//\]\]>~', '', $s);

This will remove the whole line containing each tag without messing up the indentation of the enclosed code.
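
A small sketch of that behavior, assuming a sample input modeled on the question: the first alternative consumes the opening marker plus the whitespace after it, and the second consumes the whitespace before the closing marker plus the marker itself.

```php
<?php
// Sample input (an assumption for illustration).
$s = "<script type=\"text/javascript\">\n"
   . "    //<![CDATA[\n"
   . "    var l = new Array();\n"
   . "    //]]>\n"
   . "</script>";

$s = preg_replace('~//<!\[CDATA\[\s*|\s*//\]\]>~', '', $s);

echo $s;
```

The indentation in front of the opening marker is left in place and ends up in front of the first code line, which is why the enclosed code keeps its indentation.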

Alan Moore
3

If the CDATA section contains HTML special characters (e.g. &, ", ', <, >) and you will keep treating the rest of the string as XML, you should escape those characters. Otherwise the XML becomes invalid.

function removeCDataFromString(string $string)
{
    return preg_replace_callback(
        // Lazy (.*?) keeps separate CDATA sections apart;
        // the s modifier lets . match newlines inside a section.
        '~<!\[CDATA\[(.*?)\]\]>~s',
        function (array $matches) {
            return htmlspecialchars($matches[1], ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        },
        $string
    );
}
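
The choice between a greedy (.*) and a lazy (.*?) matters once a string holds more than one CDATA section. The sketch below compares the two; the helper name `escapeCData` and the sample XML are assumptions for illustration.

```php
<?php
// Hypothetical helper: unwrap CDATA sections matched by $pattern and
// escape their contents so the result stays valid XML.
function escapeCData(string $xml, string $pattern): string
{
    return preg_replace_callback(
        $pattern,
        function (array $m) {
            return htmlspecialchars($m[1], ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
        },
        $xml
    );
}

$xml = '<a><![CDATA[x & y]]></a><b><![CDATA[1 < 2]]></b>';

// Greedy: one match runs from the first "<![CDATA[" to the last "]]>",
// so the markup between the two sections gets escaped as well.
$greedy = escapeCData($xml, '~<!\[CDATA\[(.*)\]\]>~');

// Lazy with the s modifier: each section is matched on its own.
$lazy = escapeCData($xml, '~<!\[CDATA\[(.*?)\]\]>~s');

echo $greedy, "\n", $lazy, "\n";
```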
pulzarraider
2

You can also try,

$s=str_replace(array("//<![CDATA[","//]]>"),"",$s);
Rohan Kumar
1

Use str_replace() instead of preg_replace(); it's a lot easier.

$var = str_replace('<![CDATA[', '', $var);
// Match the full closing delimiter so a plain "]]" in the script is left alone.
$var = str_replace(']]>', '', $var);
echo $var;
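
When stripping the closing delimiter, it helps to match the full ]]> rather than a bare ]], because nested array indexing in JavaScript also produces ]]. A short sketch, with a made-up sample string, shows the difference:

```php
<?php
// Sample input (an assumption): the script itself contains "]]".
$js = '<![CDATA[var x = a[b[0]];]]>';

// Removing every "]]" also damages the nested index expression.
$short = str_replace(']]', '', str_replace('<![CDATA[', '', $js));

// Removing only the complete delimiters leaves the code intact.
$full = str_replace(array('<![CDATA[', ']]>'), '', $js);

var_dump($short);
var_dump($full);
```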
Raptor
Siddharth
0

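I use this to extract the contents of <![CDATA[...]]>. It works for me on a single-line string; I don't know whether it works on a multi-line string.

// Stop at the closing "]]>" rather than at the first "]" in the content.
preg_match_all('/CDATA\[(.*?)\]\]>/', $your_string_before_this, $datas);
$string_result_after_this = $datas[1][0];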
Fthr
0
// Remove the <![CDATA[ and ]]> delimiters from a node's text.
// (Add "public" back if you place this inside a class.)
function removeCdataFormat($nodeText)
{
    $regex_patterns = array(
        '/<!\[CDATA\[/',
        '/\]\]>/'
    );
    $regex_replace = array('', '');
    return trim(preg_replace($regex_patterns, $regex_replace, $nodeText));
}

$nodeText = '<![CDATA[some text]]>';
$text = removeCdataFormat($nodeText); // "some text"
duttyman