4
var ex = /(<script\s?[^>]*>)([\s\S]*)(<\/script>)/;
// Note: there are 2 script tags here
var str = '<script>\nvar x=0;\n</script>\n<div>\nhtml\n</div>\n<script>var y=0;\n</script>'
str.replace(ex, function(full, prefix, script, suffix) {
    return prefix + doSomething(script) + suffix;
})

But I get the wrong script content: var x=0;</script><div>..</div><script>var y=0;

What I want is two separate matches: var x=0; and var y=0;

guilin 桂林

2 Answers

20

Use a regex like the one below:

<script>([\s\S]*?)</script>

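In JavaScript (before ES2018 added the `s` flag) we cannot make `.` match newlines, so we use the `[\s\S]` character class, which matches any character, whitespace or non-whitespace, including newlines. The `?` after `*` makes the match non-greedy, so it stops at the first closing tag instead of running on to the last one and swallowing every script block in between.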
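
To make the difference concrete, here is a minimal sketch that runs both variants on the string from the question. It keeps the question's three capture groups, allows attributes in the opening tag via `[^>]*`, and adds the `g` flag so every block is visited. The `doSomething` body is a hypothetical stand-in (it only trims), since the question does not show what it does.

```javascript
// Sample input from the question: two script blocks separated by other HTML.
var str = '<script>\nvar x=0;\n</script>\n<div>\nhtml\n</div>\n<script>var y=0;\n</script>';

// Greedy: [\s\S]* runs to the LAST </script>, swallowing everything in between.
var greedy = /(<script[^>]*>)([\s\S]*)(<\/script>)/;
// Non-greedy: [\s\S]*? stops at the FIRST </script>; the g flag visits every block.
var lazy = /(<script[^>]*>)([\s\S]*?)(<\/script>)/g;

// With the greedy pattern, group 2 still contains a closing tag and the <div>.
var greedyBody = str.match(greedy)[2];

// Hypothetical stand-in for the question's doSomething(); here it just trims.
function doSomething(script) {
    return script.trim();
}

// Collect each script body while rewriting it in place.
var bodies = [];
var result = str.replace(lazy, function (full, prefix, script, suffix) {
    bodies.push(script);
    return prefix + doSomething(script) + suffix;
});

console.log(bodies.length); // 2
console.log(result);
```

Regex matching like this is only a rough tool: a `</script>` sequence inside a string literal within the script would still end the match early.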

manojlds
  • ridgerunner is also correct, but you explain what `?` does, thanks. – guilin 桂林 Apr 30 '11 at 04:15
  • @guilin 桂林 - I had already mentioned that. In this case `*?` matches 0 or more of the preceding token, but as few characters as possible before satisfying the next token. So something like `` won't match in whole. – manojlds Apr 30 '11 at 04:22
  • No need for capture groups around start and end tags. Also should allow for attributes within the start tag. Otherwise, much better without the lookbehind. +1 – ridgerunner Apr 30 '11 at 04:23
3

This function matches the contents of SCRIPT elements and returns them as an array of strings:

// Return an array of the contents of all <script> elements.
function getScriptsContents(text) {
    var scripts = [];
    var m;
    var re = /<script[^>]*>([\s\S]*?)<\/script>/ig;
    while (m = re.exec(text)) {
        scripts.push(m[1]);
    }
    return scripts;
}
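
As a usage sketch, the same pattern can also be driven by `String.prototype.matchAll` instead of an `exec` loop. This assumes an ES2020 or later runtime, which the original answer predates; the helper name `extractScriptBodies` and the sample input are hypothetical, chosen to show that attributes and uppercase tags are handled.

```javascript
// Hypothetical helper name; assumes an ES2020+ runtime for String.prototype.matchAll.
function extractScriptBodies(text) {
    // Same pattern as above: optional attributes, lazy body, case-insensitive, global.
    var re = /<script[^>]*>([\s\S]*?)<\/script>/ig;
    // matchAll yields one match array per block; index 1 is the captured body.
    return Array.from(text.matchAll(re), function (m) { return m[1]; });
}

// Sample input with an attribute on the first tag and an uppercase second tag.
var sample = '<script type="text/javascript">var x=0;</script>\n<div>html</div>\n<SCRIPT>var y=0;</SCRIPT>';
var found = extractScriptBodies(sample);

console.log(found); // [ 'var x=0;', 'var y=0;' ]
```

The `exec` loop is the more portable choice for older browsers; `matchAll` mainly saves the manual loop and the mutable `lastIndex` state.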
ridgerunner