1

I have this HTML content:

<!DOCTYPE html>
<html>
<head>
</head>
<body>
<center><img style="margin-left: auto; display: block; margin-right: auto;" src="../img/escudocaba.png" alt="#" width="470" height="90" /></center>
<h1>Hoddda</h1>
</body>
</html>

I need to extract the content of the body without including the body tags. I wrote this regex, which matches perfectly:

/<body[^>]*>(.*?)<\/body>/is

as you can check on this website:

https://regex101.com/

But when I use it in my code:

var bodyHtml=$editor.val().match( /<body[^>]*>(.*?)<\/body>/is);

I get no results. I also tried a similar regex, in the same format but without modifiers. It did match, but it did not work out in the end:

   var bodyHtml=$editor.val().match(/(?:.(?!<\s*body[^>]*>))*\>.+/);

For example, it returned only:

<center><img style="margin-left: auto; display: block; margin-right: auto;" src="../img/escudocaba.png" alt="#" width="470" height="90" /></center>

How can I use regex modifiers with this jQuery function in this case? Thanks.

JAF
  • You can simply do `$('body').html()` to get the contents of the body without the body tag. – DinoMyte Feb 10 '16 at 18:39
  • Is that... is that a `center` tag? What year is it?!? – faino Feb 10 '16 at 18:53
  • Oh yeah, and this: http://stackoverflow.com/a/1732454/1232175, just use `$('body').html();` as Dino suggested. – faino Feb 10 '16 at 18:55
  • That's no good... I need the body of the editor canvas, not the body of the whole page. – JAF Feb 11 '16 at 13:48
  • The HTML structure I was talking about (the one inside a textarea with id="#plantillaEditor") is itself inside the main HTML of the whole page. – JAF Feb 11 '16 at 13:57

2 Answers

1

In the JavaScript flavour of regex, the dot (.) doesn't match newlines.

To work around that, you can use [^], [\s\S], [\d\D], (?:.|\n), and so on.

Try this code:

var bodyHtml = $editor.val().match(/<body[^>]*>([\s\S]*?)(<\/body>|$)/i)[1];
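To make the difference concrete, here is a small sketch that runs without jQuery. The `html` string is an assumption standing in for `$editor.val()`, built from the markup in the question. It compares the dot-based pattern with the `[\s\S]` version and shows why the `[1]` index matters.

```javascript
// Stand-in for $editor.val(): the markup from the question as one multi-line string.
const html = [
  '<!DOCTYPE html>',
  '<html>',
  '<head>',
  '</head>',
  '<body>',
  '<center><img style="margin-left: auto; display: block; margin-right: auto;" src="../img/escudocaba.png" alt="#" width="470" height="90" /></center>',
  '<h1>Hoddda</h1>',
  '</body>',
  '</html>'
].join('\n');

// Without a dot-all mode, "." cannot cross the "\n" right after <body>,
// so the lazy group never reaches </body> and match() returns null.
const dotResult = html.match(/<body[^>]*>(.*?)<\/body>/i);

// [\s\S] matches any character, newlines included.
const anyCharResult = html.match(/<body[^>]*>([\s\S]*?)(<\/body>|$)/i);

console.log(dotResult);        // null
console.log(anyCharResult[0]); // whole match, <body> tags included
console.log(anyCharResult[1]); // capture group 1: only what sits between the tags
```

Index `[0]` is the full match, so it still carries the body tags; index `[1]` is the first capture group, which is the part the question asks for.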
  • This doesn't work; it returns "undefined". – JAF Feb 11 '16 at 13:52
  • @JAF. I suspect just `$editor.find('body').html()` should work –  Feb 11 '16 at 14:11
  • It's not working, my friend. If I run console.log(($('textarea').val())); I get
    #

    Hodddaaa

    – JAF Feb 11 '16 at 14:19
  • @JAF. Last try: `$($editor.val()).find('body').html()`. Otherwise, just replace the `.` in the regex with one of the options I listed in my answer. –  Feb 11 '16 at 14:34
  • That didn't work, I already tried that. Regarding replacing the dot, what are you referring to? What would the final expression be? – JAF Feb 11 '16 at 14:41
  • @JAF. I see the problem now... when you try to build the string as DOM, it loads everything, including the images' sources. I would stick with regex and maybe use `/<body[^>]*>([\s\S]*?)(<\/body>|$)/` –  Feb 11 '16 at 15:03
  • Almost... if I use that regex, the result is
    #

    Hodddaaa

    I need the text without the body tags
    – JAF Feb 11 '16 at 15:49
  • @JAF. I've updated my answer –  Feb 11 '16 at 16:33
  • This is the correct answer, thank you very much, my friend. – JAF Feb 11 '16 at 16:41
0

The s-modifier doesn't exist in JS. Try using [\s\S] instead of the dot, as suggested here: Javascript regex multiline flag doesn't work

Like this:

<body[^>]*>([\s\S]*?)<\/body>
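As a rough sketch of how that pattern could be used safely: `extractBody` and `sample` below are hypothetical names, not part of either answer. The helper guards against `match()` returning null, which would otherwise throw when indexing `[1]`. The last lines assume an engine that implements ES2018, where the `s` (dotAll) flag was added to JavaScript; on older engines a regex literal with that flag is a syntax error.

```javascript
// Hypothetical helper: returns the markup between <body ...> and </body>, or null.
function extractBody(html) {
  const match = html.match(/<body[^>]*>([\s\S]*?)<\/body>/i);
  // match is null when the string has no <body>...</body> pair
  return match ? match[1] : null;
}

// Made-up multi-line sample for illustration.
const sample = '<html>\n<body class="x">\n<h1>Hoddda</h1>\n</body>\n</html>';

console.log(extractBody(sample));           // the h1 line, with its surrounding newlines
console.log(extractBody('<p>no body</p>')); // null

// Assumes ES2018 support: with the "s" flag the dot also matches newlines,
// so the pattern from the question gives the same capture group.
const withDotAll = sample.match(/<body[^>]*>(.*?)<\/body>/is);
console.log(withDotAll[1] === extractBody(sample)); // true
```

If older engines have to be supported, the `[\s\S]` form is the safer choice, since it does not depend on any flag.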
Neutrosider