I have a string that comes dynamically from another document, as follows:

"<!DOCTYPE html>
<html dir="ltr"><head><title>Preview</title></head>
<body>
<p>test</p>
<p><img alt="" height="299" src="http://172.0.0.1/Administration/YDImages/cap.JPG" width="696"></p>
</body>
</html>"

I use this string as follows:

var html = stringAbove; 
var reg = html.match(/<body[^>]*>(.*)<\/body>/);
var newDocument = "<p>My new Texts and styles</p>"; //replace inside body with my new code
var newer = html.replace(reg[1],newDocument);
doc.write(newer);

I've discovered that html.match returns null when the string in the html variable is as shown above. While debugging in the developer tools to see how I could make this regex work, I changed the string's starting and ending double quotes to single quotes, and it worked. Then I changed all the double quotes to single quotes and tried the regex again, but it didn't work. Please help me get this regex to work properly.

ibubi
  • I suggest not using regex to parse HTML. [Here](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) is [why](http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not) – George Apr 03 '17 at 13:14
  • Thanks for the info, I will research what I can do. – ibubi Apr 03 '17 at 13:41

1 Answer

You can try something like this:

/<body>(.*?)(?=<\/body>)/

This starts matching at <body> and stops at the last character before </body>, so the closing tag itself is not part of the match.

Also, since you are receiving an HTML string, there will not be multiple bodies, hence the use of match[0].

var s = '<!DOCTYPE html><html dir="ltr"><head><title>Preview</title></head><body><p>test</p><p><img alt="" height="299" src="http://172.0.0.1/Administration/YDImages/cap.JPG" width="696"></p></body></html>';

var regex = /<body>(.*?)(?=<\/body>)/;

var match = s.match(regex)
console.log(match)
var html = match[0].replace("<body>", "")

document.querySelector('.content').innerHTML = html

img {
  border: 1px solid gray;
}

<div class="content"></div>
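
One thing worth checking: the string in the question spans several lines, while the sample string above sits on a single line. In JavaScript, `.` does not match line terminators, so both the original pattern and the one above return null when there is a line break between <body> and </body>. The following is only a sketch under that assumption. The sample string and the helper name `replaceBody` are made up for illustration, and `[\s\S]` stands in for `.` so that line breaks are matched too.

```javascript
// Hypothetical stand-in for the string received from the other document;
// note the line breaks inside <body>.
const html = [
  '<!DOCTYPE html>',
  '<html dir="ltr"><head><title>Preview</title></head>',
  '<body>',
  '<p>test</p>',
  '</body>',
  '</html>'
].join('\n');

// `.` does not match line terminators, so this pattern finds nothing here.
const dotMatch = html.match(/<body[^>]*>(.*)<\/body>/);

// [\s\S] matches any character, including line breaks.
const bodyPattern = /(<body[^>]*>)[\s\S]*?(<\/body>)/;

// replaceBody is a hypothetical helper name, not part of any library.
function replaceBody(source, newContent) {
  // A replacer function avoids special handling of `$` inside newContent.
  return source.replace(bodyPattern, (whole, open, close) => open + newContent + close);
}

const newer = replaceBody(html, '<p>My new Texts and styles</p>');
console.log(dotMatch); // null
console.log(newer);
```

Replacing through the pattern itself, rather than through `html.replace(reg[1], ...)`, also avoids a second search for the captured text. As the comment on the question points out, a real HTML parser is still the more robust choice if the markup gets any more complicated.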
Rajesh
  • Thanks for the post. My problem is that the `s` variable comes from another document as a string. I get this string and apply the same code as in your post, but match is `null`. I don't know why, but if I change the wrapping quotes of the variable to single quotes at runtime (while debugging), the match occurs. – ibubi Apr 03 '17 at 13:40
  • @ibubi Try logging raw string that you are getting. If it starts with `"`, then string would be only `" – Rajesh Apr 03 '17 at 13:46
  • @ibubi Also, if the issue is with your HTML string and your code works fine, just drop a comment and I'll remove the answer, as it's not required. – Rajesh Apr 03 '17 at 13:48
  • Exactly, the string starts with `"` and the code works fine. How can I get the complete HTML string without it being broken? – ibubi Apr 03 '17 at 13:51
  • @ibubi For that we will need to see how you are fetching the HTML string right now. You can try `html.replace(/"/g, '\\"')` to escape it, but we can help better if that piece of code is available. You can also check [encodeURI](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI) – Rajesh Apr 03 '17 at 13:53
  • This string comes from the parent window (a variable there), which I am not allowed to change. I get the string via `var html = window.opener.htmlToLoad` – ibubi Apr 03 '17 at 13:58
  • Try `var html = window.opener.htmlToLoad.replace(/"/g, '\\"')` – Rajesh Apr 03 '17 at 14:01
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/139778/discussion-between-ibubi-and-rajesh). – ibubi Apr 03 '17 at 14:06