get original dom element innerHTML without javascript processing

Question

Background: this is for an article editor powered by TinyMCE, in an enterprise in-house CMS behind one or more large media sites.

HTML

<p>non-breaking-space: &nbsp; pound: &pound; copyright: &copy;</p>

JS

console.log($('p').html());
console.log(document.getElementsByTagName('p').item(0).innerHTML);

both return

non-breaking-space: &nbsp; pound: £ copyright: ©

when I'm expecting

non-breaking-space: &nbsp; pound: &pound; copyright: &copy;

Some entities are decoded to raw characters (pound and copyright), while others are preserved (non-breaking space). I need a way to get the original inner HTML with every entity preserved, not a version that has been processed by the browser. Is that possible?

This is for a TinyMCE plugin which processes input using jQuery and puts it back. The content is loaded from a database; the plugin only processes image tags and should not modify the text content at all. The automatic conversion of some entities back to raw characters wouldn't be too much of a problem, except that:

We cannot modify editorial's input, even if it were minor
We enforce that these must be entities before they save due to some browser compatibility issues on our sites

I would use this answer - https://stackoverflow.com/a/4404544/830171 - but I cannot, because my HTML is within a textarea that the user needs to edit and that I need to run jQuery DOM manipulation on (via the plugin).

One approach I can think of is to not use jQuery/DOM to process the image tags I need to change, but to use regex as many TinyMCE plugins do. However, since I was shot down in "regex to pull all attributes out of all meta tags" for attempting any regex on HTML, I was hoping for a better way!
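For reference, the regex route mentioned above might look roughly like the sketch below. It works on the raw HTML string, so entities in the text are never parsed or re-serialized. The names `rewriteImgTags` and `addClass`, and the `cms-image` class, are hypothetical and only there for illustration; the pattern is deliberately simple and would break on a `>` inside an attribute value.

```javascript
// Hypothetical helper: rewrite only <img ...> tags in a raw HTML string,
// leaving every other character (including entities) untouched.
function rewriteImgTags(html, transform) {
  // Matches <img ...> and <img ... />; assumes no ">" inside attribute values.
  return html.replace(/<img\b[^>]*>/gi, function (tag) {
    return transform(tag);
  });
}

// Example transform (an assumption): add a class attribute to each image tag.
function addClass(tag) {
  return tag.replace(/^<img\b/i, '<img class="cms-image"');
}

var input = '<p>pound: &pound; <img src="a.png"> copyright: &copy;</p>';
console.log(rewriteImgTags(input, addClass));
// <p>pound: &pound; <img class="cms-image" src="a.png"> copyright: &copy;</p>
```

Because the text outside the matched tags is copied through verbatim, `&pound;` and `&copy;` survive exactly as editorial entered them.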

A `console.dir` of an element with such text doesn't show any properties with the entities preserved. Even the debugger (in Chrome) shows all elements' HTML without entities preserved, so I guess you're out of luck. — pimvdb, Jan 16 '13 at 19:18

score 1 · Answer 1 · edited Dec 04 '17 at 16:54


TinyMCE uses a contenteditable iframe to edit the content. That is why `console.log($('p').html());` logs something other than what you expect.

Use the following code to get the pure editor content:

tinymce.get('your_editor_id').getBody().innerHTML
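Note that `innerHTML` is produced by the browser's HTML serializer, which in text only escapes `&`, `<`, `>` and the non-breaking space, so `£` and `©` will still come back as raw characters. If the requirement is only that these characters are entities when saved, one possible workaround is to re-encode them after reading the HTML. The sketch below is an assumption-laden illustration, not a TinyMCE API: `NAMED_ENTITIES` and `reencodeEntities` are hypothetical names, and the mapping covers only the three characters from the question.

```javascript
// Assumed mapping: only the characters mentioned in the question.
var NAMED_ENTITIES = {
  '\u00A3': '&pound;', // pound sign
  '\u00A9': '&copy;',  // copyright sign
  '\u00A0': '&nbsp;'   // non-breaking space (in case it arrives as a raw character)
};

// Hypothetical helper: replace each mapped character with its named entity.
function reencodeEntities(html) {
  return html.replace(/[\u00A3\u00A9\u00A0]/g, function (ch) {
    return NAMED_ENTITIES[ch];
  });
}

var serialized = 'non-breaking-space: &nbsp; pound: \u00A3 copyright: \u00A9';
console.log(reencodeEntities(serialized));
// non-breaking-space: &nbsp; pound: &pound; copyright: &copy;
```

This encodes every occurrence, including characters that were typed raw in the first place, so it does not recover the original source byte for byte; it only guarantees the saved output uses entities.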

edited Dec 04 '17 at 16:54 by Gogol

answered Jan 16 '13 at 16:22 by Thariama

I wouldn't focus too much on the TinyMCE part of the question; it is about how to get back the original HTML in general. This shows the same problem specific to the TinyMCE plugin - `ed.onPostProcess.add( function(ed, o) { console.log(o.content); // outputs £ console.log($('' + o.content + '').html()); // outputs £` – gingerCodeNinja Jan 16 '13 at 17:19
