Is it possible to pass HTML to a browser through JavaScript and parse it with jQuery without loading any external resources (scripts, images, Flash, anything)?

I will make do with the XML parser if that is the best I can do, but I would like to allow loose HTML if possible.

It must be compatible with Chrome, Firefox, and the latest IE.

700 Software
  • I solved the issue. You can adapt this to replace the src of any tag: http://stackoverflow.com/questions/6671461/replace-img-elements-src-attribute-in-regex – samccone Jul 12 '11 at 22:48
  • Since I do not control the source HTML, and there are too many tricky hacks, I cannot accept a regex answer. Sorry. Additionally, it makes sense that scripts should not be executed since they may decide to load external resources on their own. – 700 Software Jul 13 '11 at 00:43
  • 2021 - ten years later I'm facing the same problem. Using document.createElement will load resources like images in the background. I use a temporary DOMParser to avoid this (https://developer.mozilla.org/en-US/docs/Web/API/DOMParser). – A.J.Bauer Jul 12 '21 at 06:09
  • This is a great alternative! Answer? – 700 Software Jul 19 '21 at 22:21
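
The DOMParser idea from the 2021 comment can be sketched roughly as below. This is only an illustration under assumptions, not code from the thread: `parseInert`, the optional `parser` argument and the tag list are made-up names and choices. The document returned by `parseFromString` is separate from the live page, so scripts in it are not executed, and browsers generally do not fetch its images while it stays detached. The list of stripped tags is not exhaustive (inline event handlers and CSS `url(...)` values are not covered).

```javascript
// Hypothetical helper: parse untrusted HTML into a detached document and
// strip elements that could load or run something once moved into the page.
function parseInert(html, parser) {
  // Fall back to the browser's DOMParser when no parser is supplied.
  parser = parser || new DOMParser();
  var doc = parser.parseFromString(html, 'text/html');

  // Illustrative, non-exhaustive list of resource-loading elements.
  var risky = doc.querySelectorAll(
    'script, img, iframe, embed, object, link, style, video, audio, source'
  );
  for (var i = 0; i < risky.length; i++) {
    risky[i].parentNode.removeChild(risky[i]);
  }
  return doc;
}

// Browser-only usage; skipped where DOMParser does not exist.
if (typeof DOMParser !== 'undefined') {
  var doc = parseInert('<p>Hello <img src="http://example.com/x.png"></p>');
  console.log(doc.body.innerHTML);
}
```

The returned document can then be queried like any other, for example by wrapping `doc.body` with jQuery, and only the cleaned markup needs to be copied into the page.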

1 Answer

var html = someHTML; // the passed-in HTML, maybe $('textarea#id').val()? It is unclear what 'passed in html' means
var container = document.createElement('div');
container.innerHTML = html;
// remove elements that load or run content
$(container).find('img,embed,head,script,style').remove();
// or remove every element that has a src attribute
$(container).find('[src]').remove();

var target = someTarget; // place to put the parsed HTML
$(container).appendTo($(target));

EDIT

Tested working

var removeExt = function(cleanMe) {
    var toScrutinize = $(cleanMe).find('*'); // get ALL descendant elements
    $.each(toScrutinize, function() {
      var attr = this.attributes; // get all the attributes
      var that = $(this);
      $.each(attr, function() {
          var value = that.attr(this.nodeName) || ''; // guard against undefined values
          if (value.match(/^http/)) { // if the attribute value links externally
            that.remove(); // ...take the element out
            return false; // no need to check its remaining attributes
          }
      });
    });
    $(cleanMe).find('script').remove(); // also take out any inline scripts, only inside cleanMe
};

var html = someHTML;
var container = document.createElement('div');
container.innerHTML = html; // note: per the 2021 comment on the question, images may already start loading here
removeExt($(container));
var target = someTarget;
$(container).appendTo($(target));

This checks every attribute (src, href, link, data-foo, whatever) and removes any element that has an attribute value starting with http, so both http and https are matched. Inline scripts are removed as well. If it's still a security concern, then maybe this should be done server side, or you could obfuscate your JS.
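
A prefix test for http alone would miss protocol-relative URLs such as //cdn.example/lib.js and would also match unrelated values that merely begin with those letters. A slightly stricter check might look like the sketch below; `looksExternal` and the sample values are hypothetical and not part of the answer's tested code.

```javascript
// Hypothetical helper: true when an attribute value looks like an absolute
// http(s) URL or a protocol-relative URL.
function looksExternal(value) {
  if (typeof value !== 'string') {
    return false;
  }
  // Trim first, since URL attributes may carry leading whitespace.
  return /^(https?:)?\/\//i.test(value.trim());
}

// Illustrative sample values only.
var samples = [
  'http://a.example/x.png',
  'HTTPS://a.example/',
  '//cdn.example/lib.js',
  '/local/path.png',
  'httpd-notes'
];
samples.forEach(function (s) {
  console.log(s + ' -> ' + looksExternal(s));
});
```

This still ignores other schemes (data:, ftp: and so on) and URLs inside inline CSS, so it is a starting point rather than a complete filter.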

Kyle Macey