
I'm using an extremely ugly function in jQuery to listen on a paste event and remove all extraneous HTML formatting from the paste. Unfortunately, what I have now is overly strict, on top of just being butt-ugly.

I want to improve this regex to allow for the same HTML I'm already allowing inside the WYSIWYG editor. This means I would like to have <b>, <i>, <br>, and <a> tags allowed.

I do not know enough regex to do this myself, and would be very appreciative to see this improved.

$('iframe').ready(function() {
  $(this).contents().find('.wysiwyg').find('iframe').contents()
  .find('.wysiwyg').bind('paste', function() {
    var el = $(this);
    setTimeout(function() {
        var strClean = el.text().replace(/<\/?[^>]+>/gi, '');
        el.text(strClean);
    }, 0);
  });
});

You can see this ugly code at work here: http://jsfiddle.net/v4LhV/3/

Alan Moore
Josh Smith
  • Here is a fairly [popular discussion](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) on the subject (*the gist of it being - you cannot parse (x)html with regex*) – Gabriele Petrioli Nov 10 '10 at 09:38
  • Excellent comment. This really should be posted as an answer, though. – Josh Smith Nov 10 '10 at 10:54

1 Answer


As you have a fully functional DOM parser at hand, namely the browser, why not parse the pasted markup with .html() into an off-screen element, then walk through it and remove whatever you do not want with .unwrap()? That strips a disallowed tag while keeping the text and allowed tags inside it.
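
A minimal sketch of that suggestion, assuming jQuery and a browser DOM are available. The names `ALLOWED_TAGS`, `isAllowedTag` and `cleanPastedHtml` are hypothetical and introduced only for illustration. The whitelist comes from the question. Dropping `script`/`style` elements outright and stripping every attribute except `href` are assumptions about what "extraneous formatting" means. Plain `innerHTML` stands in for `.html()` when filling the scratch element, because `innerHTML` does not run pasted `<script>` blocks.

```javascript
// Tags the editor already permits (taken from the question).
var ALLOWED_TAGS = ['b', 'i', 'br', 'a'];

// Pure helper: case-insensitive whitelist check on a tag name.
function isAllowedTag(tagName) {
  return ALLOWED_TAGS.indexOf(String(tagName).toLowerCase()) !== -1;
}

// Hypothetical helper (not from the original answer): returns cleaned markup.
// Requires jQuery and a browser DOM.
function cleanPastedHtml(dirtyHtml) {
  // Let the browser parse the markup into a <div> that is never attached
  // to the page. innerHTML does not execute <script> blocks.
  var scratch = document.createElement('div');
  scratch.innerHTML = dirtyHtml;
  var $scratch = $(scratch);

  // Assumption: these elements should vanish together with their contents.
  $scratch.find('script, style').remove();

  // find('*') returns a snapshot, so changing the tree inside the loop is safe.
  $scratch.find('*').each(function () {
    var $node = $(this);
    if (isAllowedTag(this.nodeName)) {
      // Assumption: drop every attribute except href on links.
      var names = [];
      for (var i = 0; i < this.attributes.length; i++) {
        names.push(this.attributes[i].name);
      }
      for (var j = 0; j < names.length; j++) {
        if (!(this.nodeName.toLowerCase() === 'a' && names[j] === 'href')) {
          $node.removeAttr(names[j]);
        }
      }
    } else if (this.childNodes.length) {
      // Disallowed wrapper: unwrap() removes the parent of the selected
      // nodes, so the element disappears and its children stay in place.
      $node.contents().unwrap();
    } else {
      // Disallowed and empty: nothing to keep.
      $node.remove();
    }
  });

  return $scratch.html();
}
```

In the question's paste handler, this could replace the regex line inside the existing `setTimeout`, for example `el.html(cleanPastedHtml(el.html()));`. Note that it reads and writes `.html()` where the original used `.text()`, since `.text()` discards the tags before anything can inspect them.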

Orbling
  • @John Smith, can you please provide the piece of work that solved your problem. I am having the same problem. – learning Oct 04 '11 at 07:31
  • @learning - It's Josh Smith, not John Smith you're after and best to post such a comment on the question itself, as this answer is more a suggestion of techniques than an implementation of them. Of course you could follow the same suggestions... – Orbling Oct 05 '11 at 14:53