I love Sanitize. It's an amazing utility. The only issue I have with it is that setting up a development environment takes forever, because it depends on Nokogiri, which is slow and painful to compile. Are there any libraries that do what Sanitize does (or at least roughly what it does) without using Nokogiri? This would help enormously!

Kick Buttowski
T145
  • Why not just prep Nokogiri differently? – user2864740 Dec 03 '13 at 01:30
  • Because Nokogiri wraps libxml2, which is the reference-implementation of XML everywhere except, you know... – Phlip Dec 03 '13 at 03:46
  • Downvote cancelled because this is a real question with the potential for a good answer. The lame answer is to rewrite the core of Sanitize using REXML and its XPath. One could iterate over HTML, compare every tag name to a white-list, and kill every tag not on the white list. This will be slow; if that becomes an issue then install VirtualBox, install Ubuntu Saucy Salamander on it, install RVM on that, and let that Nokogiri roar. – Phlip Dec 03 '13 at 03:49
  • If I understand correctly you have a command-line ruby script that does the Sanitize work. If so, why not instead wrap it in a simple Sinatra, etc, service and POST your requests? – Ken Y-N Dec 03 '13 at 04:15
  • If this is a serious pain point, you could try preparing a [Vagrant box](http://www.vagrantup.com/) where everything's already compiled for you. – tadman Dec 03 '13 at 04:21

1 Answer

Rails has its own SanitizeHelper.

According to the documentation at http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html:

This sanitize helper will html encode all tags and strip all attributes that aren’t specifically allowed.

It also strips href/src tags with invalid protocols, like javascript: especially. It does its best to counter any tricks that hackers may use, like throwing in unicode/ascii/hex values to get past the javascript: filters. Check out the extensive test suite.

You can use it in a view like so:

<%= sanitize @article.body %>

The linked documentation covers more customization options, such as:

Custom Use (only the mentioned tags and attributes are allowed, nothing else)

<%= sanitize @article.body, tags: %w(table tr td), attributes: %w(id class style) %>
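To illustrate the allow-list idea from the quoted documentation (encode every tag and drop every attribute that isn't explicitly allowed), here is a minimal sketch that uses only Ruby's standard library. `TinySanitizer` and its constants are hypothetical names made up for this example; this is not how Rails or the Sanitize gem implement it. It is regex-based, does not check URL protocols such as `javascript:`, and should not be treated as a safe replacement for a real sanitizer. It may still be useful to know the idea, since as far as I know Rails 4.2 and later build the helper on rails-html-sanitizer, which itself depends on Nokogiri through Loofah.

```ruby
require "erb"

# Hypothetical illustration only: not part of Rails or the Sanitize gem,
# and not hardened against real-world XSS tricks.
module TinySanitizer
  # A whole tag: optional "/", a tag name, then optional attributes.
  TAG  = /\A<(?<close>\/?)(?<name>[a-zA-Z][a-zA-Z0-9]*)(?<attrs>(?:\s[^<>]*)?)>\z/
  # Only double-quoted attributes are recognised; anything else is dropped.
  ATTR = /([a-zA-Z][-a-zA-Z0-9]*)\s*=\s*"([^"]*)"/

  def self.sanitize(html, tags: [], attributes: [])
    # Splitting on a capturing group keeps the tag-like chunks in the result.
    html.split(/(<[^<>]*>)/).map { |chunk|
      m = TAG.match(chunk)
      if m && tags.include?(m[:name].downcase)
        rebuild(m, attributes)
      else
        # Plain text and disallowed tags are HTML-encoded, not removed.
        ERB::Util.html_escape(chunk)
      end
    }.join
  end

  def self.rebuild(m, allowed_attrs)
    return "</#{m[:name]}>" unless m[:close].empty?

    kept = m[:attrs].scan(ATTR).select { |name, _| allowed_attrs.include?(name.downcase) }
    # Values are re-escaped, so existing entities would be double-encoded.
    attrs = kept.map { |name, value| %( #{name}="#{ERB::Util.html_escape(value)}") }.join
    "<#{m[:name]}#{attrs}>"
  end
end

dirty = '<b class="x" onclick="evil()">hi</b><script>alert(1)</script>'
puts TinySanitizer.sanitize(dirty, tags: %w(b), attributes: %w(class))
# prints: <b class="x">hi</b>&lt;script&gt;alert(1)&lt;/script&gt;
```

The keyword arguments deliberately mirror the `tags:` and `attributes:` options shown above, so the call reads the same way as the Rails helper even though the implementation is far simpler.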
Zero Fiber
  • Could you provide a link to some example usages of this in a normal class instance? My current interpretation of what you're saying is that a line of code for using SanitizeHelper would look like "SanitizeHelper::sanitize(html, options)" as opposed to normal Sanitize (after a require statement): "Sanitize.clean(html, options)". – T145 Dec 04 '13 at 14:08
  • @T145 If by usage in class instance, you mean using sanitize in the controller, then you can see this question - http://stackoverflow.com/questions/3985989/using-sanitize-within-a-rails-controller – Zero Fiber Dec 04 '13 at 14:12
  • Thanks a lot, this has helped incredibly! I went ahead and replaced all of my `Sanitize.clean` methods w/ `ActionController::Base.helpers.sanitize` and now everything works brilliantly! – T145 Dec 04 '13 at 14:16
  • Just as a quick followup, if I'm not calling from a controller (e.g. from a lib component), what would I use? Using that method from outside a controller throws exceptions. – T145 Dec 04 '13 at 17:02
  • @T145 You can include the helper and it should work - `include ActionView::Helpers::SanitizeHelper` – Zero Fiber Dec 04 '13 at 17:14
  • There is yet another problem I am having. At first it started working just fine, but then I discovered that some HTML that was desired to be simply sanitized was sanitized to extinction. Is there a way to resolve this? – T145 Dec 05 '13 at 15:29
  • @T145 Can you explain "sanitized to extinction" with an example? – Zero Fiber Dec 05 '13 at 16:17
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/42600/discussion-between-sampriti-panda-and-t145) – Zero Fiber Dec 05 '13 at 16:31