110

I have an untrusted string that I want to show as text in an HTML page. I need to escape the chars '<' and '&' as HTML entities. The less fuss the better.

I'm using UTF8 and don't need other entities for accented letters.

Is there a built-in function in Ruby or Rails, or should I roll my own?

the Tin Man
  • 158,662
  • 42
  • 215
  • 303
kch
  • 77,385
  • 46
  • 136
  • 148
  • 2
    [According to the OWASP](https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#RULE_.231_-_HTML_Escape_Before_Inserting_Untrusted_Data_into_HTML_Element_Content), the following six characters should be escaped for proper XSS protection in HTML element content: `&<>"'/` – sffc Mar 12 '14 at 09:05

8 Answers

155

Check out the Ruby CGI class. It has methods to encode and decode HTML as well as URLs.

require 'cgi'

CGI::escapeHTML('Usage: foo "bar" <baz>')
# => "Usage: foo &quot;bar&quot; &lt;baz&gt;"
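To illustrate the "encode and decode" remark above, here is a minimal sketch of a round trip through the CGI helpers. The sample strings are made up for illustration; only methods from Ruby's standard `cgi` library are assumed.

```ruby
require 'cgi'

# Hypothetical untrusted input that contains every character escapeHTML rewrites.
untrusted = %q(<a href="x">Tom & Jerry's</a>)

escaped  = CGI.escapeHTML(untrusted)   # safe to place in HTML element content
restored = CGI.unescapeHTML(escaped)   # reverses the escaping

# The URL counterparts percent-encode instead of producing entities.
query   = CGI.escape('a b&c')          # => "a+b%26c"
decoded = CGI.unescape(query)          # => "a b&c"

puts escaped
puts restored == untrusted
```

The round trip should give back the original string, which makes this pair handy in tests that check escaped output.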
Christopher Bradford
  • 2,220
  • 2
  • 15
  • 12
  • 13
    Thanks, this is great since it can be done from the controllers. Not that I'd do that, of course. – Dan Rosenstark Sep 02 '11 at 22:01
  • 2
    This is useful in functional/integration tests, for checking the correctness of content inserted into a template (when the content is supposed to be HTML-escaped). – Alex D Apr 15 '13 at 17:32
  • If the content is being displayed on a client's website rather than your own (where you can't control the view), what's the problem with escaping the HTML before inserting it into the database? Is there another workaround? – n00b May 11 '13 at 20:10
  • Right - escaping before entering into the database is great. You just want to make sure you don't have any old un-escaped hacks in there from before you added it... – Kevin Jun 05 '13 at 15:51
  • 5
    I like its synonym more: [CGI.escape_html](http://apidock.com/ruby/v1_9_3_392/CGI/escape_html/class) – Trantor Liu Dec 26 '14 at 06:50
99

The h helper method:

<%=h "<p> will be preserved" %>
Trevor Bramble
  • 8,523
  • 4
  • 29
  • 29
  • Well, it also escapes >, which is unnecessary, but it'll do. – kch Mar 28 '09 at 15:16
  • You can use parentheses to print some with h and some without. <%= h(" " %> – Trevor Bramble Mar 28 '09 at 15:18
  • Now that would be silly. I don't care much if it gets escaped or not. I'm just noting it's not required per the html specs. – kch Mar 28 '09 at 15:20
  • 13
    It's *occasionally* required in XHTML due to the XML spec's rather annoying insistence that ‘]]>’ be kept out of text (see the ‘CharData’ production). This makes it generally easier (and harmless) to always escape it. – bobince Mar 28 '09 at 21:55
  • 27
    for those interested `h` is an alias for `html_escape` – lightswitch05 May 15 '14 at 23:03
  • It's defined in `ActiveSupport::SafeBuffer` https://github.com/rails/rails/blob/master/activesupport/lib/active_support/core_ext/string/output_safety.rb – Dorian Feb 23 '17 at 22:08
79

In Ruby on Rails 3 and later, HTML is escaped by default.

For non-escaped strings use:

<%= raw "<p>hello world!</p>" %>
RSK
  • 17,210
  • 13
  • 54
  • 74
29

ERB::Util.html_escape can be used anywhere. It is available without using require in Rails.
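Outside Rails, the same method is reachable by requiring `erb` from the standard library. The following is a minimal sketch with made-up sample values; the variable names are only for illustration.

```ruby
require 'erb'

# html_escape calls to_s on its argument, so non-strings are accepted too.
title  = ERB::Util.html_escape('Usage: foo "bar" <baz>')
number = ERB::Util.html_escape(42)

# h is the short alias for the same method.
short = ERB::Util.h('<baz>')

puts title
puts number
puts short
```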

Viktor Trón
  • 8,774
  • 4
  • 45
  • 48
  • this is actually using `CGI.escapeHTML` underneath – akostadinov Feb 03 '15 at 22:11
  • @akostadinov - the result is different however. For instance, ERB::Util.html_escape will turn apostrophes into &#39; whereas CGI::escapeHTML will not – Louis Sayers Aug 05 '15 at 07:00
  • @LouisSayers, I can't see how that can happen: ``` [43] pry(main)> show-source ERB::Util.html_escape From: /usr/share/ruby/erb.rb @ line 945: Owner: # Visibility: public Number of lines: 3 def html_escape(s) CGI.escapeHTML(s.to_s) end ``` – akostadinov Aug 05 '15 at 08:08
  • @akostadinov - hmm... Just ran again and yes, they produced the same output. I swear this produced different results when I ran this at work (perhaps different erb / cgi version behaviour?). I'll have to see why I got a different result at work tomorrow. – Louis Sayers Aug 05 '15 at 09:10
19

As an addition to Christopher Bradford's answer: since most people don't use CGI nowadays, you can also use Rack to escape HTML anywhere:

require 'rack/utils'
Rack::Utils.escape_html('Usage: foo "bar" <baz>')
J-_-L
  • 9,079
  • 2
  • 40
  • 37
16

You can use either h() or html_escape(), but most people use h() by convention. h() is an alias for html_escape() in Rails.

In your controller:

@stuff = "<b>Hello World!</b>"

In your view:

<%=h @stuff %>

If you view the HTML source, you will see that the output is encoded as &lt;b&gt;Hello World!&lt;/b&gt;, so the data is not actually bolded.

In the browser it will be displayed as the literal text <b>Hello World!</b>
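The controller and view steps above can be approximated in plain Ruby with the standard-library ERB class. This is only a sketch: the local variable `stuff` stands in for `@stuff`, and the template string is a made-up stand-in for a Rails view.

```ruby
require 'erb'

stuff = '<b>Hello World!</b>'   # stands in for @stuff set in the controller

# Plain ERB does not escape automatically, so the template calls h explicitly.
template = ERB.new('<p><%= ERB::Util.h(stuff) %></p>')
html = template.result(binding)

puts html
```

The rendered source should contain the entity-encoded tags, which the browser then shows as literal text rather than bold markup.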

Brian R. Bondy
  • 339,232
  • 124
  • 596
  • 636
13

Comparison of the different methods:

> CGI::escapeHTML("quote ' double quotes \"")
=> "quote &#39; double quotes &quot;"

> Rack::Utils.escape_html("quote ' double quotes \"")
=> "quote &#x27; double quotes &quot;"

> ERB::Util.html_escape("quote ' double quotes \"")
=> "quote &#39; double quotes &quot;"

I wrote my own to be compatible with Rails ActionMailer escaping:

require 'cgi'

# Escape &, <, > and " but leave apostrophes as literal characters.
def escape_html(str)
  CGI.escapeHTML(str).gsub("&#39;", "'")
end
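One possible variant of the helper above does the work in a single pass by mapping only the characters that should change. This is a sketch, not the answer's own method: the name `escape_html_keep_apostrophes` and the lookup table are hypothetical.

```ruby
require 'cgi'

# Hypothetical helper: escapes & < > " in one gsub pass and never touches apostrophes.
def escape_html_keep_apostrophes(str)
  table = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }
  str.to_s.gsub(/[&<>"]/, table)
end

sample = %q(quote ' double quotes " <b>&</b>)
puts escape_html_keep_apostrophes(sample)
```

For this sample it should agree with the two-step version that calls CGI.escapeHTML and then restores the apostrophes.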
Dorian
  • 22,759
  • 8
  • 120
  • 116
0

h() is also useful for escaping quotes.

For example, I have a view that generates a link from a text field, result[r].thtitle. The text could include single quotes. If I didn't escape result[r].thtitle in the confirm option, the JavaScript would break:

<%= link_to_remote "#{result[r].thtitle}",
  :url     => { :controller => :resource,
                :action     => :delete_resourced,
                :id         => result[r].id,
                :th         => thread },
  :html    => { :title => "<= Remove" },
  :confirm => h("#{result[r].thtitle} will be removed"),
  :method  => :delete %>

<a href="#" onclick="if (confirm('docs: add column &amp;apos;dummy&amp;apos; will be removed')) { new Ajax.Request('/resource/delete_resourced/837?owner=386&amp;th=511', {asynchronous:true, evalScripts:true, method:'delete', parameters:'authenticity_token=' + encodeURIComponent('ou812')}); }; return false;" title="&lt;= Remove">docs: add column 'dummy'</a>

Note: the :html title declaration is magically escaped by Rails.

Noddinoff
  • 93
  • 7