Why doesn't URI.escape escape single quotes?

URI.escape("foo'bar\" baz")
=> "foo'bar%22%20baz"
John Bachir
  • Because a single quote is a legal URI character. http://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid – Dave Newton Oct 11 '11 at 20:17
  • Well.. they are reserved though, meaning they are allowed but have special syntactical meaning. In this case I don't want them to have special meaning, it's data a user entered and it should not be interpreted as syntax but as data by the browser and by the web application and all the layers in between. So I guess the real question is, what does "escape" mean... – John Bachir Oct 11 '11 at 21:33
  • According to [the docs](http://apidock.com/ruby/URI/Escape/escape) it'll escape "unsafe" chars as defined by `REGEXP::UNSAFE`. You can pass in your own. – Dave Newton Oct 11 '11 at 21:45
  • FWIW I'm here because AWS Cloudfront expects single quotes in URIs to be escaped to `%27`, when you try to run an invalidation. So, it's legal to some and not legal to others I guess. – Max Williams Mar 28 '17 at 09:36

4 Answers


For the same reason it doesn't escape ? or / or :, and so forth. URI.escape() only escapes characters that cannot be used in URLs at all, not characters that have a special meaning.

What you're looking for is CGI.escape():

require "cgi"
CGI.escape("foo'bar\" baz")
=> "foo%27bar%22+baz"
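
As a rough illustration of that difference (a sketch only; the `search_url` helper and the example.com endpoint are made up for this example), CGI.escape also encodes characters that carry syntactic meaning, which is what you want when user input becomes a single query-string value:

```ruby
require "cgi"

# Characters with special meaning in a URL are encoded too,
# so user input cannot be mistaken for URL syntax.
puts CGI.escape("a/b?c=d&e:f") # a%2Fb%3Fc%3Dd%26e%3Af

# Hypothetical helper and endpoint, used only for illustration.
def search_url(query)
  "http://example.com/search?q=#{CGI.escape(query)}"
end

puts search_url("foo'bar\" baz") # http://example.com/search?q=foo%27bar%22+baz
```

Only escape individual values this way. Running CGI.escape over a whole URL would also encode the "/", "?" and ":" that give the URL its structure.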
molf
  • I'm not sure how that's at all helpful. CGI escape doesn't do the same things that URI escape did, and they aren't interchangeable. – cbmanica Jan 11 '13 at 19:00
  • @cbmanica. This is true, but most people who use this library are actually looking to URL-encode a string. See [this excellent answer](http://stackoverflow.com/a/13059657/182590) for a great rundown on the alternatives. – Mark Thomas Mar 09 '14 at 20:28

This is an old question, but the answer hasn't been updated in a long time. I thought I'd update this for others who are having the same problem. The solution I found was posted here: use ERB::Util.url_encode if you have the erb module available. This took care of single quotes & * for me as well.

Note that CGI::escape encodes spaces as plus signs (+) rather than as %20.
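
To make the difference concrete, here is a small sketch comparing the two methods on the same input (the sample string is made up for illustration):

```ruby
require "cgi"
require "erb"

input = "it's a *"

puts CGI.escape(input)           # it%27s+a+%2A     (space becomes "+")
puts ERB::Util.url_encode(input) # it%27s%20a%20%2A (space becomes "%20")
```

Both escape the single quote and the asterisk; they differ only in how they treat the space.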

Foo L

According to the docs, URI.escape(str [, unsafe]) uses a regexp that matches all symbols that must be replaced with codes. By default the method uses REGEXP::UNSAFE. When this argument is a String, it represents a character set.

In your case, to modify URI.escape to escape even the single quotes you can do something like this ...

require "uri"
# Matches every character outside the RFC 3986 unreserved set, single quote included.
reserved_characters = /[^a-zA-Z0-9\-._~]/
URI.escape(YOUR_STRING, reserved_characters)

Explanation: Some info on the spec ...

All parameter names and values are escaped using the [RFC3986] percent-encoding (%xx) mechanism. Characters not in the unreserved character set ([RFC3986] section 2.3) must be encoded. Characters in the unreserved character set must not be encoded. Hexadecimal characters in encodings must be upper case. Text names and values must be encoded as UTF-8 octets before percent-encoding them per [RFC3629].
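
The rules quoted above can be sketched without URI.escape at all. This is only an illustration: `percent_encode` and `NOT_UNRESERVED` are hypothetical names, not part of any library, and the sketch assumes the input can be converted to UTF-8.

```ruby
# Hypothetical helper (not part of any library): percent-encode every
# character outside the RFC 3986 unreserved set.
NOT_UNRESERVED = /[^A-Za-z0-9\-._~]/

def percent_encode(value)
  value.to_s.encode("UTF-8").gsub(NOT_UNRESERVED) do |char|
    # A multi-byte character yields one upper-case %XX triplet per UTF-8 byte.
    char.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

puts percent_encode("foo'bar\" baz") # foo%27bar%22%20baz
puts percent_encode("a-b_c.d~e")     # a-b_c.d~e
puts percent_encode("é")             # %C3%A9
```

Unreserved characters pass through untouched, everything else (the single quote included) is encoded, and the hex digits come out upper case.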

King'ori Maina
  • URI.escape is deprecated. Even though it still exists, it's best to use one of the other solutions posted here. – Mark Thomas Mar 09 '14 at 20:19
  • @MarkThomas I agree. I put this here for archive purposes to elaborate how to modify the regex as it was mentioned in passing. – King'ori Maina Mar 11 '14 at 08:11

I know this has been answered, but what I wanted was something slightly different, and I thought I might as well post it: I wanted to keep the "/" in the URL but escape all the other non-standard characters. I did it like this:

require "erb"

# public_filename is a *nix file path, like
# "/images/isn't/this a /horrible filepath/hello.png"
public_filename.split("/").map { |s| ERB::Util.url_encode(s) }.join("/")
=> "/images/isn%27t/this%20a%20/horrible%20filepath/hello.png"

I needed to escape the single quote because I was writing a cache invalidation for AWS CloudFront, which didn't like single quotes and expected them to be escaped. The above should produce a URI that is safer than what the standard URI.escape gives you but that still looks like a URI (CGI.escape breaks the URI format by escaping "/").
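
One detail worth checking if you reuse this approach: `split("/")` drops trailing empty strings, so a path ending in "/" loses its final slash. A minimal sketch of a variant that keeps it (`encode_path` is a hypothetical helper name, not from any library):

```ruby
require "erb"

# Hypothetical helper: encode each path segment but keep the "/" separators.
# The -1 limit makes split keep trailing empty strings, so a trailing
# slash is preserved.
def encode_path(path)
  path.split("/", -1).map { |segment| ERB::Util.url_encode(segment) }.join("/")
end

puts encode_path("/images/isn't/this a /hello.png") # /images/isn%27t/this%20a%20/hello.png
puts encode_path("/images/a b/")                    # /images/a%20b/
```

Without the `-1` limit, the second call would return "/images/a%20b" instead.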

Max Williams