I've got Amazon S3 integrated with my hosting account at WP Engine. Everything works great except for files with a + character in their names.

For example, for a file named test+2.pdf, the following URL does not work: http://support.mcsolutions.com/wp-content/uploads/2011/11/test+2.pdf

The following is the Amazon URL. Notice that the + character is encoded. Is there a way to prevent or change this? http://mcsolutionswpe.s3.amazonaws.com/mcsupport/wp-content/uploads/2011/11/test%2b2.pdf

Other URLs work fine:

Amazon -> http://mcsolutionswpe.s3.amazonaws.com/mcsupport/wp-content/uploads/2011/11/test2.pdf

Website -> http://support.mcsolutions.com/wp-content/uploads/2011/11/test2.pdf

Justin W Hall

1 Answer

If I understand your question correctly, then no, there is no real way to change this.

The cause appears to be an unfortunate design decision made in S3 many years ago, one that cannot be fixed now because doing so would break too many other things. S3 uses an incorrect variant of URL-escaping (which includes, but is not quite limited to, "percent-encoding") in the path part of the URL, where the object's key is sent.

In the query string (the optional part of a URL after ? but before the fragment, if present, which begins with #), the + character is considered equivalent to [SPACE] (ASCII Dec 32, Hex 0x20).

...but in the path of a URL, this is not supposed to be the case.

...but in S3's implementation, it is.

So + doesn't actually mean +, it means [SPACE]... and therefore, + can't also mean +... which means that a different expression is required to convey + -- and that value is %2B, the url-escaped value of + (ASCII Dec 43, Hex 0x2B).
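The two interpretations described above can be sketched with Python's standard `urllib.parse` module. This is only an illustration of the decoding rules, not of S3's actual implementation; the variable names are made up for the example.

```python
from urllib.parse import unquote, unquote_plus

path_segment = "test+2.pdf"

# Standard path decoding: "+" stays a literal plus sign.
standard = unquote(path_segment)

# Query-string (form) style decoding, which is how S3 appears to
# treat the path: "+" is turned into a space.
form_style = unquote_plus(path_segment)

# "%2B" decodes to a literal "+" under either rule, which is why
# it is the only unambiguous way to express "+" here.
escaped = unquote_plus("test%2B2.pdf")

print(repr(standard), repr(form_style), repr(escaped))
```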

When you upload your files, the code you're using (assuming it understands this quirk, as it apparently does) converts the + into the format S3 expects (%2B), so the file must also be requested using %2B when you download it.
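As a rough sketch of what such code might do when building a public link, the helper below percent-encodes the object key before appending it to the bucket's hostname. The function name `s3_object_url` is hypothetical and is not taken from any plugin or SDK; the bucket and key come from the URLs in the question. Note that `quote` emits uppercase hex (`%2B`), which is equivalent to the lowercase `%2b` shown in the question.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str) -> str:
    """Build a public, virtual-hosted-style URL for an object key.

    Hypothetical helper for illustration only. quote() keeps "/" as a
    path separator and escapes "+" as %2B.
    """
    return f"http://{bucket}.s3.amazonaws.com/{quote(key, safe='/')}"


url = s3_object_url(
    "mcsolutionswpe",
    "mcsupport/wp-content/uploads/2011/11/test+2.pdf",
)
print(url)
```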

Strangely, but not surprisingly, if you store the file in S3 with a space in the path, you can request it with a +, a space, or even %20, and all three should fetch the file. So if seeing the + in the path is what you want, you can sort of work around the issue by saving the file with a space instead, though this workaround deserves to be described as a "hack" if ever a workaround did.

This tactic will not work with libraries that generate pre-signed GET URLs, unless they are specifically designed to ignore the standard behavior of S3 and do what you want instead. For public links, though, it should be essentially equivalent.
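To illustrate the space-based workaround, the sketch below takes a key stored with a space and produces its `%20` and `+` spellings, then checks that both map back to the same key under plus-as-space decoding. The key is a made-up example, and the code only models the decoding rule; it does not contact S3.

```python
from urllib.parse import quote, quote_plus, unquote_plus

key = "uploads/test 2.pdf"  # hypothetical key saved with a space

percent_form = quote(key)              # space becomes %20
plus_form = quote_plus(key, safe="/")  # space becomes +

# Under plus-as-space decoding, both spellings name the same key.
same_key = unquote_plus(percent_form) == unquote_plus(plus_form) == key

print(percent_form, plus_form, same_key)
```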

Michael - sqlbot