
From the AWS docs, I understand that:

  • S3 key names can be any Unicode string whose UTF-8 encoding is at most 1024 bytes
  • When using the GET Object API, I need to URL-encode the key name to access it.

However, these rules seem too permissive.

For instance, if I create a key called '../../d', I get a 400 error when I try to access it with the GET Object API. Interestingly, I have no problem accessing '../d'.

Is there a document specifying what is and is not legal?

UsAaR33

2 Answers


According to AWS S3 documentation:

Although you can use any UTF-8 characters in an object key name, the following key naming best practices help ensure maximum compatibility with other applications. Each application may parse special characters differently. The following guidelines help you maximize compliance with DNS, web safe characters, XML parsers, and other APIs.

Please find below the Object Key Naming Guidelines from the official AWS S3 documentation:


Safe characters

The following character sets are generally safe for use in key names:

  • Alphanumeric characters: 0-9 a-z A-Z
  • Special characters: ! - _ . * ' ( )

NOTE ABOUT THE DELIMITER ("/")

The following are examples of valid object key names:

  • 4my-organization

  • my.great_photos-2014/jan/myvacation.jpg

  • videos/2014/birthday/video1.wmv

Note that the Amazon S3 data model is a flat structure: you create a bucket, and the bucket stores objects. There is no hierarchy of subbuckets or subfolders; however, you can infer logical hierarchy using keyname prefixes and delimiters as the Amazon S3 console does.

For example, if you use Private/taxdocument.pdf as a key, the console will show a Private folder with taxdocument.pdf in it.

Amazon S3 supports buckets and objects; there is no hierarchy in Amazon S3. However, the prefixes and delimiters in an object key name enable the Amazon S3 console and the AWS SDKs to infer hierarchy and introduce the concept of folders.
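To make the "flat keys, inferred folders" idea concrete, here is a minimal sketch in plain Python (no AWS calls). The helper name `list_level` is hypothetical; it only imitates the kind of grouping a console or SDK can do with a prefix and a delimiter, and is not the actual S3 implementation.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split a flat list of keys into the objects and 'folders' seen at one level."""
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains after the prefix, so present it as a folder.
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


keys = [
    "4my-organization",
    "my.great_photos-2014/jan/myvacation.jpg",
    "videos/2014/birthday/video1.wmv",
    "Private/taxdocument.pdf",
]

top_objects, top_folders = list_level(keys)
private_objects, private_folders = list_level(keys, prefix="Private/")
```

At the top level only `4my-organization` shows up as an object; everything else is grouped under `Private/`, `my.great_photos-2014/`, and `videos/`, even though no such folders exist as separate entities.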


Characters That Might Require Special Handling

The following characters in a key name may require additional code handling and will likely need to be URL encoded or referenced as HEX. Some of these are non-printable characters and your browser may not handle them, which will also require special handling:

  • Ampersand ("&")
  • 'At' symbol ("@")
  • Colon (":")
  • Comma (",")
  • Dollar ("$")
  • Equals ("=")
  • Plus ("+")
  • Question mark ("?")
  • ASCII character ranges 00–1F hex (0–31 decimal) and 7F (127 decimal)
  • Semicolon (";")
  • Space – Significant sequences of spaces may be lost in some uses (especially multiple spaces)
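As a rough illustration of the URL encoding mentioned above, the following sketch percent-encodes a key with Python's standard library. The helper name `encode_key` is made up for this example, and keeping "/" unencoded is an assumption that the key is placed in the path of a request URL. The AWS SDKs normally perform this step themselves, so this mainly matters when building URLs by hand.

```python
from urllib.parse import quote, unquote


def encode_key(key):
    """Percent-encode a key's UTF-8 bytes, leaving "/" as the path delimiter."""
    return quote(key, safe="/")


encoded_ascii = encode_key("a b&c=d+e.txt")
encoded_unicode = encode_key("photos/2014/été.jpg")
round_trip = unquote(encoded_unicode)
```

Here `encoded_ascii` is `a%20b%26c%3Dd%2Be.txt` and `encoded_unicode` is `photos/2014/%C3%A9t%C3%A9.jpg`. Note that `quote` also encodes some characters from the "safe" list (such as `!` and `*`), which is harmless because decoding restores the original key.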

Characters to Avoid

You should avoid the following characters in a key name because they require significant special handling to behave consistently across all applications.

  • Backslash ("\")
  • Caret ("^")
  • Grave accent / back tick ("`")
  • 'Greater Than' symbol (">")
  • 'Less Than' symbol ("<")
  • Left curly brace ("{")
  • Right curly brace ("}")
  • Right square bracket ("]")
  • Left square bracket ("[")
  • 'Pound' character ("#")
  • Non-printable ASCII characters (128–255 decimal characters)
  • Percent character ("%")
  • Quotation marks (""" and "'")
  • Tilde ("~")
  • Vertical bar / pipe ("|")
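The three lists above can be turned into a quick pre-upload check. This is only a sketch of the guidelines as quoted here, not an official validator: the function name `classify_key` and the report layout are invented, "/" is treated as safe because it is the delimiter, and the apostrophe is treated as safe because it appears in the safe list.

```python
import string

# Character sets transcribed from the guidelines quoted above (assumed complete).
SAFE = set(string.ascii_letters + string.digits + "!-_.*'()/")
SPECIAL = set("&@:,$=+?; ")
AVOID = set("\\^`><{}][#%\"~|")


def classify_key(key):
    """Report which characters of a key fall outside the safe set."""
    report = {"special_handling": set(), "avoid": set(), "unlisted": set()}
    for ch in key:
        code = ord(ch)
        if ch in SAFE:
            continue
        if ch in SPECIAL or code <= 0x1F or code == 0x7F:
            report["special_handling"].add(ch)
        elif ch in AVOID or 128 <= code <= 255:
            report["avoid"].add(ch)
        else:
            # Not mentioned in any of the three lists (code points above 255).
            report["unlisted"].add(ch)
    return report


clean = classify_key("videos/2014/birthday/video1.wmv")
risky = classify_key("a b[1].txt")
```

For `a b[1].txt`, the space lands in `special_handling` and the square brackets land in `avoid`, which matches the trouble with `[` reported in the comments further down.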
huyz
Manube
  • I have a scenario where I will receive the key as a parameter to an API to download that file. I am wondering if my API can somehow receive the key (abc/def/filename.png) as a path parameter – Silly Volley Sep 12 '19 at 11:08
  • A large reason to listen to the advice on avoiding certain characters stems from the fact that the AWS SDKs in various languages use XML libraries, and not always correctly. The python SDK doesn't uniformly URL-encode as described in these quotes, so you have to avoid [illegal XML characters](https://stackoverflow.com/questions/1707890/fast-way-to-filter-illegal-xml-unicode-chars-in-python) – Indigenuity Dec 30 '21 at 16:30
  • Can you take a look here? https://stackoverflow.com/questions/72703759/url-encoded-format-for-s3-event-notification – Jnl Jun 22 '22 at 07:31

The only restriction provided by Amazon is (as found in their Technical FAQ):

What characters are allowed in a bucket or object name?
A key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long.
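Note that the limit is on bytes, not characters, so non-ASCII keys reach it sooner. A minimal check, assuming the 1024-byte figure quoted above (the function name `key_length_ok` is just illustrative):

```python
def key_length_ok(key, limit=1024):
    """Return True if the key's UTF-8 encoding fits within the byte limit."""
    return len(key.encode("utf-8")) <= limit


ascii_ok = key_length_ok("a" * 1024)       # 1024 bytes
ascii_too_long = key_length_ok("a" * 1025)  # 1025 bytes
accented = key_length_ok("é" * 600)         # 600 characters, 1200 bytes
```

The last case fails even though the key has only 600 characters, because each "é" takes two bytes in UTF-8.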

Additional restrictions apply for Buckets (as found on the Rules for Bucket Naming section of their Bucket Restrictions and Limitations FAQ):

In all regions except for the US Standard region a bucket name must comply with the following rules. These result in a DNS compliant bucket name.

  • Bucket names must be at least 3 and no more than 63 characters long
  • Bucket name must be a series of one or more labels separated by a period (.), where each label:
    • Must start with a lowercase letter or a number
    • Must end with a lowercase letter or a number
    • Can contain lowercase letters, numbers and dashes
  • Bucket names must not be formatted as an IP address (e.g., 192.168.5.4)
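The bucket rules listed above translate fairly directly into a validator. The sketch below encodes only those four rules as quoted; the name `is_dns_compliant_bucket_name` is hypothetical, and the IP-address test is a simple "four dot-separated groups of digits" pattern rather than a full address parser.

```python
import re

# One label: starts and ends with a lowercase letter or digit, dashes allowed inside.
LABEL = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")
# Rough shape of an IPv4 address such as 192.168.5.4.
IP_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_dns_compliant_bucket_name(name):
    """Check a bucket name against the DNS-compliance rules quoted above."""
    if not 3 <= len(name) <= 63:
        return False
    if not all(LABEL.fullmatch(label) for label in name.split(".")):
        return False
    if IP_LIKE.fullmatch(name):
        return False
    return True


results = {
    name: is_dns_compliant_bucket_name(name)
    for name in ["my-bucket", "my.bucket-1", "My-Bucket", "ab", "a..b", "-abc", "192.168.5.4"]
}
```

Only `my-bucket` and `my.bucket-1` pass; the others fail on uppercase letters, length, an empty label, a leading dash, and the IP-address format, respectively.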

Less restrictive rules apply to the US Standard region. Please see the FAQs for additional information and some examples. Hope it helps!

Community
Viccari
  • I'm concerned about the undocumented restrictions on object (key) names. Amazon claims any unicode works, but clearly '../../word' does not. I'm wondering what else isn't supported... – UsAaR33 Jul 03 '12 at 04:49
  • Looks like the answer is "No, there is not a document". I would recommend asking your question on the AWS forums. On a side note, here is a similar question (and answer :) ) : http://stackoverflow.com/questions/3146380/what-are-the-restrictions-on-object-ids-in-amazon-s3 – Viccari Jul 03 '12 at 10:48
  • @Downvoter: it would be good to have feedback as why you think the answer does not address the question. Or even better, an edit to the answer. – Viccari Mar 21 '14 at 17:43
  • `[` will give you grief (as I spent the past 2 hours troubleshooting) – dangel Jul 19 '19 at 03:40
  • Here's how to encode those pesky characters https://stackoverflow.com/questions/62818659/s3-is-encoding-urls-with-spaces-and-symbols-to-unkown-format – Neoheurist May 18 '21 at 17:45
  • There is a "yellow box" on the documentation page saying that the command-line and console apps have additional limitations on keys regarding leading "../" and strips trailing "." (which I guess is a windows filename convention). – Gem Taylor Oct 27 '21 at 10:48
  • Isn't the question about key names, and not bucket names? – AbdullahC Apr 07 '22 at 14:26
  • @AbdullahC yeah, the initial restriction is valid for both keys and buckets: "What characters are allowed in a bucket /or object/ name" – Viccari Apr 07 '22 at 14:49