42

Are there restricted character patterns within Azure TableStorage RowKeys? I've not been able to find any documented via numerous searches. However, I'm getting behavior that implies such in some performance testing.

I've got some odd behavior with RowKeys consisting on random characters (the test driver does prevent the restricted characters (/ \ # ?) plus blocking single quotes from occurring in the RowKey). The result is I've got a RowKey that will insert fine into the table, but cannot be queried (the result is InvalidInput). For example:

RowKey: 9}5O0J=5Z,4,D,{!IKPE,~M]%54+9G0ZQ&G34!G+

Attempting to query by this RowKwy (equality) will result in an error (both within our app, using Azure Storage Explorer, and Cloud Storage Studio 2). I took a look at the request being sent via Fiddler:

GET /foo()?$filter=RowKey%20eq%20'9%7D5O0J=5Z,4,D,%7B!IKPE,~M%5D%54+9G0ZQ&G34!G+' HTTP/1.1

It appears the %54 in the RowKey is not escaped in the filter. Interestingly, I get similar behavior for batch requests to table storage with URIs in the batch XML that include this RowKey. I've also seen similar behavior for RowKeys with embedded double quotes, though I have not isolated that pattern yet.

Has anyone co me across this behavior? I can easily restrict additional characters from occurring in RowKeys, but would really like to know the 'rules'.

abatishchev
  • 98,240
  • 88
  • 296
  • 433
MikeN
  • 463
  • 1
  • 4
  • 5
  • **See Also**: [How can I encode Azure storage table row keys and partition keys?](https://stackoverflow.com/q/21144694/1366033) – KyleMit Apr 14 '21 at 02:29

5 Answers5

64

The following characters are not allowed in PartitionKey and RowKey fields:

  • The forward slash (/) character
  • The backslash (\) character
  • The number sign (#) character
  • The question mark (?) character

Further Reading: Azure Docs > Understanding the Table service data model

Ian Kemp
  • 28,293
  • 19
  • 112
  • 138
Igorek
  • 15,716
  • 3
  • 54
  • 92
  • 3
    It's also worth noting that while you can have a % in a RowKey/PartitionKey you will have problems trying to retrieve any row that has a % in its key. MS are aware of the problem and looking into solutions: http://blogs.msdn.com/b/windowsazurestorage/archive/2012/05/28/partitionkey-or-rowkey-containing-the-percent-character-causes-some-windows-azure-tables-apis-to-fail.aspx – knightpfhor Jul 17 '12 at 10:33
  • Just found out that a . doesn't work either. Nowhere to be found. – André Boonzaaijer Feb 11 '20 at 09:16
  • I've had no issue with periods – jjxtra Apr 09 '20 at 00:27
30

public static readonly Regex DisallowedCharsInTableKeys = new Regex(@"[\\\\#%+/?\u0000-\u001F\u007F-\u009F]");

Detection of Invalid Table Partition and Row Keys:

bool invalidKey = DisallowedCharsInTableKeys.IsMatch(tableKey);

Sanitizing the Invalid Partition or Row Key:

string sanitizedKey = DisallowedCharsInTableKeys.Replace(tableKey, disallowedCharReplacement);

At this stage you may also want to prefix the sanitized key (Partition Key or Row Key) with the hash of the original key to avoid false collisions of different invalid keys having the same sanitized value.

Do not use the string.GetHashCode() though since it may produce different hash code for the same string and shall not be used to identify uniqueness and shall not be persisted.

I use SHA256: https://msdn.microsoft.com/en-us/library/s02tk69a(v=vs.110).aspx

to create the byte array hash of the invalid key, convert the byte array to hex string and prefix the sanitized table key with that.

Also see related MSDN Documentation: https://msdn.microsoft.com/en-us/library/azure/dd179338.aspx

Related Section from the link: Characters Disallowed in Key Fields

The following characters are not allowed in values for the PartitionKey and RowKey properties:

The forward slash (/) character

The backslash (\) character

The number sign (#) character

The question mark (?) character

Control characters from U+0000 to U+001F, including:

  • The horizontal tab (\t) character

  • The linefeed (\n) character

  • The carriage return (\r) character

Control characters from U+007F to U+009F

Note that in addition to the mentioned chars in the MSDN article, I also added the % char to the pattern since I saw in a few places where people mention it being problematic. I guess some of this also depends on the language and the tech you are using to access the table storage.

If you detect additional problematic chars in your case, then you can add those to the regex pattern, nothing else needs to change.

Michael Richardson
  • 4,213
  • 2
  • 31
  • 48
Dogu Arslan
  • 3,292
  • 24
  • 43
  • 1
    it worth to add "+" sign to your regex as mentioned by Hans Olav. I think you added % sign there with the same reasoning - it's allowed in the key, but hard to query as well as '+' sign. I'm not 100% sure but seems that all URL unfriendly chars must be encoded for querying... – Alexey Shcherbak Sep 16 '16 at 01:34
  • 1
    @DoguArslan - The regex starts with **verbatim** string `\\\\ ` . If I am not wrong, in case of verbatim, only `\\ ` are sufficient in pattern to match a `\ ` in string? Or am I missing something? – Shishir Gupta Jul 04 '20 at 11:13
  • 1
    @ShishirGupta I came across this article recently, and had to take a moment to parse the RegEx as well. When verifying through testing it became clear why the four backslashes are used here. It's only checking for the existence of `\\ ` in the string because any existence of a single `\ ` in data transmitted over HTTP can act as an escape character and give unexpected results to begin with. Therefore, that data needs to be sanitized before sending over HTTP, and only `\\ ` should be encountered by the time it gets to Azure storage. – AR9 Sep 22 '21 at 05:42
  • I do not have any trouble querying partition or row keys containing the % or + characters (using storage explorer to test). Seems these issues were resolved by Microsoft as some point? – Zach Oct 11 '21 at 18:25
10

I just found out (the hard way) that the '+' sign is allowed, but not possible to query in PartitionKey.

Hans Olav
  • 349
  • 3
  • 5
  • 1
    The amazing thing is that the single row insert does not check for the `+` which gives you trouble when trying to delete the item with a `+` in its row key. The _batch_ insert, however rejects any such items. – Good Night Nerd Pride Nov 21 '19 at 13:35
9

I found that in addition to the characters listed in Igorek's answer, these also can cause problems (e.g. inserts will fail):

  • |
  • []
  • {}
  • <>
  • $^&

Tested with the Azure Node.js SDK.

laktak
  • 57,064
  • 17
  • 134
  • 164
  • 2
    This answer should be backed up with some examples that illustrates the issue. I've successfully been able to do batch requests with all of the above listed characters as a `RowKey`. Haven't tested `PartitionKey` don't know if there's a difference. The limitation, if any, is probably in the Node.js SDK (or only applicable to an older Azure Storage API version). The only thing I found that wasn't documented is that the single quote `'` has to be escaped in URLs, not using URI encoding but using two consecutive single quote `'` characters. i.e. `'` -> `''`. – John Leidegren Feb 09 '14 at 15:48
  • I have also encountered problems when using white spaces in the partition key – Michael Staples Jul 03 '19 at 11:05
3

I transform the key using this function:

private static string EncodeKey(string key)
{
    return HttpUtility.UrlEncode(key);
}

This needs to be done for the insert and for the retrieve of course.

KyleMit
  • 30,350
  • 66
  • 462
  • 664
Gaston
  • 49
  • 5