354

What are the allowed characters in a cookie name and value? Are they the same as for URLs, or some common subset?

The reason I'm asking is that I've recently hit some strange behavior with cookies that have - in their name, and I'm wondering whether it's browser-specific or my code is faulty.

YakovL
Esko

13 Answers

448

According to the ancient Netscape cookie_spec the entire NAME=VALUE string is:

a sequence of characters excluding semi-colon, comma and white space.

So - should work, and it does seem to be OK in browsers I've got here; where are you having trouble with it?

By implication of the above:

  • = is legal to include, but potentially ambiguous. Browsers always split the name and value on the first = symbol in the string, so in practice you can put an = symbol in the VALUE but not the NAME.

What isn't mentioned, because Netscape were terrible at writing specs, but seems to be consistently supported by browsers:

  • either the NAME or the VALUE may be empty strings

  • if there is no = symbol in the string at all, browsers treat it as the cookie with the empty-string name, ie Set-Cookie: foo is the same as Set-Cookie: =foo.

  • when browsers output a cookie with an empty name, they omit the equals sign. So Set-Cookie: =bar begets Cookie: bar.

  • commas and spaces in names and values do actually seem to work, though spaces around the equals sign are trimmed

  • control characters (\x00 to \x1F plus \x7F) aren't allowed
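The splitting rules listed above can be sketched in a few lines. This is only an illustration of the behaviour as described here, not how any particular browser implements it; `parseNameValue` and `toCookieHeaderPair` are hypothetical helper names.

```javascript
// Sketch of the browser-like splitting rules described above.
// parseNameValue is a hypothetical helper, not a browser API.
function parseNameValue(pair) {
  const eq = pair.indexOf("=");
  // No "=" at all: the whole string is the value of the empty-named cookie.
  if (eq === -1) {
    return { name: "", value: pair.trim() };
  }
  // Split on the FIRST "=" only; any later "=" stays in the value.
  // Whitespace around the equals sign is trimmed.
  return {
    name: pair.slice(0, eq).trim(),
    value: pair.slice(eq + 1).trim(),
  };
}

// Mirrors how a cookie is echoed back in the Cookie header:
// an empty name is sent without the equals sign.
function toCookieHeaderPair({ name, value }) {
  return name === "" ? value : `${name}=${value}`;
}
```

Under these assumptions, `parseNameValue("foo")` and `parseNameValue("=foo")` give the same result, and `a=b=c` keeps `b=c` as the value.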

What isn't mentioned and browsers are totally inconsistent about, is non-ASCII (Unicode) characters:

  • in Opera and Google Chrome, they are encoded to Cookie headers with UTF-8;
  • in IE, the machine's default code page is used (locale-specific and never UTF-8);
  • Firefox (and other Mozilla-based browsers) use the low byte of each UTF-16 code point on its own (so ISO-8859-1 is OK but anything else is mangled);
  • Safari simply refuses to send any cookie containing non-ASCII characters.

So in practice you cannot use non-ASCII characters in cookies at all. If you want to use Unicode, control codes or other arbitrary byte sequences, the cookie_spec demands you use an ad-hoc encoding scheme of your own choosing and suggests URL-encoding (as produced by JavaScript's encodeURIComponent) as a reasonable choice.
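As a minimal sketch of that suggestion, the two helpers below (hypothetical names) URL-encode a value before it goes into a cookie and decode it when it comes back. Whatever reads the cookie has to apply the matching decode step.

```javascript
// Hypothetical helpers: URL-encode on the way in, decode on the way out.
function encodeCookieValue(text) {
  // encodeURIComponent emits only ASCII letters, digits, - _ . ! ~ * ' ( )
  // and %XX escapes, so spaces, commas, semicolons and non-ASCII
  // characters never appear raw in the cookie.
  return encodeURIComponent(text);
}

function decodeCookieValue(raw) {
  return decodeURIComponent(raw);
}

// Example: a value with a semicolon, a comma, a space and non-ASCII text.
const original = "héllo; wörld, 日本";
const stored = encodeCookieValue(original);
const restored = decodeCookieValue(stored);
```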

In terms of actual standards, there have been a few attempts to codify cookie behaviour but none thus far actually reflect the real world.

  • RFC 2109 was an attempt to codify and fix the original Netscape cookie_spec. In this standard many more special characters are disallowed, as it uses RFC 2616 tokens (a - is still allowed there), and only the value may be specified in a quoted-string with other characters. No browser ever implemented the limitations, the special handling of quoted strings and escaping, or the new features in this spec.

  • RFC 2965 was another go at it, tidying up 2109 and adding more features under a ‘version 2 cookies’ scheme. Nobody ever implemented any of that either. This spec has the same token-and-quoted-string limitations as the earlier version and it's just as much a load of nonsense.

  • RFC 6265 is an HTML5-era attempt to clear up the historical mess. It still doesn't match reality exactly but it's much better than the earlier attempts: it is at least a proper subset of what browsers support, not introducing any syntax that is supposed to work but doesn't (like the previous quoted-string).

In 6265 the cookie name is still specified as an RFC 2616 token, which means you can pick from the alphanums plus:

!#$%&'*+-.^_`|~

In the cookie value it formally bans the (filtered by browsers) control characters and (inconsistently-implemented) non-ASCII characters. It retains cookie_spec's prohibition on space, comma and semicolon, plus for compatibility with any poor idiots who actually implemented the earlier RFCs it also banned backslash and quotes, other than quotes wrapping the whole value (but in that case the quotes are still considered part of the value, not an encoding scheme). So that leaves you with the alphanums plus:

!#$%&'()*+-./:<=>?@[]^_`{|}~

In the real world we are still using the original-and-worst Netscape cookie_spec, so code that consumes cookies should be prepared to encounter pretty much anything, but for code that produces cookies it is advisable to stick with the subset in RFC 6265.
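For code that produces cookies, the RFC 6265 subset described above can be checked with two regular expressions. This is a sketch; `isValidCookieName` and `isValidCookieValue` are hypothetical names, and the character classes are transcribed from the two lists given in this answer.

```javascript
// Cookie name: an RFC 2616 token (one or more alphanums or !#$%&'*+-.^_`|~).
const COOKIE_NAME = /^[A-Za-z0-9!#$%&'*+\-.^_`|~]+$/;

// Cookie value: printable ASCII minus space, double quote, comma, semicolon
// and backslash, optionally wrapped in one pair of double quotes.
const COOKIE_VALUE =
  /^(?:[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*|"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*")$/;

function isValidCookieName(name) {
  return COOKIE_NAME.test(name);
}

function isValidCookieValue(value) {
  return COOKIE_VALUE.test(value);
}
```

A producer could run both checks before emitting a Set-Cookie header and fall back to URL-encoding the value when `isValidCookieValue` fails.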

Boris Verkhovskiy
bobince
  • @bobince Do you mean that the RFC states that cookie values can have the `;` character as long as it is surrounded by double-quotes? As such: `Set-Cookie: Name=Va";"lue; Max-Age=3600` – Pacerier Jul 03 '12 at 13:34
  • @Pacerier: the whole value would have to be a quoted-string, so it would have to be `Name="Va;lue"; max-age...`. It doesn't work in browsers and it's not permitted in RFC 6265, which is proposed to replace 2965 and tries to reflect reality a little better. – bobince Jul 03 '12 at 13:47
  • @bobince - I know this is old, but am I reading your answer correctly to mean that spaces are not technically allowed in cookie values? *"excluding semi-colon, comma and **white space**"* [emphasis mine] – Adam Rackis Feb 08 '13 at 03:28
  • @Adam: Yes, if you're going by the Netscape spec or RFC 6265, whitespace is not permitted in a raw (un-DQUOTEd) cookie value. It does nonetheless work in browsers I've tried, but I wouldn't rely on it. – bobince Feb 08 '13 at 11:29
  • So even non-encoded Chinese or Russian Unicode characters are allowed in the cookie value? – Timo Huovinen Sep 24 '13 at 10:37
  • @Timo: RFC 6265 says no; Netscape cookie spec sort-of implies yes, but that spec is so loosely written it is dangerous to draw any implications not explicitly stated. In reality of course browser behaviour is too inconsistent. – bobince Sep 24 '13 at 11:34
  • Why does http://docs.oracle.com/javaee/7/api/javax/servlet/http/Cookie.html#setValue(java.lang.String) and http://stackoverflow.com/a/1969287/14731 imply that version 0 values may not contain equal signs? I don't see this mentioned in the spec. – Gili Jun 16 '14 at 22:00
  • Some servers (Tomcat application server) will discard "invalid" cookie values. – Barett Aug 21 '14 at 18:52
  • @Gili The spec implies that = is invalid because it is a separator character, and not specifically mentioned as illegal in either the name or values. It is not specifically stated, however the v0 spec is generally accepted to be poorly written. – Barett Aug 21 '14 at 18:53
  • @Barett, makes sense. Given `KEY=VALUE` (with no whitespace) there would be no way to differentiate between a key ending with `=` and the separator character. Thanks for the clarification. – Gili Aug 22 '14 at 14:34
  • I know this is old, but apparently Firefox also encodes non-ASCII characters with UTF-8. [Encoding scheme used for cookies](http://stackoverflow.com/q/25665703/1290953) – Bharat Khatri Sep 09 '14 at 09:38
  • Regarding 6265, does this say that the `=` character (0x3D) is valid? You have listed it as a valid character in your answer within the `0x3C-5B` range. – Adam Burley Oct 23 '15 at 15:32
  • `=` is explicitly valid inside cookie-value in both RFC 6265 and original cookie_spec as well as supported by all browsers. They were described as invalid in non-quoted values by RFC 2965, but that specification was utter balderdash. – bobince Oct 24 '15 at 11:49
  • @bobince it looks to me like RFC 2109 described `=` as invalid, too, because the value was a `token` per RFC 2068. – Shannon Feb 09 '16 at 23:50
  • @Shannon would be invalid in `token` yes, but `value` can also be `quoted-string`. For what that standard is worth (nil)! – bobince Feb 11 '16 at 20:06
  • The last version contains =, which would be problematic when reading cookie values. – mjs Aug 06 '17 at 21:11
  • [RFC 2616](https://www.ietf.org/rfc/rfc2616.txt) defines token as `1*<any CHAR except CTLs or separators>` and separators are `(`, `)`, `<`, `>`, `@`, `,`, `;`, `:`, `\`, `"`, `/`, `[`, `]`, `?`, `=`, `{`, `}`, `SP` and `HT`, so cookie names should be alphanums plus ``!#$%&'*+-.^_`|~`` – Gan Quan Aug 05 '19 at 22:47
  • Testing this recently with PHP has given results which are somewhat different from the above. If I send this header: Set-Cookie: testing-!#$%&'()*+-./:<>?@[]^_`{|}~=123; expires=... I get this header back: Cookie: testing-!#$%&'()*+-./:<>?@[]^_`{|}~=123 but the PHP $_COOKIE array has this: [testing-!#$%&'()*_-_/:<>?@] => [123] PHP converts all periods and plus signs in cookie names to underscores, treats square brackets as array declarations, and ignores everything after the closing square bracket. – Niall Jackson Oct 09 '19 at 09:29
  • Really, you can't put whitespace and commas in a cookie value even if it is quoted? So `Cookie: CookieControl="x,z"` is not a strictly legal request header? The real value (https://www.midasgroup.co.uk/cookies.html) is even more complicated. – jamshid Jun 11 '21 at 02:08
29

In ASP.Net you can use System.Web.HttpUtility to safely encode the cookie value before writing to the cookie and convert it back to its original form on reading it out.

// Encode
HttpUtility.UrlEncode(cookieData);

// Decode
HttpUtility.UrlDecode(encodedCookieData);

This will stop ampersands and equals signs from splitting a value into a bunch of name/value pairs as it is written to a cookie.

stephen
  • Just one note: internally, ASP.NET uses hex encoding instead of UrlEncode when storing the authentication cookie. https://referencesource.microsoft.com#System.Web/Security/FormsAuthentication.cs,507f5d4465177372 So there might be some cases where URL encoding won't cut it? – Peter Nov 15 '16 at 10:49
  • This has nothing to do with the question. – Boris Verkhovskiy Mar 25 '22 at 06:14
18

I think it's generally browser specific. To be on the safe side, base64 encode a JSON object, and store everything in that. That way you just have to decode it and parse the JSON. All the characters used in base64 should play fine with most, if not all browsers.
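A minimal sketch of that approach is below. The helper names are hypothetical, and it assumes `btoa`, `atob`, `TextEncoder` and `TextDecoder` are available (modern browsers and recent Node.js). Going through UTF-8 bytes first matters because `btoa` on its own only accepts characters up to U+00FF.

```javascript
// Hypothetical helpers: JSON -> UTF-8 bytes -> base64, and back.
function packCookieData(obj) {
  const bytes = new TextEncoder().encode(JSON.stringify(obj));
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one char per byte, as btoa expects
  }
  return btoa(binary);
}

function unpackCookieData(value) {
  const binary = atob(value);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Example round trip with non-ASCII content.
const packed = packCookieData({ user: "Zoë", tags: ["日本"], visits: 3 });
const unpacked = unpackCookieData(packed);
```

Note that standard base64 output can contain `+`, `/` and `=`. These are within the RFC 6265 value characters, but some server-side stacks treat them specially when reading cookies.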

Jamie Rumbelow
  • This answer seems to be the consistent one across browsers. I realised this after working many hours trying to get a quick solution: I did not get one either. Just do as recommended exactly above to save yourself the hassles. – smile Mar 11 '18 at 18:02
  • Didn't try this, but I read other posts about this saying that base64 encode only works with ascii characters. – user984003 Sep 07 '19 at 15:29
  • These days, instead of base64 encoding JSON in a cookie, you could try using the browser's [`window.localStorage`](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage) or sending the data you need in your request instead of in the cookie. – Boris Verkhovskiy Mar 25 '22 at 06:15
18

Here it is, in as few words as possible. Focus on characters that need no escaping:

For cookies:

abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!#$%&'()*+-./:<>?@[]^_`{|}~

For URLs:

abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.-_~!$&'()*+,;=:@

For cookies and URLs (intersection):

abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!$&'()*+-.:@_~

That's how you answer.

Note that for cookies, the = has been removed because it is usually used to set the cookie value.

For URLs, the = was kept. The intersection is obviously without it.

var chars = "abcdefghijklmnopqrstuvwxyz"; chars += chars.toUpperCase() + "0123456789" + "!$&'()*+-.:@_~";
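The intersection can also be derived rather than typed by hand. The sketch below takes the special characters from the cookie and URL lists above as given (they are this answer's lists, not an authoritative source) and filters one against the other; `isSafeEverywhere` is a hypothetical helper.

```javascript
const ALNUM = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
// Special characters from the two lists above ("=" deliberately left out for cookies).
const COOKIE_SPECIALS = "!#$%&'()*+-./:<>?@[]^_`{|}~";
const URL_SPECIALS = ".-_~!$&'()*+,;=:@";

// Keep only the cookie specials that also appear in the URL list.
const SHARED_SPECIALS = [...COOKIE_SPECIALS]
  .filter((ch) => URL_SPECIALS.includes(ch))
  .join("");

const SAFE_CHARS = new Set(ALNUM + SHARED_SPECIALS);

// True when every character of the string is in the shared set.
function isSafeEverywhere(text) {
  return [...text].every((ch) => SAFE_CHARS.has(ch));
}
```

With these inputs, `SHARED_SPECIALS` comes out as `!$&'()*+-.:@_~`, matching the intersection listed above.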

It turns out that escaping still occurs and unexpected things still happen, especially in a Java cookie environment, where the cookie value is wrapped in double quotes if it contains any of the trailing special characters.

So to be safe, just use A-Za-z0-9. That's what I am going to do.

mjs
  • Safari was my only problem browser; all other browsers worked fine. I had to UrlEncode and UrlDecode my cookie to deal with equals signs (=) and spaces, much like Base64-encoding the cookie. (Only Safari required this; other browsers worked fine with and without the encoded cookie.) – Sql Surfer Apr 25 '18 at 17:06
  • It would be better if you listed the sources that led to your answer! – LHA Jun 07 '18 at 19:33
  • @Loc Over 3 hours of trial and inspecting. – mjs Jun 09 '18 at 01:28
13

The newer RFC 6265, published in April 2011, specifies:

cookie-header = "Cookie:" OWS cookie-string OWS
cookie-string = cookie-pair *( ";" SP cookie-pair )
cookie-pair  = cookie-name "=" cookie-value
cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )

cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
                   ; US-ASCII characters excluding CTLs,
                   ; whitespace DQUOTE, comma, semicolon,
                   ; and backslash

If you compare this with @bobince's answer, you'll see that the newer restrictions are stricter.

Community
gavenkoa
  • Errata in the rail diagram you copied above and the text on section 5.4... The diagram separates with `; OWS`, the text calls for a literal `; ` (with space). – Gabriel Jun 21 '21 at 21:21
8

You cannot put ";" in the value field of a cookie; in most browsers, the value that gets set is the string up to the ";".

hagay onn
4

that's simple:

A <cookie-name> can be any US-ASCII characters except control characters (CTLs), spaces, or tabs. It also must not contain a separator character like the following: ( ) < > @ , ; : \ " / [ ] ? = { }.

A <cookie-value> can optionally be set in double quotes and any US-ASCII characters excluding CTLs, whitespace, double quotes, comma, semicolon, and backslash are allowed. Encoding: Many implementations perform URL encoding on cookie values, however it is not required per the RFC specification. It does help satisfying the requirements about which characters are allowed for <cookie-value> though.

Link: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie#Directives

Community
webolizzer
3

There are 2 versions of the cookie specification:
1. Version 0 cookies, aka Netscape cookies
2. Version 1, aka RFC 2965 cookies

In version 0, the name and value parts of a cookie are sequences of characters excluding the semicolon, comma, equals sign, and whitespace, unless enclosed in double quotes.
Version 1 is a lot more complicated; you can check it here.
In this version, the rules for the name and value parts are almost the same, except that the name cannot start with a $ sign.

Ricardo Souza
Tinku
1

There is another interesting issue with IE and Edge. Cookies that have names with more than 1 period seem to be silently dropped. So this works:

cookie_name_a=valuea

while this will get dropped

cookie.name.a=valuea

Bhargav Rao
Arvoreen
1

One more consideration. I recently implemented a scheme in which some sensitive data posted to a PHP script needed to be converted and returned as an encrypted cookie that used only base64 characters, which I thought were guaranteed "safe". So I dutifully encrypted the data items using RC4, ran the output through base64_encode, and happily returned the cookie to the site. Testing seemed to go well until a base64-encoded string contained a "+" symbol. The string was written to the page cookie with no trouble. Using the browser diagnostics I could also verify that the cookie was written unchanged. Then, when a subsequent page called my PHP and obtained the cookie via the $_COOKIE array, I was stunned to find the string was now missing the "+" sign. Every occurrence of that character was replaced with an ASCII space.

Considering how many similar unresolved complaints I've read describing this scenario since then, often citing numerous references to using base64 to "safely" store arbitrary data in cookies, I thought I'd point out the problem and offer my admittedly kludgy solution.

After you've done whatever encryption you want to do on a piece of data, and then used base64_encode to make it "cookie-safe", run the output string through this...

// From browser to PHP: swap troublesome chars with
// other cookie-safe chars, or vice versa.

function fix64($inp) {
    $out = $inp;
    for ($i = 0; $i < strlen($inp); $i++) {
        $c = $inp[$i];
        switch ($c) {
            case '+':  $c = '*'; break; // definitely won't transfer!
            case '*':  $c = '+'; break;

            case '=':  $c = ':'; break; // = symbol seems like a bad idea
            case ':':  $c = '='; break;

            // In PHP, a bare "continue" inside a switch acts like "break";
            // "continue 2" skips to the next loop iteration.
            default: continue 2;
        }
        $out[$i] = $c;
    }
    return $out;
}

Here I'm simply substituting "+" (and I decided "=" as well) with other "cookie safe" characters, before returning the encoded value to the page, for use as a cookie. Note that the length of the string being processed doesn't change. When the same (or another page on the site) runs my PHP script again, I'll be able to recover this cookie without missing characters. I just have to remember to pass the cookie back through the same fix64() call I created, and from there I can decode it with the usual base64_decode(), followed by whatever other decryption in your scheme.

There may be some setting I could make in PHP that allows base64 strings used in cookies to be transferred back to PHP without corruption. In the meantime, this works. The "+" may be a "legal" cookie value, but if you have any desire to be able to transmit such a string back to PHP (in my case via the $_COOKIE array), I'm suggesting re-processing to remove offending characters, and restore them after recovery. There are plenty of other "cookie safe" characters to choose from.
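A "+" turning into a space is consistent with the cookie value being URL-decoded on the way into $_COOKIE. An alternative to hand-picked substitutions, and not what this answer uses, is the URL-safe base64 alphabet from RFC 4648, which swaps "+" for "-" and "/" for "_" and allows the "=" padding to be dropped. The sketch below shows the conversion in JavaScript; the function names are hypothetical, and the same character swaps can be written in any language.

```javascript
// Convert standard base64 text to the URL-safe alphabet (RFC 4648, section 5)
// and strip the trailing "=" padding.
function toBase64Url(b64) {
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Reverse the swap and restore padding so a standard base64 decoder accepts it.
function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
  const missing = (4 - (b64.length % 4)) % 4;
  return b64 + "=".repeat(missing);
}
```

Unlike a symmetric swap table, the two directions are separate functions here, so the reading side has to call `fromBase64Url` before decoding.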

Randy
0

If you are using the variables later, you'll find that attributes like path will actually let accented characters through, but the result won't match the browser path. For that you need to URI-encode them, for example:

  const encodedPath = encodeURI(myPath);
  // location.hostname (not location.host) so that a port never ends up in the domain attribute
  document.cookie = `use_pwa=true; domain=${location.hostname}; path=${encodedPath};`;

So the "allowed" characters might be more than what's in the spec. But you should stay within the spec and use URI-encoded strings to be safe.

odinho - Velmont
-1

Years ago MSIE 5 or 5.5 (and probably both) had some serious issue with a "-" in the HTML block, if you can believe it. Although it's not directly related, ever since then we've stored an MD5 hash (containing letters and numbers only) in the cookie to look up everything else in a server-side database.

FYA
-1

I ended up using

cookie_value = encodeURIComponent(my_string);

and

my_string = decodeURIComponent(cookie_value);

That seems to work for all kinds of characters. I had weird issues otherwise, even with characters that weren't semicolons or commas.

user984003