223

Are there any forbidden characters in key names, for JavaScript objects or JSON strings? Or characters that need to be escaped?

To be more specific, I'd like to use "$", "-" and space in key names.

Steve Chambers
Christophe
  • I think partially this answer has to do with the way you're encoding. For example, UTF8 has different characters allowed versus ANSI. – invalidsyntax Dec 30 '11 at 04:00
  • 5
    You can use any 'key' you want in JS using the `obj['whatever']` notation. But only regular alphanumeric keys can be used for the `obj.whatever` version. – Marc B Dec 30 '11 at 04:05
  • 4
    @invalidsyntax: JSON is Unicode by definition. Also, ANSI isn't an encoding, it's a character set, so the comparison should be Unicode-vs-ANSI, not UTF-8-vs-ANSI. – Marcelo Cantos Dec 30 '11 at 04:06
  • 1
    Old discussion but, ASCII (what people often refer to by ANSI) is an encoding and on top of that it also defines a character set. – Trinidad Feb 23 '19 at 22:20

5 Answers

230

No. Any valid string is a valid key. It can even have " as long as you escape it:

{"The \"meaning\" of life":42}

There is perhaps a chance you'll encounter difficulties loading such values into some languages, which try to associate keys with object field names. I don't know of any such cases, however.
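As a quick check of the example above, here is a minimal sketch using only the standard `JSON.parse` and `JSON.stringify`. The variable names are made up for illustration.

```javascript
// String.raw keeps the backslashes, so the JSON text is exactly:
// {"The \"meaning\" of life":42}
const jsonText = String.raw`{"The \"meaning\" of life":42}`;
const parsed = JSON.parse(jsonText);

// Once parsed, the key contains plain double quotes.
const key = 'The "meaning" of life';
console.log(parsed[key]); // 42

// Serializing again re-escapes the quotes inside the key.
const roundTripped = JSON.stringify(parsed);
console.log(roundTripped === jsonText); // true
```

Note that such a key can only be read with bracket notation in JavaScript, since it is not a valid identifier.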

Marcelo Cantos
  • 1
    Thx! Any other characters that would need to be escaped? Like : or ; ? – Christophe Dec 30 '11 at 04:18
  • 15
    Not those. Whatever needs escaping in JavaScript generally needs it in JSON. Best to get it from the horse's mouth, though, at json.org. It takes about one minute to read the entire spec end-to-end. – Marcelo Cantos Dec 30 '11 at 04:21
  • 5
    This is not a good answer imho. Which kind of characters need to be escaped? Which characters can be escaped, but don't have to be escaped? – Daniel W. Dec 18 '15 at 12:20
  • 2
    Can anyone clarify if this includes things like the Unicode null character (U+0000, plain "null byte" in UTF-8), etc? Both http://json.org and the linked official/formal ECMA specification PDF seem to imply that yes, those are valid in JSON, even in their literal forms (not just in the `\u four-hex-digits` form). – mtraceur Jun 16 '16 at 13:13
  • 1
    @mtraceur I was wondering the same thing. I tested in Chrome's console and it doesn't like a key with null terminators in it. https://puu.sh/zJMIS/3d15c6d8e5.png It may not be mentioned in the spec, but don't expect parsers to accept it. Best to avoid any ASCII control characters, I think. – Chris Rollins Mar 17 '18 at 20:58
  • @ChrisRollins Yeah, I wouldn't trust anything C-based to handle null bytes properly in arbitrary data. That said, do you know of any other ASCII control characters beyond NUL actually being special for modern software? The null byte has the unfortunate history of being treated as special in almost anything C-based, due to its early design choice as a micro-optimization that made sense in those days, but as far as I know the other ~32 or so ASCII control characters are only special on command-line terminals and the like, for whom interpreting them is part of the intended design. – mtraceur Mar 19 '18 at 18:20
  • @mtraceur Well, I think it just depends on the software. Chrome's JSON parser doesn't like other control chars either. But I don't know more than what I'm observing here. By the way, since I just finished a C class, I'd like to say that there's no reason software compiled with C should inherently have problems with null bytes in binary data. It would if the programmer made a mistake and used the cstring functions (i.e. strcmp, strcpy, strlen, etc.) on binary data, but that's a huge mistake because binary data is never null terminated. Instead the size of the data must be known. – Chris Rollins Apr 02 '18 at 09:10
  • @ChrisRollins You're absolutely right. That said, for the boolean check "if the programmer made a mistake" my branch-prediction is strongly trained to expect a true result, especially with languages like C. Which is why, given no other information about a system other than "it was written in C", I would not trust it to handle null bytes - not because the language makes it impossible to handle them properly, but because most people writing C don't do it. – mtraceur Apr 02 '18 at 20:15
  • @mtraceur Well yeah you shouldn't expect mistake-free code, but in this particular case a program that uses cstring functions to process input data would probably just be broken on 100% of input data in the first place. – Chris Rollins Apr 02 '18 at 21:57
  • @ChrisRollins Has that been your experience? That code mistakenly written using C's string functions fails for most/arbitrary inputs? The C code design error I've most frequently seen relating to C code not handling null bytes properly has usually been the sort where arbitrary data inputs work fine unless they contain null bytes. We've probably been exposed to different C software. Anyway, it sounds like we agree: I don't expect mistake-free code, and using C string functions on binary data is a common enough C mistake that I automatically suspect C code of making it until I've checked it. – mtraceur Apr 02 '18 at 22:49
  • I think I see the problem in my initial reply: "wouldn't trust" can be read as "could never be trusted" - I meant "trust" in the sense of feeling comfortable in expecting an outcome *without* being able to verify that the outcome is assured. I can and do trust C code that I've written/audited/tested for handling null bytes to continue to do so. I just meant that given unknown C code, I'm more wary of it mishandling null bytes than unknown code written in languages that don't treat null bytes as special in their language/library design. – mtraceur Apr 02 '18 at 23:08
  • While the json spec allows any char, I would suggest only to use leading _ or [A-Z], and then after the first char any combination of [A-Z], [a-z], [0-9] and _. – jjxtra Jul 27 '18 at 18:30
  • Chinese or Arabic characters are fine for key? – c-an May 12 '20 at 09:28
  • I don't even know when I +1'ed this, but years later, I still approve . – iMe Dec 09 '20 at 13:09
  • I found an interesting example of a forbidden key string on Oracle's Netsuite (SuiteScript, which is I believe JavaScript running on some version of the Rhino Engine, or some proprietary derivative thereof). I tried making a key "" and try as I might, it would never come back with hasOwnProperty and it would never look up successfully. In fact at one point I had printed the object and it showed two identical keys of "". When I tried to just make a simple list and use indexOf it didn't find "" either. – Darren Ringer Mar 18 '22 at 20:46
74

The following characters must be escaped in JSON data to avoid any problems:

  • " (double quote)
  • \ (backslash)
  • all control characters like \n, \t

JSON Parser can help you to deal with JSON.
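In JavaScript, the built-in `JSON.stringify` applies these escapes for you, in keys as well as values. A minimal sketch, with a made-up key chosen to contain each kind of character from the list:

```javascript
// A key containing a double quote, a backslash, a newline and a tab.
const awkwardKey = 'a"b\\c\nd\te';

// JSON.stringify escapes all four characters in the output text.
const text = JSON.stringify({ [awkwardKey]: true });
console.log(text); // {"a\"b\\c\nd\te":true}

// Parsing the text gives back the original key unchanged.
const [recoveredKey] = Object.keys(JSON.parse(text));
console.log(recoveredKey === awkwardKey); // true
```

Building JSON by string concatenation skips this step, which is usually where unescaped characters slip in.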

Arun Rana
  • 7
    Hi Arun, single quotes do not need to be escaped. In fact, escaping them will cause strict JSON parsers to throw an exception. Refer to the string section of http://www.json.org. Of course, however, you will need to escape them when inside a JSON string (but not the JSON itself). – Alex KeySmith Jan 04 '14 at 17:59
  • 7
    @AlexKey you're completely right! Arun, you can check this on [jsonlint.com](http://jsonlint.com/) by testing the JSON `{ "singlequotetest": "something here isn\'t right"}` versus `{ "singlequotetest": "Fixing here what wasn't right"}` – Adriano Sep 23 '14 at 10:51
  • @Arun Rana - no worries. – Alex KeySmith Sep 23 '14 at 12:55
  • 6
    { "*~@#$%^&*()_+=>/": "is a valid json" } – Abhi Oct 07 '14 at 04:24
  • 71
    `{"": "not nice, but still valid json"}` – Marcelo Cantos Dec 31 '15 at 02:02
  • You only need to escape special ASCII characters if you want to present it "as is". We escape single and double quotes only to tell the parser/compiler that it is not the enclosing pair. – mr5 Jun 30 '17 at 04:04
25

It is worth mentioning that while starting the keys with numbers is valid, it could cause some unintended issues.

Example:

var testObject = {
    "1tile": "test value"
};
// console.log(testObject.1tile); // fails if uncommented: SyntaxError (an identifier cannot start with a digit)
console.log(testObject["1tile"]); // workaround: bracket notation
karns
  • 6
    I really hope that, in this 2017/18 age of Microsoft, they are regretful of all the pain that they have inflicted. – monsto Dec 08 '17 at 21:47
  • 1
    Look at their metrics ID parameters: https://dev.applicationinsights.io/apiexplorer/metrics?appId=DEMO_APP&apiKey=DEMO_KEY&metricId=requests%2Fcount&timespan=PT1H ---15 or 20 of their fields have multiple forward slashes in their json field names. While Karns solution works for a specific field, I can't seem to get it to work for a sub-field of 1tile. E.g., a subsequent dot returns undefined for me. – Jon Luzader Dec 22 '17 at 03:21
  • This should be the best answer – Joe Elia May 08 '20 at 21:12
  • @JonLuzader Solution: `testObject["1tile"]["sub-field"]` – dolmen Mar 31 '22 at 07:11
  • 2
    −1. This is not an answer to the question. At best, it should be a comment. The syntax of JavaScript identifiers is a whole other topic. – Philippe-André Lorin Jul 01 '22 at 14:59
  • 1
    @Philippe-AndréLorin OP's question is tagged `Json` *and* `Javascript` – c z Jul 10 '23 at 11:21
15

Unicode code points U+D800 to U+DFFF must be avoided: they are invalid in Unicode because they are reserved for UTF-16 surrogate pairs. Some JSON encoders/decoders will replace them with U+FFFD. See for example how the Go language and its JSON library deal with them.

So avoid lone escapes in the range "\uD800" to "\uDFFF" (that is, ones that are not part of a valid surrogate pair).
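If keys come from untrusted input, one way to catch this in JavaScript is to test for unpaired surrogates before serializing. The helper below is a hypothetical sketch (the name `hasLoneSurrogate` is not from any library); it works on UTF-16 code units with a plain regular expression.

```javascript
// Matches a high surrogate not followed by a low surrogate,
// or a low surrogate not preceded by a high surrogate.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/;

// Hypothetical helper: true if the string contains an unpaired surrogate.
function hasLoneSurrogate(str) {
  return LONE_SURROGATE.test(str);
}

console.log(hasLoneSurrogate("\uD800"));       // true: lone high surrogate
console.log(hasLoneSurrogate("a\uDC00"));      // true: lone low surrogate
console.log(hasLoneSurrogate("\uD83D\uDE00")); // false: a valid pair (U+1F600)
console.log(hasLoneSurrogate("plain key"));    // false
```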

dolmen
  • For Go specifically, the comma character is also disallowed because of the language's struct tag design. There was an issue about that: https://github.com/golang/go/issues/15000 Sadly it was closed without a fix. – Alexander I.Grafov Apr 26 '23 at 11:57
1

Both JSON and JavaScript allow arbitrary strings as object property names, according to their own language definitions. The most recent JSON language definition documents are RFC 8259 for JSON and ECMA-262 for JavaScript.

The characters needing escaping in keys are the characters that are required to be escaped in any string in the language. These are also given in the language definition documents. For JSON, the characters required to be escaped are the quotation mark, backslash, and control characters. For JavaScript, the characters requiring escaping are the quote character matching the enclosing quotes (single or double), backslash, carriage return, and line feed.

For your specific example, all of "$", "-" and space are allowed as keys of both JSON & JavaScript objects with no escaping required.
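A short sketch of those three characters used as keys in a JavaScript object literal and in the JSON produced from it. The object and its values are invented for illustration.

```javascript
const labels = {
  $: "dollar",        // "$" is a valid identifier, so quotes are optional
  "-": "dash",        // not an identifier, so the key must be quoted
  " ": "space",
  "unit price-$": 5,  // all three characters in one key
};

console.log(labels.$);               // dot notation works for "$"
console.log(labels["-"]);            // the others need bracket notation
console.log(labels[" "]);
console.log(labels["unit price-$"]);

// None of these characters is escaped in the JSON output.
const asJson = JSON.stringify(labels);
console.log(asJson);
```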

Object property names

Per RFC 8259, there are no limits imposed on the value of strings used as JSON object names:

An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The names within an object SHOULD be unique.

   object = begin-object [ member *( value-separator member ) ]
            end-object

   member = string name-separator value

Regarding JavaScript, ECMA-262 explicitly states that all strings are valid object property names:

A property key value is either an ECMAScript String value or a Symbol value. All String and Symbol values, including the empty String, are valid as property keys. A property name is a property key that is a String value.

String escaping

RFC 8259 lists the characters that must be escaped in JSON:

All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

ECMA-262 lists the characters that must be escaped in JavaScript:

A string literal is 0 or more Unicode code points enclosed in single or double quotes. […] All code points may appear literally in a string literal except for the closing quote code points, U+005C (REVERSE SOLIDUS), U+000D (CARRIAGE RETURN), and U+000A (LINE FEED).

The closing quote code point would be " if the string is enclosed in double quotes, and ' if it is enclosed in single quotes.
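The two rules differ for control characters, which the following sketch tries to show with a tab and a line feed. It uses `new Function` only as a way to compile a piece of JavaScript source text at run time; the variable names are assumptions made for this example.

```javascript
const TAB = "\t";

// JSON: a raw (unescaped) tab inside a string is rejected...
let jsonRejectedRawTab = false;
try {
  JSON.parse('{"a' + TAB + 'b": 1}');
} catch (err) {
  jsonRejectedRawTab = err instanceof SyntaxError;
}
console.log(jsonRejectedRawTab); // true

// ...while the escaped form \t is accepted and yields a real tab in the key.
const viaEscape = JSON.parse('{"a\\tb": 1}');
console.log(Object.keys(viaEscape)[0] === "a" + TAB + "b"); // true

// JavaScript: a raw tab inside a string literal is allowed.
const jsLiteral = new Function('return "a' + TAB + 'b";')();
console.log(jsLiteral.length); // 3

// JavaScript: a raw line feed inside a quoted string literal is not.
let jsRejectedRawLineFeed = false;
try {
  new Function('return "a\nb";');
} catch (err) {
  jsRejectedRawLineFeed = err instanceof SyntaxError;
}
console.log(jsRejectedRawLineFeed); // true
```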

M. Justin