
Are there any Python JSON parsers that will cope with trailing commas?

(I'm consuming the "JSON" from an external source and have no control over it.)

Acorn
  • But you do have control over what you do between 1) retrieving the JSON and 2) feeding it to the JSON parser. – Simeon Visser Jun 15 '12 at 14:47
  • JSON does not contain trailing commas. – ThiefMaster Jun 15 '12 at 14:47
  • Right, one option would be to try and clean the data before parsing it. I only wondered if there might be a more lenient JSON parser as supposedly some browsers can cope with trailing commas in JSON. – Acorn Jun 15 '12 at 14:49
  • Report the malformed JSON to its provider. It's not useful to anyone if they're not outputting it correctly. – Eric Jun 15 '12 at 15:06
  • @Acorn: Just because some browsers accept it doesn't mean your code should, too -- that's how non-conforming software gets propagated -- so do what Eric said. – martineau Jun 15 '12 at 15:08
  • Possible duplicate of [Can json.loads ignore trailing commas?](https://stackoverflow.com/questions/23705304/can-json-loads-ignore-trailing-commas) – Steve Lorimer Jul 05 '17 at 17:05

2 Answers


Grab PyYAML. JSON is a subset of YAML, so a YAML parser should parse most JSON. YAML's grammar allows trailing commas in sequences.
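
A minimal sketch of this approach, assuming PyYAML is installed (it is a third-party package, not part of the standard library). The helper name `loads_lenient` is made up for illustration. It uses `yaml.safe_load`, which only builds plain Python objects and is the safer choice for data from an external source.

```python
try:
    import yaml  # PyYAML, a third-party package (pip install pyyaml)
except ImportError:  # keep this sketch importable when PyYAML is missing
    yaml = None


def loads_lenient(text):
    """Parse JSON-like text with PyYAML, which tolerates trailing commas.

    Hypothetical helper for illustration; not part of PyYAML itself.
    """
    if yaml is None:
        raise RuntimeError("PyYAML is required for loads_lenient()")
    # safe_load constructs only basic types (dict, list, str, int, ...).
    return yaml.safe_load(text)


if yaml is not None:
    # Trailing commas in both the list and the object are accepted.
    data = loads_lenient('{"name": "example", "tags": ["a", "b",], "count": 2,}')
    print(data)
```

Keep in mind that the input is then interpreted by YAML's rules rather than JSON's, so it is worth testing against real samples from the source before relying on it.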

kenm
  • I did see this suggested somewhere as a solution, but also saw mention of spaces after colons being mandatory, which would make it a lot less useful. After testing it out, it seems that the space isn't mandatory as long as the key is a quoted string. If the JSON has a key that's a number and no space after the colon, it won't parse. – Acorn Jun 15 '12 at 16:09
  • I'm assuming it's pretty rare for a number to be a key, but great job finding that. Maybe mention this functionality to PyYAML maintainers and explain the use-case to them as well. – austinheiman Jan 24 '15 at 03:42
  • That is an awesome little hack, especially for scraping from sources where you cannot control the output (i.e., web scraping). – Sam Texas Oct 21 '16 at 13:14
  • "YAML's grammar allows trailing commas in sequences." source? Which part of the spec? – MarcH Nov 30 '22 at 18:58
  • @MarcH per https://yaml.org/spec/1.2.2/#74-flow-collection-styles (If I understand it correctly) "Flow collection entries are terminated by the “,” indicator. The final “,” may be omitted." – kenm Nov 30 '22 at 22:11

json-cfg appears to support an extension of JSON that allows trailing commas. It also allows comments and unquoted keys.

>>> import jsoncfg
>>> jsoncfg.loads('{"key1": "{my tricky value,}", }')
OrderedDict([('key1', '{my tricky value,}')])

The extra options (comments and unquoted keys) can be disabled with the JSONParserParams class:

jsoncfg.loads('{"key1": "{my tricky value,}" /*nope*/}', jsoncfg.JSONParserParams(allow_comments=False, allow_unquoted_keys=False))

This avoids the concerns that come with accepting the entire YAML syntax. Furthermore, unlike regex-based preprocessing and overly simple modules such as jsoncomment, it implements a full-blown tokenizer and parser (as befits a non-regular language), so it avoids nesting problems such as a comma that precedes a ] or } inside a string.
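
To illustrate the string problem mentioned above, here is a rough standard-library sketch. It is not how json-cfg works internally; `naive_strip` and `strip_trailing_commas` are hypothetical helpers written for this example. The first uses a regex and corrupts string values; the second scans the text while tracking whether it is inside a string literal.

```python
import json
import re


def naive_strip(text):
    """Regex-only cleanup: also rewrites commas that sit inside strings."""
    return re.sub(r",\s*([\]}])", r"\1", text)


def strip_trailing_commas(text):
    """Drop commas that directly precede ] or }, leaving string contents alone."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False      # this character was escaped; skip checks
            elif ch == "\\":
                escaped = True       # next character is escaped
            elif ch == '"':
                in_string = False    # closing quote
            continue
        if ch == '"':
            in_string = True
        elif ch in "]}":
            # Look back past whitespace for a comma outside any string.
            i = len(out) - 1
            while i >= 0 and out[i] in " \t\r\n":
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
        out.append(ch)
    return "".join(out)


raw = '{"key1": "{my tricky value,}", }'
print(json.loads(naive_strip(raw)))            # string value loses its comma
print(json.loads(strip_trailing_commas(raw)))  # string value is preserved
```

This sketch only handles trailing commas. It does not deal with comments or unquoted keys, and it would also accept odd input such as `[,]`, so a real parser remains the more robust option.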

Whether this library is still maintained or not is an open question. It could definitely use a bit more documentation.

jpmc26