61

I have a json file, such as the following:

    { 
       "author":"John",
       "desc": "If it is important to decode all valid JSON correctly \ 
and  speed isn't as important, you can use the built-in json module,   \
 orsimplejson.  They are basically the same but sometimes simplej \
further along than the version of it that is included with \
distribution."
       //"birthday": "nothing" //I comment this line
    }

This file is generated automatically by another program. How do I parse it with Python?

MattDMo
BollMose

12 Answers

28

I recommend everyone switch to a JSON5 library instead. JSON5 is JSON with JavaScript features/support. It's the most popular JSON language extension in the world. It has comments, support for trailing commas in objects/arrays, support for single-quoted keys/strings, support for unquoted object keys, etc. And there are proper parser libraries with deep test suites and everything working perfectly.

There are two different, high-quality Python implementations:

Here's the JSON5 spec: https://json5.org/

MattDMo
Mitch McMabers
27

jsoncomment is good, but inline comments are not supported.

Check out jstyleson, which supports:

  • inline comment
  • single-line comment
  • multi-line comment
  • trailing comma.

Comments are NOT preserved. jstyleson first removes all comments and trailing commas, then uses the standard json module. It seems like function arguments are forwarded and work as expected. It also exposes dispose to return the cleaned string contents without parsing.

Example

Install

pip install jstyleson

Usage

import jstyleson
result_dict = jstyleson.loads(invalid_json_str) # OK
jstyleson.dumps(result_dict)
plswork04
Jackson Lin
    It should be clearly noted that you are the author of `jstyleson`. I think this post is ok, as it is a way of solving the OP's problem, but self-advertising is generally frowned upon unless explicitly called out. – MattDMo Jun 11 '21 at 13:40
7

I have not personally used it, but the jsoncomment Python package supports parsing a JSON file with comments.

You use it in place of the JSON parser as follows:

import json
from jsoncomment import JsonComment

parser = JsonComment(json)
parsed_object = parser.loads(jsonString)
studgeek
  • 1
    This package strips comments only at the beginning of line. So you are unable to parse `[1,2,3,/* a comment */ 10]`. – Sergei May 26 '17 at 12:24
  • JsonComment removes trailing commas via simple replacement (so it removes a string containing ,] or ,}). Additionally it doesn't remove trailing commas if they have a space after them. – Zezombye Feb 07 '20 at 14:15
6

Improving a previous answer to provide correct line number support in case of removed lines:

import json
from typing import Any

class JSONWithCommentsDecoder(json.JSONDecoder):
    def __init__(self, **kw):
        super().__init__(**kw)

    def decode(self, s: str) -> Any:
        s = '\n'.join(l if not l.lstrip().startswith('//') else '' for l in s.split('\n'))
        return super().decode(s)

your_obj = json.load(f, cls=JSONWithCommentsDecoder)

This implementation slightly improves the previous answer by replacing each comment line with an empty line rather than removing it completely, because removing lines shifts the line numbers reported in parse errors.
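
To make the line-count point concrete, here is a minimal sketch using only the standard library. The sample text and the name `error_line` are made up for illustration; the decoder is a condensed copy of the one above.

```python
import json
from typing import Any

class JSONWithCommentsDecoder(json.JSONDecoder):
    def decode(self, s: str) -> Any:
        # Blank out comment lines instead of dropping them, so line numbers stay put
        s = '\n'.join('' if l.lstrip().startswith('//') else l for l in s.split('\n'))
        return super().decode(s)

# Illustrative input: a comment on line 2 and a syntax error on line 4
text = '''{
    // a comment line
    "a": 1,
    "b": oops
}'''

error_line = None
try:
    json.loads(text, cls=JSONWithCommentsDecoder)
except json.JSONDecodeError as err:
    error_line = err.lineno  # line number as seen in the original file

print(error_line)
```

Because the comment line is kept as an empty line, the error is reported on line 4, which matches the file on disk. If the comment line were dropped instead, the same error would be reported on line 3.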

Louis Caron
5

I cannot imagine that a JSON file "auto created by other program" would contain comments. The JSON spec defines no comments at all, and that is by design, so no JSON library would output a JSON file with comments.

Those comments are usually added later, by a human. No exception in this case. The OP mentioned that in his post: //"birthday": "nothing" //I comment this line.

So the real question should be: how do I properly comment out some content in a JSON file while maintaining its compliance with the spec, and hence its compatibility with other JSON libraries?

And the answer is, rename your field to another name. Example:

{
    "foo": "content for foo",
    "bar": "content for bar"
}

can be changed into:

{
    "foo": "content for foo",
    "this_is_bar_but_been_commented_out": "content for bar"
}

This will work just fine most of the time because the consumer will very likely ignore unexpected fields (but not always, it depends on your json file consumer's implementation. So YMMV.)
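
As a rough illustration of why the rename usually works, here is a sketch of a hypothetical consumer that only looks up the keys it knows. The default value and variable names are assumptions made for the example.

```python
import json

document = '''{
    "foo": "content for foo",
    "this_is_bar_but_been_commented_out": "content for bar"
}'''

config = json.loads(document)  # still valid JSON, so any parser accepts it

# Hypothetical consumer: reads known keys and ignores everything else
foo = config.get("foo")
bar = config.get("bar", "default for bar")  # "bar" is absent, so the default is used

print(foo, "|", bar)
```

A consumer that validates the document against a strict schema, or that iterates over every key, would notice the renamed field, which is the YMMV case mentioned above.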

UPDATE: Apparently some reader was unhappy because this answer does not give the "solution" they expect. Well, in fact, I did give a working solution, by implicitly linking to the JSON designer's quote:

Douglas Crockford Public Apr 30, 2012 Comments in JSON

I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't.

Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.

So, yeah, go ahead and use JSMin. Just keep in mind that when you are heading towards "using comments in JSON", that is conceptually uncharted territory. There is no guarantee that whatever tools you choose would handle: inline [1,2,3,/* a comment */ 10], Python style [1, 2, 3] # a comment (which is a comment in Python but not in Javascript), INI style [1, 2, 3] ; a comment, ..., you get the idea.

I would still suggest NOT adding noncompliant comments to JSON in the first place.

Xan
RayLuo
  • 4
    `tsc --init` (typescript) generates a `tsconfig.json` with comments I believe. – John May 07 '17 at 17:27
  • @John True, but that is exactly the reason why its output is NOT strictly json. You would `needs to pipe the file through JSMin before parsing it`. (Quoted from [here](https://github.com/Microsoft/TypeScript/issues/3079#issuecomment-103590507)). – RayLuo May 08 '17 at 06:45
  • 1
    phpmyadmin JSON exporter adds comments in both `/* */` and `//` forms. – Lithy May 22 '17 at 22:58
  • @Lithy phpmyadmin JSON exporter MAY add comments in both `/* */` and `//` forms AT ONE POINT, but it looks like that is considered a bug and [they changed that behavior finally](https://github.com/phpmyadmin/phpmyadmin/issues/12307). More generally speaking, there might be some programs that generate non-standard json output, but then need extra post-process to convert the output into valid json (or in this case, they need to fix the program later). These programs are NOT qualified as an evidence to support "json can contain comments". Now, may I have your upvote on my answer? ;-) – RayLuo May 23 '17 at 19:24
  • @RayLuo totally agree that they are no evidence! Didn't know for phpmyadmin, I should update. Also, to add comments within JSON I usually use the empty key: `{"": "Some comment"}` – Lithy May 27 '17 at 12:55
  • @z33k I gave my answer to "how to parse json file with comments", it was just that the answer happens to be essentially "you don't". And I backed it up with explanation and workaround which the OP and commenters above happened to agree with. And that famous guy I quoted, also happens to be "universally agreed upon" as the [JSON inventor](https://en.wikipedia.org/wiki/JSON#History). You do not have to like all those, as much as I don't have to customize an answer for any particular taste. As the saying goes, "I disapprove of what you say, but I will defend to the death your right to say it". :) – RayLuo Jun 12 '19 at 23:46
  • 3
    @RayLuo: I don't want this comments section to morph into useless banter, so: 1) I added [my own answer clarifying what you chose not to](https://stackoverflow.com/a/56574294/4465708) and 2) as to the "universally agreed upon" let me just point you to these little known code editors: Sublime Text, Atom, VS Code (all of them use JSON for configuration) and let the matter rest at that – z33k Jun 13 '19 at 06:27
  • 11
    I really cannot abide the mindset that features should be removed because they *might* be abused. Thanks to this we now have a plethora of competing JSON alternatives, because plain JSON does not support a common and reasonable use case. Shelling out to pre-process a configuration file, or having to "build" your configuration does not strike me as a sensible approach, it just increases impedance. It makes simple things hard, which is the opposite of what we should be trying to achieve. – cdyson37 Jul 31 '19 at 13:24
  • @cdyson37, your constructive criticism is for the original decision maker Mr. Crockford. I can not speak for him but, based on the quote from him, he did not remove that feature because people *might* abuse it, he actually *saw* people doing that. And, if the price would arguably be "destroyed interoperability", I would rather pay it to have a less-flexible-but-universal json, than to have *many* versatile-but-less-popular alternatives. I certainly wish json would support comment, but it doesn't, [and the rest is history](https://trends.google.com/trends/explore?date=all&q=json,yaml,xml). – RayLuo Jul 31 '19 at 19:20
  • 9
    True. It should be noted that the addition of comments to HTML didn't stop interoperability there. You could also sneak hints to parsers with trailing whitespace, but that's not disallowed. Whitespace is flexible as a concession to human authors. Personally I think JSON falls between two stools: it's sort of a wire format (no comments allowed) but designed for humans to edit (whitespace flexible). I do hold out hope that one day there will be an agreement to allow comments, but then it would take years for fussy tools and libraries to catch up. – cdyson37 Aug 01 '19 at 10:39
  • 1
    thanks for your explanations. For other readers, also check the most popular discussion about Json comments on SO : https://stackoverflow.com/questions/244777/can-comments-be-used-in-json – R. Du Nov 19 '20 at 17:55
5

For the [95% of] cases when you just need simple leading // line comments with a simple way to handle them:

import json
from typing import Any

class JSONWithCommentsDecoder(json.JSONDecoder):
    def __init__(self, **kw):
        super().__init__(**kw)

    def decode(self, s: str) -> Any:
        s = '\n'.join(l for l in s.split('\n') if not l.lstrip(' ').startswith('//'))
        return super().decode(s)

your_obj = json.load(f, cls=JSONWithCommentsDecoder)
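
For reference, a self-contained sketch of how this decoder behaves, using `io.StringIO` as a stand-in for an opened file. The sample contents and the `trailing_ok` flag are illustrative assumptions, not part of the answer above.

```python
import io
import json
from typing import Any

class JSONWithCommentsDecoder(json.JSONDecoder):
    def decode(self, s: str) -> Any:
        # Drop every line whose first non-space characters are //
        s = '\n'.join(l for l in s.split('\n') if not l.lstrip(' ').startswith('//'))
        return super().decode(s)

# Stand-in for a file opened with open(...)
f = io.StringIO('''{
    "author": "John",
    //"birthday": "nothing" //I comment this line
    "desc": "short text"
}''')
your_obj = json.load(f, cls=JSONWithCommentsDecoder)
print(your_obj)

# A comment that trails a value is not stripped, so parsing still fails
try:
    json.loads('{"a": 1 // trailing\n}', cls=JSONWithCommentsDecoder)
    trailing_ok = True
except json.JSONDecodeError:
    trailing_ok = False
print(trailing_ok)
```

The second half shows the trade-off of this approach: only whole-line comments are handled, which is the "95%" case the answer targets.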

nivedano
4

How about commentjson?

http://commentjson.readthedocs.io/en/latest/

This can parse something like below.

{
    "name": "Vaidik Kapoor", # Person's name
    "location": "Delhi, India", // Person's location

    # Section contains info about
    // person's appearance
    "appearance": {
        "hair_color": "black",
        "eyes_color": "black",
        "height": "6"
    }
}

Like Elasticsearch, some products' REST APIs do not accept a comment field. Therefore, I think comments inside JSON are necessary on the client side, for example to maintain a JSON template.


EDITED

jsmin seems to be more common.

https://pypi.python.org/pypi/jsmin

tabata
4

in short: use jsmin

pip install jsmin

import json
from jsmin import jsmin

with open('parameters.jsonc') as js_file:
    minified = jsmin(js_file.read())
parameters = json.loads(minified)
Pablo
2

If, like me, you prefer to avoid external libraries, this function I wrote reads a JSON file and returns its contents with "//" and "/* */" type comments removed:

def GetJsonFromFile(filePath):
    contents = ""
    # The with-block closes the file even if reading fails
    with open(filePath) as fh:
        for line in fh:
            # Drop everything after the first // on the line
            cleanedLine = line.split("//", 1)[0]
            # Restore the newline that was cut off together with the comment
            if len(cleanedLine) > 0 and line.endswith("\n") and "\n" not in cleanedLine:
                cleanedLine += "\n"
            contents += cleanedLine
    # Remove /* */ block comments, which may span several lines
    while "/*" in contents:
        preComment, postComment = contents.split("/*", 1)
        contents = preComment + postComment.split("*/", 1)[1]
    return contents

Limitations: As David F. brought up in the comments, this will break beautifully (ie: horribly) with // and /* inside string literals. Would need to write some code around it if you want to support //, /*, */ within your json string contents.
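
One possible way to write that extra code is a small character scanner that tracks whether it is inside a string literal. This is only a sketch under that assumption: the function name `strip_json_comments` and the sample text are made up here, and it does not handle trailing commas or `#` comments.

```python
import json

def strip_json_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim so \" does not end the string
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip to the end of the line, keep the newline itself
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Block comment: skip past */ but keep its newlines for line numbers
            end = text.find("*/", i + 2)
            stop = n if end == -1 else end + 2
            out.append("\n" * text.count("\n", i, stop))
            i = stop
        else:
            out.append(ch)
            i += 1
    return "".join(out)

sample = '''{
    "url": "http://example.com/a/*b*/c", // trailing comment
    /* block
       comment */
    "values": [1, 2, /* inline */ 3]
}'''
data = json.loads(strip_json_comments(sample))
print(data)
```

The `//` and `/*` inside the URL string survive because the scanner only looks for comment markers while it is outside a string.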

deleb
  • 2
    Note that this implementation will incorrectly identify "//" and "/*" inside string literals as comment start markers and will give strange results in that scenario. – David Foster Nov 20 '20 at 17:49
  • Indeed! Thanks for bringing that up. – deleb Nov 24 '20 at 20:58
1

C-style comments are officially part of the JSON5 specification.

❗️Important: Before you go any further please note that JSON5 and JSON are two different formats although compatible.

From json5.org:

JSON5 is an extension to the popular JSON file format that aims to be easier to write and maintain by hand (e.g. for config files). It is not intended to be used for machine-to-machine communication. (Keep using JSON or other file formats for that. )


  1. Install json5 with:
pip3 install json5
  2. Use json5 instead of json:
import json5

print(json5.loads("""{
 "author": "John",
 "desc": "If it is import..",
 // "birthday": "nothing"
 }"""))
### OUTPUT: {'author': 'John', 'desc': 'If it is import..'}
ccpizza
  • though `json5` is a nice library, it is incredibly slow if compared to `json`. I would use `json5` purely for small configs loading, and leave data processing to `json` – AntonK Nov 17 '22 at 22:13
  • the `json5` spec is a superset of `json` so it has an extended grammar which can't be expected to be as performant and optimized as regular `json` so definitely must be avoided for something critical for production – ccpizza Nov 18 '22 at 19:56
0

You might look at JSON5 if you don't care about strict by-the-book JSON formatting and just want something that allows comments in JSON. For example, this library will let you parse JSON5: https://pypi.org/project/json5/

0

Here's a small standalone wrapper:

#!/usr/bin/env python3
import json
import re

def json_load_nocomments( filename_or_fp, comment = "//|#", **jsonloadskw ) -> "json dict":
    """ load json, skipping comment lines starting // or #
        or white space //, or white space #
    """
    # filename_or_fp -- lines -- filter out comments -- bigstring -- json.loads

    if hasattr( filename_or_fp, "readlines" ):  # open() or file-like
        lines = filename_or_fp.readlines()
    else:
        with open( filename_or_fp ) as fp:
            lines = fp.readlines()  # with \n
    iscomment = re.compile( r"\s*(" + comment + ")" ).match
    notcomment = lambda line: not iscomment( line )  # ifilterfalse
    bigstring = "".join( filter( notcomment, lines ))
        # json.load( fp ) does loads( fp.read() ), the whole file in memory

    return json.loads( bigstring, **jsonloadskw )


if __name__ == "__main__":  # sanity test
    import sys
    for jsonfile in sys.argv[1:] or ["test.json"]:
        print( "\n-- " + jsonfile )
        jsondict = json_load_nocomments( jsonfile )
            # first few keys, val type --
        for key, val in list( jsondict.items() )[:10]:
            n = (len(val) if isinstance( val, (dict, list, str) )
                else "" )
            print( "%-10s : %s %s" % (
                    key, type(val).__name__, n ))

denis