I have a Python object method that uses the json module to write a collection of ordered dictionary objects as JSON strings to a file with UTF-8 encoding. Here is the code:

 def write_small_groups_JSON_file( self, file_dir, file_name ):

     with io.open( file_dir + file_name, 'w', encoding='utf-8' ) as file:
         JSON_obj = ''
         i = 0
         for ID in self.small_groups.keys():
           JSON_obj = json.dumps( self.small_groups[ID]._asdict(), ensure_ascii=False )
           file.write( unicode( JSON_obj ) )
           i += 1

     print str( i ) + ' small group JSON objects successfully written to the file ' + \
           file.name + '.'

Here small_groups is an ordered dictionary of SmallGroup named tuples, keyed by an ID tuple of the form (N,M) where N and M are positive integers. For any ID in small_groups.keys(), small_groups[ID]._asdict() is an ordered dictionary. Here is an example for ID=(36,1):

OrderedDict([('desc', ''), ('order', 36), ('GAP_ID', (36, 1)),
('GAP_desc', ''), ('GAP_pickled_ID', ''), ('is_abelian', None),
('is_cyclic', None), ('is_simple', None), ('is_nilpotent', None),
('nilpotency_class', None), ('is_solvable', None), ('derived_length',
None), ('is_pgroup', None), ('generators', None), ('char_degrees',
'[[1,4],[2,8]]'), ('D3', 68), ('MBS_STPP_param_type', (36, 1, 1)),
('Beta0', 36)])

The JSON output in the file looks squashed, with no commas between the objects and no opening and closing braces around the whole collection. It looks like this:

{ object1 }{ object 2 }.....
...........{ object n }.....

Is this a valid JSON format, or do I have to separate the objects using commas?

Also, if I have a schema somewhere is there a way of validating the output against it?

2 Answers

No, that is not valid JSON; you wrote separate JSON objects (each valid on its own) to a file without any delimiters.
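
A quick way to confirm this is to hand such output to `json.loads()`. The following is a minimal sketch with made-up sample objects and a hypothetical helper name (`is_valid_json`); the parser accepts the first object and then rejects the rest as extra data:

```python
import json

# Two individually valid JSON objects written back to back, as in the question.
concatenated = '{"order": 36}{"order": 37}'

# The same two objects wrapped in a JSON list with a comma between them.
as_list = '[{"order": 36},{"order": 37}]'


def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError subclasses ValueError on Python 3
        return False
    return True


print(is_valid_json(concatenated))  # False
print(is_valid_json(as_list))       # True
```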

You'd have to write your own delimiters, or produce one long list first and then write that out.

Creating a list is easy enough:

objects = []
for small_group in self.small_groups.values():
    objects.append(small_group._asdict())
with io.open( file_dir + file_name, 'w', encoding='utf-8' ) as file:
    json_object = json.dumps(objects, ensure_ascii=False)
    file.write(unicode(json_object))

print '{} small group JSON objects successfully written to the file {}.'.format(
    len(objects), file.name)

This writes out JSON once, producing a JSON list containing multiple objects.
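
To check that a file written this way can be read back, here is a self-contained round-trip sketch. It assumes Python 3 (where `json.dumps()` already returns text, so no `unicode()` call is needed), and the cut-down `SmallGroup` tuple, its sample values, and the temporary file name are hypothetical stand-ins for the real data:

```python
import io
import json
import os
import tempfile
from collections import OrderedDict, namedtuple

# Hypothetical, cut-down stand-in for the SmallGroup named tuple in the question.
SmallGroup = namedtuple('SmallGroup', ['desc', 'order', 'GAP_ID'])
small_groups = OrderedDict([
    ((36, 1), SmallGroup('', 36, (36, 1))),
    ((36, 2), SmallGroup('', 36, (36, 2))),
])

# Build one Python list and serialize it in a single json.dumps() call.
objects = [group._asdict() for group in small_groups.values()]

path = os.path.join(tempfile.mkdtemp(), 'small_groups.json')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(json.dumps(objects, ensure_ascii=False))

# Read the list back, asking the decoder to build OrderedDict objects.
with io.open(path, encoding='utf-8') as f:
    loaded = json.load(f, object_pairs_hook=OrderedDict)

print(list(loaded[0].keys()))  # ['desc', 'order', 'GAP_ID']
print(loaded[0]['GAP_ID'])     # [36, 1]
```

Note that JSON has no tuple type, so a value such as `(36, 1)` comes back as the list `[36, 1]` and needs converting if the tuple form matters.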

If you were to inject separators yourself, you'd have to start with writing [, then write a comma after each JSON object you produce except for the last one, where you'd write ] instead:

with io.open( file_dir + file_name, 'w', encoding='utf-8' ) as file:
    file.write(u'[')
    for count, small_group in enumerate(self.small_groups.values()):
        if count:  # not the first
            file.write(u',')
        json_object = json.dumps(small_group._asdict(), ensure_ascii=False)
        file.write(unicode(json_object))
    file.write(u']')

print '{} small group JSON objects successfully written to the file {}.'.format(
    len(self.small_groups), file.name)

Once you have valid JSON you can validate that using a JSON schema validator. The obvious choice for Python would be the Python jsonschema library.
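
As a rough sketch of what that could look like: the schema below is invented for illustration (it only checks two of the fields from the question), and `validation_error` is a hypothetical helper name. The only library calls it relies on are `jsonschema.validate()` and the `jsonschema.ValidationError` exception:

```python
import json

# Hypothetical schema: a list of objects that each carry a positive integer
# "order" and a "GAP_ID" made of exactly two integers.
SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "order": {"type": "integer", "minimum": 1},
            "GAP_ID": {
                "type": "array",
                "items": {"type": "integer"},
                "minItems": 2,
                "maxItems": 2,
            },
        },
        "required": ["order", "GAP_ID"],
    },
}


def validation_error(json_text, schema=SCHEMA):
    """Return None if json_text satisfies schema, else the error message."""
    # Imported here so the module loads even without the third-party
    # dependency installed (pip install jsonschema).
    import jsonschema

    try:
        jsonschema.validate(instance=json.loads(json_text), schema=schema)
    except jsonschema.ValidationError as exc:
        return exc.message
    return None
```

Calling `validation_error('[{"order": 36, "GAP_ID": [36, 1]}]')` should return `None`, while a document whose objects lack `"order"` should return a message describing the missing property.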

Martijn Pieters
  • Thanks, I suppose you have to `unicode` the `'['`, `']'`, `','` symbols as well. I thought that JSON data files are enclosed in curly braces not square brackets. –  Feb 13 '15 at 14:24
  • @ramius: Use `u'['` unicode literals instead. JSON *objects* are enclosed by `{..}` and contain key-value pairs (with the keys always strings). `[...]` is a JSON *list*, an ordered collection of other JSON objects. – Martijn Pieters Feb 13 '15 at 14:26
  • One more question: JSON output does not preserve the ordering of keys inside the objects. The key ordering for a `small_group._asdict()` object is `['desc', 'order', 'GAP_ID', 'GAP_desc', 'GAP_pickled_ID', 'is_abelian', 'is_cyclic', 'is_simple', 'is_nilpotent', 'nilpotency_class', 'is_solvable', 'derived_length', 'is_pgroup', 'generators', 'char_degrees', 'D3', 'MBS_STPP_param_type', 'Beta0']`, but the JSON object output scrambles this. How to preserve the ordering? That's why I used an ordered dictionary in the first place!! –  Feb 13 '15 at 14:29
  • @ramius: like Python dictionaries, JSON objects are **unordered**. You'd have to write a sequence of `[key, value]` lists instead if order is important. – Martijn Pieters Feb 13 '15 at 14:31
  • @ramius: if you pass in the `OrderedDict` object itself, rather than use `._asdict()`, `json.dumps()` should write out JSON key-values in the same order as they are listed in the `OrderedDict` object (see [Items in JSON object are out of order using "json.dumps"?](http://stackoverflow.com/q/10844064)), but any code *reading* the JSON is free to ignore that order still. – Martijn Pieters Feb 13 '15 at 14:32
  • It's not important to have the order preserved, as long as I am able to recover the ordered dictionary object from the file. I haven't tried to use `json.loads` yet, but have to see how it decodes. –  Feb 13 '15 at 14:32
  • @ramius: `json.loads()` will produce a dictionary, again losing any order information. You can tell it to use an `OrderedDict` object too: [Can I get JSON to load into an OrderedDict in Python?](http://stackoverflow.com/q/6921699) – Martijn Pieters Feb 13 '15 at 14:33
  • Final comment: `small_group` is a named tuple, not an ordered dict, but `small_group._asdict()` is an ordered dict. –  Feb 13 '15 at 14:36
  • @ramius: ah, right, incorrect assumption on my part. – Martijn Pieters Feb 13 '15 at 15:06
That is not valid JSON. You should not convert individual sub-elements to JSON and then concatenate them: you should build up a Python structure and then dump the whole lot at the end.

data = []
for key, value in self.small_groups.items():
    data.append(value._asdict())
with io.open( file_dir + file_name, 'w', encoding='utf-8' ) as f:
    f.write(unicode(json.dumps(data)))
Daniel Roseman