9

This output is from a MongoDB aggregate query. I need to parse the nested data down to just the 'total' and '_id' values.

{
'ok': 1.0, 
'result': [
            {
                'total': 142250.0, 
                '_id': 'BC'
            }, 
            {
                'total': 210.88999999999996,
                 '_id': 'USD'
            }, 

            {
                'total': 1065600.0, 
                '_id': 'TK'
            }
            ]
}

I've tried 5 different techniques to get what I need from it, however I've run into issues using the json and simplejson modules.

Ideally, the output will be something like this:

142250.0, BC
210.88999999999996, USD
1065600.0, TK
martineau
unique_beast
  • Can you post the code from your attempts with the two modules? – ely Nov 01 '13 at 15:25
  • As mentioned in cpburnz's answer: the problem is the single vs. double quote characters. You need double-quotes, and then a simple call to `json.loads` will work (your JSON string loads fine for me after that switch and is easy to parse). – ely Nov 01 '13 at 15:42

5 Answers

15

NOTE: Your JSON response from MongoDB is not actually valid. JSON requires double-quotes ("), not single-quotes (').

I'm not sure why your response has single-quotes instead of double-quotes but from the looks of it you can replace them and then just use the built-in json module:

from __future__ import print_function
import json

response = """{
    'ok': 1.0, 
    'result': [
        {
            'total': 142250.0, 
            '_id': 'BC'
        }, 
        {
            'total': 210.88999999999996,
             '_id': 'USD'
        }, 

        {
            'total': 1065600.0, 
            '_id': 'TK'
        }
        ]
}"""

# JSON requires double-quotes, not single-quotes.
response = response.replace("'", '"')
response = json.loads(response)
for doc in response['result']:
    print(doc['_id'], doc['total'])
Uyghur Lives Matter
  • I think you mean `json` requires double-quotes in the comment line. – ely Nov 01 '13 at 15:34
  • Yes, I wrote that backwards. Fixed. – Uyghur Lives Matter Nov 01 '13 at 15:35
  • I didn't realize you were making edits. I noticed that some things disappeared a couple times after I fixed various typos so I re-edited to put back what disappeared (probably due to conflicting edits?) – Uyghur Lives Matter Nov 01 '13 at 15:46
  • Don't use an unconditional replace on a string containing structured data (a literal for a Python dictionary in this case). It might work for some time until it *silently* corrupts the data within. Use an appropriate parser instead, e.g., [`ast.literal_eval()`](http://stackoverflow.com/a/19730573/4279) – jfs Nov 01 '13 at 16:09
  • @J.F. Fair enough, `ast.literal_eval()` would be a safer solution (really, getting a proper response from MongoDB would be best). Given that the data only contains currency code strings and numeric values, a search and replace is sufficient. It would incorrectly convert an embedded `\'` into a `\"` (e.g., `that\'s` would become `that\"s`). Being pedantic, if the response contained a `Date` or `ObjectId`, `literal_eval()` would fail. But it's still probably the better choice if those aren't needed. – Uyghur Lives Matter Nov 01 '13 at 16:38
  • Getting this response from trying the above: >>> for doc in response['result']: print doc['_id'], doc['total'] SyntaxError: invalid syntax – unique_beast Nov 01 '13 at 18:14
  • @Andrew In python 3 print is a function instead of a statement. It's fixed to support both now. – Uyghur Lives Matter Nov 01 '13 at 23:32
0

The response you are getting from MongoDB looks like it is already compatible with a Python dictionary object:

{
    'ok': 1.0,  'result': [
        {
            'total': 142250.0, 
            '_id': 'BC'
        }, 
        {
            'total': 210.88999999999996,
             '_id': 'USD'
        }, 
        {
            'total': 1065600.0, 
            '_id': 'TK'
        }
    ]
}

Instead of putting it into a multiline string and replacing single quotes with double quotes, can't we assign it directly to a dict object and perform further operations on it, like this:

json_data = {
    'ok': 1.0,
    'result':
        [
            {
                'total': 142250.0,
                '_id': 'BC'
            },
            {
                'total': 210.88999999999996,
                '_id': 'USD'
            },
            {
                'total': 1065600.0,
                '_id': 'TK'
            }
    ]
}

And:

for data in json_data['result']:
    print(data['total'], data['_id'])
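
If the response really is already a Python dict, the extraction can be wrapped in a small helper. The following is only a sketch under that assumption; the function name `total_id_pairs` is hypothetical and not part of PyMongo.

```python
# Assumption: the aggregate response is already a Python dict,
# so no JSON parsing step is needed.
response = {
    'ok': 1.0,
    'result': [
        {'total': 142250.0, '_id': 'BC'},
        {'total': 210.88999999999996, '_id': 'USD'},
        {'total': 1065600.0, '_id': 'TK'},
    ],
}


def total_id_pairs(resp):
    """Return a list of (total, _id) tuples from the 'result' list."""
    # .get() with a default avoids a KeyError if 'result' is missing.
    return [(doc['total'], doc['_id']) for doc in resp.get('result', [])]


pairs = total_id_pairs(response)
for total, currency in pairs:
    # Prints one "total, _id" line per document.
    print('{0}, {1}'.format(total, currency))
```

Returning the pairs as a list keeps the values available for further processing instead of only printing them.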
sɐunıɔןɐqɐp
-1
import json

data = json.loads(mongo_db_json)
result = data['result']
for value in result:
    print('{0}, {1}'.format(value['total'], value['_id']))

This should work

elssar
  • Actually the OP said that he tried 5 different ways, didn't mention any specific method. But fair enough, couldn't have been such a simple thing. – elssar Nov 01 '13 at 16:12
  • Yeah, this didn't work for the reason mentioned in the first answer. PyMongo is only giving singular quotes in its output. – unique_beast Nov 01 '13 at 17:49
-1

Your example text is not valid JSON. A JSON string must be delimited by double quotes ("), not single quotes ('). It does, however, look like a valid Python literal, which you can parse with the ast.literal_eval() function:

import ast

data = ast.literal_eval(input_string)
for item in data["result"]:
    print("{total}, {_id}".format(**item))

Output

142250.0, BC
210.89, USD
1065600.0, TK
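
To illustrate why a literal parser can be more robust than swapping quote characters, here is a minimal sketch. The sample text and both helper function names are made up for the illustration; the `_id` value contains an apostrophe, which is the kind of input a blanket replace cannot handle.

```python
import ast
import json

# Hypothetical response text: a Python dict literal whose _id value
# contains an apostrophe, so it is written with double quotes.
raw = """{'ok': 1.0, 'result': [{'total': 5.0, '_id': "O'Neil"}]}"""


def parse_with_replace(text):
    """Swap every single quote for a double quote, then parse as JSON.

    Returns None if the resulting text is not valid JSON.
    """
    try:
        return json.loads(text.replace("'", '"'))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


def parse_with_literal_eval(text):
    """Parse the text as a Python literal without executing any code."""
    return ast.literal_eval(text)


broken = parse_with_replace(raw)
parsed = parse_with_literal_eval(raw)

print(broken)                      # the quote swap breaks the string value
print(parsed['result'][0]['_id'])  # the literal parser keeps it intact
```

In this case the replace approach fails outright, because the apostrophe inside the value becomes a stray double quote. `ast.literal_eval()` only accepts Python literals, so it still would not handle values such as `ObjectId(...)`.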

A better way might be to fix the querying process to get valid JSON and use json module to parse it.
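
One possible way to get valid JSON, assuming the query result is a plain Python dict containing only strings and numbers, is to serialize it with `json.dumps()` rather than printing the dict. This is a sketch under that assumption, with a shortened sample response:

```python
import json

# Assumption: the driver hands back a plain dict like this one.
response = {'ok': 1.0, 'result': [{'total': 142250.0, '_id': 'BC'}]}

# json.dumps() emits double-quoted strings, i.e. valid JSON text.
json_text = json.dumps(response)
print(json_text)

# The text can now be parsed back with json.loads().
round_tripped = json.loads(json_text)
for item in round_tripped['result']:
    print('{total}, {_id}'.format(**item))
```

If the documents contain BSON-specific types such as `ObjectId` or dates, plain `json.dumps()` raises a `TypeError`; PyMongo's `bson.json_util` module is intended for that case, though it is not shown here.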

jfs
  • All of the values are singular quote responses. – unique_beast Nov 01 '13 at 18:17
  • >>> data = ast.literal_eval(str(response)) >>> for item in data["result"]: print("{total}, {_id}".format(**item)) – unique_beast Nov 01 '13 at 18:17
  • That will have to do for now, though I would like to actually get JSON out of my Mongo instance... – unique_beast Nov 01 '13 at 18:18
  • @EMS: I don't see json data in the question. Therefore I don't use json parser to parse it. I've edited the answer to mention explicitly that the input should be changed to json – jfs Nov 01 '13 at 23:03
  • @EMS: if input is JSON then `json` module can be used. End of story. But the input is not JSON. if OP can't/won't fix the input that looks like a Python literal then `ast.literal_eval` is a safer alternative than a blind `.replace` as [I've commented already](http://stackoverflow.com/questions/19729710/parsing-nested-json-using-python/19730573?noredirect=1#comment29314372_19729976). – jfs Nov 03 '13 at 06:57
  • @EMS: ok, I've got your point: you think that actual data presented in the question being non-JSON is a non-issue that deserves at most a comment and could be fixed by such sloppy methods as global `.replace` on the data. – jfs Nov 05 '13 at 13:28
  • @EMS: I agree, that is why I wrote: "if OP **can't/won't** fix the input that looks like a Python literal then ast.literal_eval is a safer alternative than a blind .replace". – jfs Nov 05 '13 at 14:32
  • @EMS: and here we disagree. I consider traversing a dictionary to be a trivial matter but a more robust handling of the input data to be essential. – jfs Nov 05 '13 at 14:37
-2

This should do.

import json

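def parse_json(your_json):
    # your_json must be valid JSON text (double-quoted strings).
    to_dict = json.loads(your_json)
    # The documents are stored under the 'result' key.
    for item in to_dict['result']:
        print('{0}, {1}'.format(item['total'], item['_id']))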
shshank