0

My IBM cloudant is connected to python. I need to get nested JSON data from cloudant to python environment. My aim is to take the necessary data and create a new dataframe.

Here it is nested JSON

   {
  "_id": "1",
  "topic": "",
  "payload": {
    "devices": [
      {
        "_id": "70",
        "last_date": 1499503900,
        "modules": [
          {
            "_id": "0",
            "last_seen": 1497151699,
            "dashboard_data": {
              "time_utc": 1497150981,
              "Temperature": 41.8,
              "Humidity": 24,
              "date_max_temp": 1497150519,
              "date_min_temp": 1497146892,
              "min_temp": 13.9,
              "max_temp": 45.6
            },
            "data_type": [
              "Temperature",
              "Humidity"
            ],
            "module_name": "X",
            "last_setup": 1487170865,
            "battery_vp": 3474,
            "battery_percent": 0,
            "rf_status": 79,
            "firmware": 44
          },
          {
            "_id": "03",
            "last_message": 1499503897,
            "last_seen": 1499503897,
            "dashboard_data": {
              "time_utc": 1499503795,
              "Temperature": 19.9,
              "temp_trend": "stable",
              "Humidity": 49,
              "CO2": 390,
              "date_max_temp": 1499468625,
              "date_min_temp": 1499490158,
              "min_temp": 19.4,
              "max_temp": 21.2
            },
            "data_type": [
              "Temperature",
              "CO2",
              "Humidity"
            ],
            "module_name": "Y",
            "last_setup": 1487173334,
            "battery_vp": 5484,
            "battery_percent": 71,
            "rf_status": 70,
            "firmware": 44
          },
          {
            "_id": "03",
            "last_message": 1499503897,
            "last_seen": 1499503852,
            "dashboard_data": {
              "time_utc": 1499503801,
              "Temperature": 19.7,
              "temp_trend": "stable",
              "Humidity": 51,
              "CO2": 414,
              "date_max_temp": 1499468631,
              "date_min_temp": 1499489242,
              "min_temp": 18.5,
              "max_temp": 20.7
            },
            "data_type": [
              "Temperature",
              "CO2",
              "Humidity"
            ],
            "module_name": "M",
            "last_setup": 1487173386,
            "battery_vp": 5518,
            "battery_percent": 73,
            "rf_status": 83,
            "firmware": 44
          },
          {
            "_id": "03",
            "last_message": 1499503897,
            "last_seen": 1499503872,
            "dashboard_data": {
              "time_utc": 1499503820,
              "Temperature": 22,
              "temp_trend": "stable",
              "Humidity": 49,
              "CO2": 419,
              "date_max_temp": 1499468599,
              "date_min_temp": 1499495002,
              "min_temp": 22,
              "max_temp": 22.9
            },
            "data_type": [
              "Temperature",
              "CO2",
              "Humidity"
            ],
            "module_name": "N",
            "last_setup": 1487173462,
            "battery_vp": 5494,
            "battery_percent": 72,
            "rf_status": 77,
            "firmware": 44
          }
        ],
        "place": {
          "country": "GB",
        },
        "type": "NAMain",
        "read_only": "true",
        "dashboard_data": {
          "AbsolutePressure": 1010.5,
          "Noise": 45,
          "Temperature": 20.5,
          "temp_trend": "stable",
          "Humidity": 52,
          "Pressure": 1010.5,
          "pressure_trend": "up",
          "CO2": 406,
          "date_max_temp": 1499468648,
          "date_min_temp": 1499500180,
          "min_temp": 20.4,
          "max_temp": 22.1
        },
        "data_type": [
          "Temperature",
          "CO2",
          "Humidity",
          "Noise",
          "Pressure"
        ],
        "co2_calibrating": "false",
        "date_setup": 1487170943,
        "last_setup": 1487170943,
        "module_name": "K",
        "firmware": 124,
        "last_upgrade": 1487170889,
        "wifi_status": 56
      }
    ]
  }
}

I have almost 50000 json data like this.

My aim is to creat a new pandas dataframe, which includes last_date in the beginning and Temperature for each _id.

I tried this:

import cloudant
import pandas as pd
import json

database2 = client['net']

for item in database2:
    print(item['payload']['devices'][0]['last_date'])

But it gives me

KeyError: 'payload'

and secondly this:

wanted = ['las_date', 'module_name']
for item in database2:
    if not 'las_date' in item:
        continue
    data = item['las_date']
    for key in wanted:
        if key in details:
            print(key, ':', json.dumps(details[key], indent=4))

But I got this:

HTTPError: 429 Client Error: Too Many Requests
amanb
  • 5,276
  • 3
  • 19
  • 38
emily.mi
  • 161
  • 12
  • you have the typo, try `last_date` instead of `las_date` maybe it will be solution – Brown Bear Apr 01 '18 at 10:44
  • I changed it. Still same output. – emily.mi Apr 01 '18 at 10:49
  • There is a problem with the json data above at `"id" :"03,`. I'm correcting that now. Also, `true` `false` values must be enclosed in " ". – amanb Apr 01 '18 at 10:54
  • Sorry for my copy paste mistake. Thank you for improvement – emily.mi Apr 01 '18 at 10:59
  • I've corrected the json data and added my answer. For the second error, change `las_date` to `last_date`. I could not test this as I do not have a cloudant database configured. – amanb Apr 01 '18 at 11:01

2 Answers2

0

After correcting the json data, I tried

item['payload']['devices'][0]['last_date']

and I get output as:

1499503900
amanb
  • 5,276
  • 3
  • 19
  • 38
  • It gives only first 100 data as you shared. And then "KeyError: 'payload' " appears below. I have almost 50000 JSON data like that. – emily.mi Apr 01 '18 at 11:05
  • In that case, you need a json package that can stream data iteratively. Take a look at [ijson](https://pypi.python.org/pypi/ijson/) or [json-streamer](https://github.com/kashifrazzaqui/json-streamer). This [answer](https://stackoverflow.com/questions/10382253/reading-rather-large-json-files-in-python) may be useful. – amanb Apr 01 '18 at 11:10
0

The error you see:

HTTPError: 429 Client Error: Too Many Requests

is a consequence of Cloudant being rate limited on IBM Cloud. The free version of Cloudant ("Lite") allows for 100 lookups, 10 writes and 5 queries per second after which you'll get the error above.

You can increase the limits (at a cost) from the Cloudant dashboard. The Cloudant Python library you use has the ability to retry-backoff on 429 errors which may relieve your issue. See

http://python-cloudant.readthedocs.io/en/latest/getting_started.html?highlight=429

xpqz
  • 3,617
  • 10
  • 16