
I am working with JSON files that store thousands of entries or more. First, I want to understand the data I am working with.

import json


with open("/home/xu/stock_data/stock_market_data/nasdaq/json/AAL.json", "r") as f:
    data = json.load(f)
print(json.dumps(data, indent=4))

This gives me an easy-to-read format, but some of the "keys" (I am not familiar with the JSON terminology, so I use the word "key" as in dict objects) have thousands of values, which makes the output hard to read as a whole.

I also tried:

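import json

import pandas as pd


with open("/home/xu/stock_data/stock_market_data/nasdaq/json/AAL.json", "r") as f:
    data = json.load(f)

df = pd.DataFrame.from_dict(data, orient="index")
# Note: df.info is printed here without being called, so the bound method's repr is shown.
print(df.info)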

but got

<bound method DataFrame.info of                                                   result error
chart  [{'meta': {'currency': 'USD', 'symbol': 'AAL',...  None>

This result partly shows the structure, but it is truncated with "..." and does not show the whole picture.

My questions:

  1. Is there something that works like np.array.shape for JSON/dict/pandas objects and shows the shape of the structure?

  2. Is there a library better suited to interpreting the structure of a JSON file?

Edit: Sorry, perhaps the wording of my problem was misleading. I tried pprint, and it gave me:

{ 'chart': { 'error': None,
             'result': [ { 'events': { 'dividends': { '1406813400': { 'amount': 0.1,
                                                                      'date': 1406813400},
                                                      '1414675800': { 'amount': 0.1,
                                                                      'date': 1414675800},
                                                      '1423146600': { 'amount': 0.1,
                                                                      'date': 1423146600},
                                                      '1430400600': { 'amount': 0.1,
                                                                      'date': 1430400600},
                                                      '1438867800': { 'amount': 0.1,
                                                                      'date': 1438867800},
                                                      '1446561000': { 'amount': 0.1,
                                                                      'date': 1446561000},
                                                      '1454941800': { 'amount': 0.1,
                                                                      'date': 1454941800},
                                                      '1462195800': { 'amount': 0.1,
                                                                      'date': 1462195800},
                                                      '1470231000': { 'amount': 0.1,
                                                                      'date': 1470231000},
                                                      '1478179800': { 'amount': 0.1,
                                                                      'date': 1478179800},
                                                      '1486650600': { 'amount': 0.1,
                                                                      'date': 1486650600},
                                                      '1494595800': { 'amount': 0.1,
                                                                      'date': 1494595800},
                                                      '1502371800': { 'amount': 0.1,
                                                                      'date': 1502371800},
                                                      '1510324200': { 'amount': 0.1,
                                                                      'date': 1510324200},
                                                      '1517841000': { 'amount': 0.1,
                                                                      'date': 1517841000},
                                                      '1525699800': { 'amount': 0.1,
                                                                      'date': 1525699800},
                                                      '1533562200': { 'amount': 0.1,
                                                                      'date': 1533562200},
                                                      '1541428200': { 'amount': 0.1,
                                                                      'date': 1541428200},
                                                      '1549377000': { 'amount': 0.1,
                                                                      'date': 1549377000},
                                                      '1557235800': { 'amount': 0.1,
                                                                      'date': 1557235800},
                                                      '1565098200': { 'amount': 0.1,
                                                                      'date': 1565098200},
                                                      '1572964200': { 'amount': 0.1,
                                                                      'date': 1572964200},
                                                      '1580826600': { 'amount': 0.1,
                                                                      'date': 1580826600}}},
                           'indicators': { 'adjclose': [ { 'adjclose': [ 18.19490623474121,
                                                                         19.326200485229492,
                                                                         19.05280113220215,
                                                                         19.80699920654297,
                                                                         20.268939971923828,
                                                                         20.891149520874023,
                                                                         20.928863525390625,
                                                                         21.28710174560547,
                                                                         20.88172149658203,
                                                                         20.93828773498535,
                                                                         20.721458435058594,
                                                                         20.514055252075195,
                                                                         20.466917037963867,
                                                                         20.994853973388672,
                                                                         20.81572914123535,
                                                                         20.2595157623291,
                                                                         20.155811309814453,
                                                                         19.816425323486328,
                                                                         20.702600479125977,
                                                                         21.032560348510742,
                                                                         20.740314483642578,
                                                                         21.0419864654541,
                                                                         21.26824951171875,
                                                                         22.531522750854492,
                                                                         23.266857147216797,
                                                                         23.587390899658203,
                                                                         25.9725284576416,
                                                                         26.27420997619629,
                                                                         27.150955200195312,
                                                                         27.273509979248047,
                                                                         27.7448787689209,
                                                                         29.507808685302734,
                                                                         30.92192840576172,
                                                                         31.4404239654541,
                                                                         31.817523956298828,
                                                                         31.940074920654297,
                                                                         31.676118850708008,
                                                                         32.354888916015625,
                                                                         31.157604217529297,
                                                                         30.158300399780273,
                                                                         30.63909339904785,
                                                                         31.148174285888672,
                                                                         30.969064712524414,
                                                                         31.496990203857422,
                                                                         31.01619529724121,
                                                                         31.666685104370117,
                                                                         32.31717300415039,
                                                                         32.31717300415039,
                                                                         30.497684478759766,
                                                                         31.69496726989746,
                                                                         32.006072998046875,
                                                                         31.7326717376709,
                                                                         31.940074920654297,
                                                                         31.826950073242188,
                                                                         31.346155166625977,
                                                                         31.61954689025879,
                                                                         ...
                                                                         ...
                                                                         ...
# This goes on and on for the respective "keys" of the JSON file, which means I have to scroll down thousands of lines to find out what type of data I have.

What I am hoping to find is a solution that outputs something like the following: it does not show the data itself in full, only the "keys" and maybe some additional information. Some files may literally contain many GBs of data, making it impractical to scroll through them.

#this is what I am hoping to achieve.
{
    "Name": {
        "title": <data_type=str, len=20>,
        "time_stamp": <data_type=list, len=3000>,
        "closing_price": <data_type=list, len=3000>,
        "high_price_of_the_day": <data_type=list, len=3000>,
        ...
        ...
        ...
            }
}
  • Would displaying a tree as follows be helpful? https://stackoverflow.com/questions/55926688/python-create-tree-from-a-json-file – Jason Chia Nov 05 '21 at 09:20
  • @JasonChia That may be a workable solution, but not out of the box, as I want to hide some of the "branches" to make visualizing the data easier. I will try to build on top of that suggestion and come back if it works. –  Nov 05 '21 at 11:58
  • Looks like you want to print out the keys as well as the types of values. You can do a simple recursive function for that with some additional 'ignore' rules for keys that you do not care about. Essentially, parsing a nested dict such that you get all keys and the final type(value) of each possible key. – Jason Chia Nov 05 '21 at 12:49
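
The recursive approach suggested in the last comment could look roughly like the sketch below. It is not a library function: the name summarize, its ignore and max_depth parameters, and the miniature sample dict (cut down from the pprint output in the question) are made up for illustration. It also assumes that every element of a list has the same structure as the first one, which may not hold for every file.

```python
import json


def summarize(obj, ignore=frozenset(), max_depth=None, _depth=0):
    """Return a nested summary: dict keys are kept, leaves become type/len labels."""
    if isinstance(obj, dict):
        # Collapse dicts that sit at or below the requested depth limit.
        if max_depth is not None and _depth >= max_depth:
            return f"<data_type=dict, len={len(obj)}>"
        return {
            key: summarize(value, ignore, max_depth, _depth + 1)
            for key, value in obj.items()
            if key not in ignore  # the 'ignore' rule: skip unwanted branches
        }
    if isinstance(obj, list):
        label = f"<data_type=list, len={len(obj)}>"
        # Descend into the first element only when it is itself a container,
        # assuming (hypothetically) that the other elements look the same.
        if obj and isinstance(obj[0], (dict, list)):
            return {label: summarize(obj[0], ignore, max_depth, _depth + 1)}
        return label
    if isinstance(obj, str):
        return f"<data_type=str, len={len(obj)}>"
    return f"<data_type={type(obj).__name__}>"


# Hypothetical miniature of the file's structure, for demonstration only.
sample = {
    "chart": {
        "error": None,
        "result": [
            {
                "meta": {"currency": "USD", "symbol": "AAL"},
                "events": {
                    "dividends": {
                        "1406813400": {"amount": 0.1, "date": 1406813400},
                    }
                },
                "indicators": {
                    "adjclose": [{"adjclose": [18.19, 19.33, 19.05]}],
                },
            }
        ],
    }
}

# Hide the "events" branch and print only keys, types and lengths.
summary = summarize(sample, ignore={"events"})
print(json.dumps(summary, indent=4))
```

With a real file, data from json.load(f) would take the place of sample. Note that this still loads the whole file into memory, so it does not by itself address files of many GBs.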

1 Answer


You have a few options for navigating this. If you want to render your data so you can make informed decisions quickly, the standard library can pretty-print dictionaries (see pprint), but personally I recommend something that works out of the box without much configuration. I found pprintpp to be the ideal choice for any Python data structure: https://pypi.org/project/pprintpp/

Simply run this in your terminal: pip3 install pprintpp. On Windows, the library should install under C:\Users\User\AppData\Local\Programs\Python\PythonXX\Lib\site-packages\pprintpp

After that, simply do this in your code:

import json
from pprintpp import pprint

with open("/home/xu/stock_data/stock_market_data/nasdaq/json/AAL.json", "r") as f:
    data = json.load(f)
pprint(data)

You can also call pprint(some_dict, width=1) to guarantee that each dictionary key goes on its own line, even if the key is short. For example:

from pprintpp import pprint

some_dict = {'a': 'b', 'c': {'aa': 'bb'}}
pprint(some_dict, width=1)

Outputs:

{
    'a': 'b',
    'c': {
        'aa': 'bb',
    },
}
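
If the goal is to see only the upper levels of nesting rather than all the data, the standard library's pprint also accepts a depth argument, which replaces anything nested deeper than that level with "...". A minimal sketch, using a made-up miniature dict in place of the real file (whether pprintpp accepts the same argument is not checked here):

```python
import pprint

# Hypothetical miniature of the file's structure, for demonstration only.
sample = {
    "chart": {
        "error": None,
        "result": [
            {
                "meta": {"currency": "USD", "symbol": "AAL"},
                "indicators": {
                    "adjclose": [{"adjclose": [18.19, 19.33, 19.05]}],
                },
            }
        ],
    }
}

# Show three levels of containers; deeper ones are elided as {...} or [...].
pprint.pprint(sample, depth=3)

# pformat returns the same text as a string instead of printing it.
text = pprint.pformat(sample, depth=3)
```

This only hides levels below the chosen depth. A long list that sits within the depth limit is still printed in full, so it helps with deep nesting more than with long arrays.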

Hope this helped! Cheers :)

  • Thank you for the pprint option, but I am looking for a way to display the structure of the data in a readable fashion, rather than the data itself. –  Nov 05 '21 at 11:55