Pythonic way to transform/flatten JSON containing nested table-as-list-of-dicts structures

Question

Suppose I have a table represented in JSON as a list of dicts, where the keys of each item are the same:

J = [
    {
        "symbol": "ETHBTC",
        "name": "Ethereum",
        :
    },
    {
        "symbol": "LTCBTC",
        "name": "LiteCoin",
        :
    },

And suppose I require efficient lookup, e.g. symbols['ETHBTC']['name']

I can transform with symbols = { item['symbol']: item for item in J }, producing:

{
    "ETHBTC": {
        "symbol": "ETHBTC",
        "name": "Ethereum",
        :
    },
    "LTCBTC": {
        "symbol": "LTCBTC",
        "name": "LiteCoin",
        :
    },

(Ideally I would also remove the now redundant symbol field).
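
A minimal sketch of what that could look like, building new dicts rather than deleting keys. The two rows are just the sample data above with the elided fields left out:

```python
rows = [
    {"symbol": "ETHBTC", "name": "Ethereum"},
    {"symbol": "LTCBTC", "name": "LiteCoin"},
]

# Key each row by its symbol and copy every other field into a new dict,
# leaving the original rows untouched.
symbols = {
    row["symbol"]: {k: v for k, v in row.items() if k != "symbol"}
    for row in rows
}

print(symbols["ETHBTC"]["name"])  # Ethereum
```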

However, what if each item itself contains a "table-as-list-of-dicts"?

Here's a fuller minimal example (I've removed lines not pertinent to the problem):

J = {
    "symbols": [
        {
            "symbol":"ETHBTC",
            "filters":[
                {
                    "filterType":"PRICE_FILTER",
                    "minPrice":"0.00000100",
                },
                {
                    "filterType":"PERCENT_PRICE",
                    "multiplierUp":"5",
                },
            ],
        },
        {
            "symbol":"LTCBTC",
            "filters":[
                {
                    "filterType":"PRICE_FILTER",
                    "minPrice":"0.00000100",
                },
                {
                    "filterType":"PERCENT_PRICE",
                    "multiplierUp":"5",
                },
            ],
        }
    ]
}

So the challenge is to transform this structure into:

J = {
    "symbols": {
        "ETHBTC": {
            "filters": {
                "PRICE_FILTER": {
                    "minPrice": "0.00000100",
    :
}

I can write a flatten function:

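def flatten(L: list, key) -> dict:
    # Note: this deletes `key` from each dict in L, so the input is modified in place.
    def remove_key_from(D):
        del D[key]
        return D
    # Relies on D[key] being evaluated before remove_key_from(D),
    # which dict comprehensions guarantee from Python 3.8 onwards.
    return {D[key]: remove_key_from(D) for D in L}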

Then I can flatten the outer list and loop through each key/val in the resulting dict, flattening val['filters']:

J['symbols'] = flatten(J['symbols'], key="symbol")
for symbol, D in J['symbols'].items():
    D['filters'] = flatten(D['filters'], key="filterType")

Is it possible to improve upon this using glom (or otherwise)?

Initial transform has no performance constraint, but I require efficient lookup.

score 1 · Answer 1 · answered Oct 28 '21 at 07:21

I don't know if you'd call it Pythonic, but you could make your function more generic by using recursion and dropping the key argument. Since you already assume that your lists contain dictionaries, you can take advantage of Python's dynamic typing and accept any kind of input:

from pprint import pprint
def flatten_rec(I) -> dict:
    if isinstance(I, dict):
        I = {k: flatten_rec(v) for k,v in I.items()}
    elif isinstance(I, list):
        I = { list(D.values())[0]: {k:flatten_rec(v) for k,v in list(D.items())[1:]} for D in I }
    return I

pprint(flatten_rec(J))

Output:

{'symbols': {'ETHBTC': {'filters': {'PERCENT_PRICE': {'multiplierUp': '5'},
                                    'PRICE_FILTER': {'minPrice': '0.00000100'}}},
             'LTCBTC': {'filters': {'PERCENT_PRICE': {'multiplierUp': '5'},
                                    'PRICE_FILTER': {'minPrice': '0.00000100'}}}}}
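
Because `list(D.values())[0]` picks each item's first value, the result depends on the grouping field being listed first in every item. Dicts keep insertion order in Python 3.7+, so this holds as long as the source data lists that field first. A small sketch of the dependence in isolation, where `first_key_index` is a hypothetical helper and the rows are made-up sample data:

```python
def first_key_index(rows):
    """Index each row by the value of its first key, dropping that key."""
    out = {}
    for row in rows:
        items = iter(row.items())
        _, group_value = next(items)    # first (key, value) pair
        out[group_value] = dict(items)  # the remaining pairs
    return out

# Grouping field listed first: grouped as intended.
a = first_key_index([{"filterType": "PRICE_FILTER", "minPrice": "0.00000100"}])
# Grouping field listed second: the row is keyed by the wrong value.
b = first_key_index([{"minPrice": "0.00000100", "filterType": "PRICE_FILTER"}])

print(a)  # {'PRICE_FILTER': {'minPrice': '0.00000100'}}
print(b)  # {'0.00000100': {'filterType': 'PRICE_FILTER'}}
```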

score 1 · Answer 2 · answered Oct 28 '21 at 15:54

Since you have different transformation rules for different keys, you can keep a list of the key names to group on:

t = ['symbol', 'filterType']  # key names whose values become the new dict keys

def transform(d):
    # Requires Python 3.8+ for the := operator.
    if (m := {a: b for a, b in d.items() if a in t}):
        return {[*m.values()][0]: transform({a: b for a, b in d.items() if a not in m})}
    return {a: b if not isinstance(b, list) else {x: y for j in b for x, y in transform(j).items()} for a, b in d.items()}

import json
print(json.dumps(transform(J), indent=4))

{
    "symbols": {
        "ETHBTC": {
            "filters": {
                "PRICE_FILTER": {
                    "minPrice": "0.00000100"
                },
                "PERCENT_PRICE": {
                    "multiplierUp": "5"
                }
            }
        },
        "LTCBTC": {
            "filters": {
                "PRICE_FILTER": {
                    "minPrice": "0.00000100"
                },
                "PERCENT_PRICE": {
                    "multiplierUp": "5"
                }
            }
        }
    }
}
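
If the one-liners are hard to follow, the same idea can be spelled out with explicit loops. This is only a sketch: the names `GROUP_KEYS` and `regroup` are illustrative, the sample data is a trimmed copy of the question's example, and like the version above it assumes every list contains only dicts.

```python
GROUP_KEYS = ("symbol", "filterType")  # fields whose value becomes the lookup key

def regroup(d):
    """Recursively turn lists of dicts into dicts keyed by a grouping field."""
    group_key = next((k for k in d if k in GROUP_KEYS), None)
    if group_key is not None:
        # Key the remaining fields by the grouping value.
        rest = {k: v for k, v in d.items() if k not in GROUP_KEYS}
        return {d[group_key]: regroup(rest)}
    out = {}
    for k, v in d.items():
        if isinstance(v, list):
            # Merge the single-entry dicts produced for each list item.
            merged = {}
            for item in v:
                merged.update(regroup(item))
            out[k] = merged
        else:
            out[k] = v
    return out

sample = {
    "symbols": [
        {
            "symbol": "ETHBTC",
            "filters": [
                {"filterType": "PRICE_FILTER", "minPrice": "0.00000100"},
                {"filterType": "PERCENT_PRICE", "multiplierUp": "5"},
            ],
        }
    ]
}

result = regroup(sample)
print(result["symbols"]["ETHBTC"]["filters"]["PRICE_FILTER"]["minPrice"])  # 0.00000100
```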
