I'm dealing with a nasty bit of JSON. I am using json.load to read it from a file, and it is stored as a dict, printed below. In Python, how would I go about getting a list of just the "dimensions" values that come after "false_value" (as the first dimension value is not actually a value I want)?

I tried kind of a hacky way, but I feel like someone may have a perspective on how to do this in a more elegant fashion.

Goal: make a list of all the dimension values (except the first), such as ('100', '121', ...).

{
    "reports": [
        {
            "columnHeader": {
                "dimensions": [
                    "ga:clientId"
                ],
                "metricHeader": {
                    "metricHeaderEntries": [
                        {
                            "name": "blah",
                            "type": "INTEGER"
                        }
                    ]
                }
            },
            "data": {
                "rows": [
                    {
                        "dimensions": [
                            "false_value"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "100"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "121"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "1212"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    }, ],
                "totals": [
                    {
                        "values": [
                            "10497"
                        ]
                    }
                ],
                "rowCount": 9028,
                "minimums": [
                    {
                        "values": [
                            "0"
                        ]
                    }
                ],
                "maximums": [
                    {
                        "values": [
                            "9"
                        ]
                    }
                ],
                "isDataGolden": true
            },
            "nextPageToken": "1000"
        }
    ]
}
0004
  • I believe you can iterate through all the keys/subkeys/values and dump them into a list. See here: https://stackoverflow.com/questions/45974937 – Matt Cottrill Feb 12 '21 at 02:20
  • Did you mean True rather than true (i.e. **True/False** for Python Boolean)? – DarrylG Feb 12 '21 at 02:34
  • @DarrylG this is JSON, not Python code. – Danny Varod Feb 12 '21 at 02:37
  • A simple recursive function that uses `isinstance()` in conditions should do the trick. – Danny Varod Feb 12 '21 at 02:39
  • @DannyVarod, when OP says the data is stored as a dict type and printed below, I assume OP was displaying it as a dictionary. If we treat it as a string, then json.loads(...) gives structural errors (which a JSON lint validator also reports for the string). If you change true to True, then it works as a dictionary. – DarrylG Feb 12 '21 at 02:42

3 Answers

2

First, you should put your JSON object into a more readable textual form. Use something like Black to clean up the spacing. Then just traverse the keys until you find the value you need; this post will help you.

You should end up with something like this:

# data is the dict returned by json.load (naming it json would shadow the module)
dimensions = [row["dimensions"][0] for row in data["reports"][0]["data"]["rows"]]
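
To skip the unwanted first row, the same comprehension can slice the rows list before iterating. A minimal sketch, assuming the parsed dict is bound to a name such as `data` (a hypothetical name; the dict here is a trimmed-down stand-in for the question's structure, not the real response):

```python
# Trimmed-down stand-in for the dict returned by json.load (assumption)
data = {
    "reports": [
        {
            "data": {
                "rows": [
                    {"dimensions": ["false_value"]},
                    {"dimensions": ["100"]},
                    {"dimensions": ["121"]},
                    {"dimensions": ["1212"]},
                ]
            }
        }
    ]
}

rows = data["reports"][0]["data"]["rows"]
# rows[1:] drops the first row, whose dimension is the placeholder "false_value"
dimensions = [row["dimensions"][0] for row in rows[1:]]
print(dimensions)  # ['100', '121', '1212']
```

This relies on the unwanted value always being in the first row; if its position can vary, filtering by value would be safer.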
Leo103
  • Very cool and neat, thank you so much. Any tips on how to ignore the first value? – 0004 Feb 12 '21 at 04:51
  • Check this post: [How to remove the first item in a list](https://stackoverflow.com/questions/4426663/how-to-remove-the-first-item-from-a-list) – Leo103 Feb 12 '21 at 04:57
0

Using a recursive function to find values that meet two conditions:

  • A parent key is "dimensions"
  • The value is numeric

Code

def find_dims(d, inside=False, results=None):
    '''
        Recursive processing of the structure.
        inside = True when a parent key was "dimensions"
    '''
    if results is None:
        results = []

    if isinstance(d, dict):
        for k, v in d.items():
            find_dims(v, k == "dimensions" or inside, results)
    elif isinstance(d, list):
        for item in d:
            find_dims(item, inside, results)
    elif inside and isinstance(d, str) and d.isdigit():
        # Numeric string inside "dimensions": store it as an int
        results.append(int(d))

    return results

Test

OP's dictionary (changed true to True)

d = {
    "reports": [
        {
            "columnHeader": {
                "dimensions": [
                    "ga:clientId"
                ],
                "metricHeader": {
                    "metricHeaderEntries": [
                        {
                            "name": "blah",
                            "type": "INTEGER"
                        }
                    ]
                }
            },
            "data": {
                "rows": [
                    {
                        "dimensions": [
                            "false_value"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "100"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "2"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "121"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    },
                    {
                        "dimensions": [
                            "1212"
                        ],
                        "metrics": [
                            {
                                "values": [
                                    "1"
                                ]
                            }
                        ]
                    }, ],
                "totals": [
                    {
                        "values": [
                            "10497"
                        ]
                    }
                ],
                "rowCount": 9028,
                "minimums": [
                    {
                        "values": [
                            "0"
                        ]
                    }
                ],
                "maximums": [
                    {
                        "values": [
                            "9"
                        ]
                    }
                ],
                "isDataGolden": True
            },
            "nextPageToken": "1000"
        }
    ]
}

print(find_dims(d)) # Output: [100, 121, 1212]
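
If the data is still JSON text rather than a pasted Python literal, there should be no need to edit true by hand, since json.loads maps JSON true to Python True. A rough sketch, using a shortened, hypothetical payload and a condensed copy of the function above (renamed `find_dims_sketch` to mark it as illustrative):

```python
import json

def find_dims_sketch(d, inside=False, results=None):
    """Collect numeric strings found anywhere under a "dimensions" key."""
    if results is None:
        results = []
    if isinstance(d, dict):
        for k, v in d.items():
            find_dims_sketch(v, inside or k == "dimensions", results)
    elif isinstance(d, list):
        for item in d:
            find_dims_sketch(item, inside, results)
    elif inside and isinstance(d, str) and d.isdigit():
        results.append(int(d))
    return results

# Shortened stand-in for the question's JSON (assumption, not the full payload)
text = '''
{"reports": [{"columnHeader": {"dimensions": ["ga:clientId"]},
              "data": {"rows": [{"dimensions": ["false_value"]},
                                {"dimensions": ["100"]},
                                {"dimensions": ["121"]}],
                       "isDataGolden": true}}]}
'''

parsed = json.loads(text)  # JSON true becomes Python True
print(parsed["reports"][0]["data"]["isDataGolden"])  # True
print(find_dims_sketch(parsed))  # [100, 121]
```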
DarrylG
0

As stated in the comments, you can just use a simple recursive function, for example:

search_key = 'dimensions'

def searchDimensions(data):
    # Recursively collect every value stored under search_key
    found = []
    if isinstance(data, dict):
        for key, sub_data in data.items():
            if key == search_key:
                found.extend(sub_data)
            else:
                found.extend(searchDimensions(sub_data))
    elif isinstance(data, list):
        for sub_data in data:
            found.extend(searchDimensions(sub_data))
    return found

# example is the dict loaded from the JSON in the question
all_dimensions = searchDimensions(example)
# Keep only the values that come after 'false_value'
false_value_index = all_dimensions.index('false_value') + 1
output = all_dimensions[false_value_index:]
print(output)
# Output: ['100', '121', '1212']

Then filter out the values that you don't want (e.g. everything up to and including false_value).
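
One way to do that filtering by value instead of by position, as a sketch: keep only the entries that look numeric. The list below is a hypothetical copy of what the search would collect from the question's data.

```python
# Hypothetical result of collecting every "dimensions" entry (assumption)
all_dimensions = ['ga:clientId', 'false_value', '100', '121', '1212']

# Keep only the entries made up entirely of digits
numeric_only = [value for value in all_dimensions if value.isdigit()]
print(numeric_only)  # ['100', '121', '1212']
```

This does not depend on where false_value appears, but it would also drop any legitimate non-numeric dimension.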

marcos